Fraud Analysis Techniques Using ACL
Fraud Analysis Techniques Using ACL David Coderre
John Wiley & Sons, Inc.
This book is printed on acid-free paper.
Copyright © 2009 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-646-8600, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, 201-748-6011, fax 201-748-6008. Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services, or technical support, please contact our Customer Care Department within the United States at 800-762-2974, outside the United States at 317-572-3993 or fax 317-572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books. For more information about Wiley products, visit our Web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Coderre, David G.
Fraud analysis techniques using ACL / David G. Coderre.
p. cm.
Includes index.
ISBN 978-0-470-39244-7 (paper/cd-rom)
1. Fraud. 2. Fraud investigation. 3. Fraud–Prevention. 4. Auditing, Internal–Data processing. I. Title.
HV8079.F7C627 2009
657.45028553–dc22 2009010846
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
Contents

Preface
  About This Toolkit
  Who Should Use This Toolkit?
  What Is Included in This Toolkit?
  System Requirements
  How to Use This Toolkit
  Install the Fraud Toolkit Application
  How This Book Is Structured
  Script Code Conventions
  Flowcharts
  Modifications/Updates
  Contacting the Author
  Acknowledgments

Introduction
  Using Data Analysis to Detect Fraud
  Fraud: Risks and Costs
  Why Do People Commit Fraud?
  Why Use Data Analysis Software?
  Identifying Fraud
  Proactive Fraud Investigation
  Benefits of Data Analysis with CAATTs
  About Scripts
  What Is a Script?
  Benefits of Scripts
  Preparing Scripts for Use
  Copying Scripts
  Copying Table Layouts
  Working with Scripts
  Launching the Fraud Toolkit Application
  Running the Scripts
  Filtering
  Customizing Scripts
  Customizing: An Example
  Creating Your Own Fraud Application
  Further Reading

Chapter 1: Start and Menu
  Launching the Fraud Toolkit Tests
  Starting the Fraud Toolkit Application
  Placement of Start and Fraud Menu Scripts
  How the Scripts Work
  Start
  Fraud Menu
  Log Files
  Exiting a Script
  Working without the Fraud Menu

Chapter 2: Completeness and Integrity
  Checking for Blanks and Data Type Mismatches
  Running Completeness and Integrity
  How the Scripts Work
  Carriage Returns
  Understanding the Verify Command
  Understanding the Group Command
  Deleting Temporary Variables, Fields, and Files
  Variables for Complete
  Review and Analysis
  Case Study: General Ledger Accounts Unaccounted For

Chapter 3: Cross-Tabulation
  Organizing Your Data to Find Trends
  Running Cross-Tabulation
  Benefits of Cross-Tabulation
  How the Scripts Work
  Challenges of Cross-Tabulation
  X-Axis Labels
  Workspaces
  Deleting Temporary Variables, Fields, and Files
  Variables for Cross Tabs
  Review and Analysis
  Case Study: Not Enough Clients
  Case Study: Calling Cards

Chapter 4: Duplicates
  Finding Higher-Risk Items
  Payroll
  Accounts Payable
  Running Duplicates
  How the Scripts Work
  The Role of Subscripts
  Case Study: Duplicate Payments
  Dup Dialog Script
  If Statements
  Dup Multiple Keys1
  Macro Substitution
  KeyChange
  Define Field
  LENGTH() and HEX()
  Deleting Temporary Variables, Fields, and Files
  Review and Analysis
  Checking for Duplicates
  Payroll Example
  Accounts Payable Example

Chapter 5: Gaps
  Identifying Transactions Missing from a Sequence
  Running Gaps
  How the Scripts Work
  Deleting Temporary Variables, Fields, and Files
  Review and Analysis
  Case Study: Free Calls

Chapter 6: Data Profile
  Establishing Normal Values and Investigating Exceptions
  Running Data Profile
  How the Scripts Work
  Flow of Data in Data Profile
  Data Profile Test Parameters
  Deleting Temporary Variables, Fields, and Files
  Review and Analysis
  Statistics
  Stratify
  Round Amounts, Exact Multiples, and Frequent Values
  Round Amounts: Multiples of 5, 10, 25, or 100
  Exact Multiples of
  Frequently Used Values
  Profiling with Character Fields
  Items with the Most Exact Multiples
  Least/Most Used Items
  Case Study: Receipt of Inventory
  Case Study: Exact Multiples
  Filtering and Drilling Down
  Filtering before Profiling
  Filtering after Profiling

Chapter 7: Ratio Analysis
  Pinpointing Suspect Transactions and Trends
  Running Ratio Analysis
  How the Scripts Work
  Max/Max2 and Max/Min Ratios
  Two Fields: Num field1 / Num field2 Ratio
  Deleting Temporary Variables, Fields, and Files
  Review and Analysis
  Case Study: Dormant but Not Forgotten
  Case Study: Doctored Bills

Chapter 8: Benford’s Law
  Identifying Anomalous Data
  Understanding Benford’s Law
  Identifying Irregularities
  Running Benford Analysis
  Running Benford Custom Analysis
  Creating the Custom Distribution
  Division by Zero
  Testing against the Custom Distribution
  How the Benford Scripts Work
  Standard Benford Analysis
  Benford Custom Analysis
  Deleting Temporary Variables, Fields, and Files
  Review and Analysis
  Case Study: Signing Authority
  Further Reading

Chapter 9: Developing ACL Scripts
  Introduction
  Data Analysis: Generic Approach
  ACL Commands
  DISPLAY Command
  SET Command
  DELETE Command
  OPEN Command
  Commands for Defining Relationships
  Basic ACL Commands
  IF Command
  Variables
  User-Defined Variables
  Using System and User-Defined Variables
  DEFINE Field/Expression
  Workspaces
  Sharing Workspaces
  Scripts
  What Is a Script?
  Creating Scripts
  Commenting Scripts
  COMMENT
  END
  Editing Scripts
  Running Scripts
  RUN ACL Script from DOS
  Saving a Script to a .BAT file
  Interactive Scripts
  ACCEPT Command
  Dialog Boxes
  Adding Selections to Drop-Down and Project Item Lists
  Macro Substitution
  Editing Dialog Boxes
  Subscripts
  Special Uses for Subscripts
  Repeating a Script
  Error Trapping
  Consolidation Exercise
  Advanced ACL Scripting Techniques
  GROUP Command
  Simple GROUP
  Conditional GROUP
  Nested GROUP
  LOOP and OFFSET()
  Applications Menu
  Building an Application Menu
  Creating Submenus

Chapter 10: Utility Scripts
  Auto Execute
  Extract Values
  Ending Balance Verification
  Running Total
  Maximum and Minimum Values

Appendix: ACL Installation Process
Glossary
Index
Preface

About This Toolkit
Fraud Analysis Techniques Using ACL is designed to help auditors and fraud investigators find the truth hidden in an incredible volume of transactions. Usually, the search ends with the auditor or fraud investigator gaining a better understanding of how an organization conducts its business. However, it can also end with the discovery of fraud.

Who Should Use This Toolkit?
This Toolkit is designed for ACL users from novice to expert. To use Fraud Analysis Techniques Using ACL, you need some basic knowledge of ACL, including how to access data files by building table layouts and how to launch a script. Fraud Analysis Techniques Using ACL tells you how to get valuable results using ACL scripts, and demystifies the code behind these scripts. Some readers may want to design their own applications using this Toolkit as an inspiration or guide. In this case, Chapter 9, on Developing ACL Scripts, will be useful.

What Is Included in This Toolkit?
The Toolkit includes two sets of ACL scripts. The first set of 36 scripts is integrated into a menu-driven application. These simple but robust scripts use dialog boxes to prompt you for input as they perform a variety of fraud tests. The scripts are stored in an ACL Project called Fraud Toolkit.ACL, which is included on the CD-ROM at the back of this book. The CD-ROM also contains several sample data files (already defined in the Fraud Toolkit Project) that you can use to try out the tests.
The scripts in the second set are “utility” scripts designed to answer common problems. These scripts are stored in an ACL project called Utility Scripts.ACL.

System Requirements
The scripts included with this book require:
ACL for Windows Version 8.0 or later. While a slightly modified version of the scripts will work for ACL for Windows Version 6.57 or later, the scripts as supplied require Version 8.0, and specific instructions are only included for that version.
Microsoft Windows® 95/98/ME/NT/2000/XP, Vista or Windows NT® 4.0/2000
Note: While many of the scripts work with the Unicode version of ACL, they were not designed to be used with this version of ACL.

How to Use This Toolkit
Using the provided ACL scripts involves five essential steps:
1. Install the application (from the CD-ROM to your system).
2. Define your data files to ACL.
3. Read about the tests you want to use.
4. Run the tests (launch the Start script and follow the menus).
5. Investigate the transactions that are highlighted by the tests that you ran.
Note: If you do not have ACL installed on your computer, see the Appendix for instructions on how to install the educational version of ACL.

Install the Fraud Toolkit Application:
1. Insert the Fraud Toolkit CD into the CD reader. This will automatically start the install program.
2. Click NEXT to accept the default directory (C:\ACL Data), or click Browse and select another directory where the Fraud Toolkit will be installed; then click NEXT.
3. Click OK.

How This Book Is Structured
The introductory sections of this book explain how to install, run, and customize the Fraud Toolkit scripts. These sections also provide information on using scripts and other computer-aided techniques for fraud detection and deterrence. The chapters that follow deal with the scripts included with the Toolkit. Chapter 1 explains the Start and Fraud Menu scripts, while Chapters 2 through 8 explain the scripts that “power” the various fraud tests. The beginning of each of the first eight chapters includes step-by-step instructions on how to run the chapter-specific tests. These instructions are followed by an in-depth discussion of how the scripts work. The code for every script is displayed in its entirety and is accompanied by explanations of what key lines of code are accomplishing. Each of the first eight chapters includes a Review and Analysis section that suggests ways to interpret your results. Also included are case studies that offer real-world examples of how the scripts can be used to search for fraud.
Novice users may want to read only the instructions on how to run the tests and the section in each chapter about interpreting the results. Expert users may also want to examine the script code and read about how the scripts work. The categories of the fraud detection tests are:
Completeness and integrity
Cross-tabulation
Duplicates
Gaps
Data profile
Ratio analysis
Benford analysis
Chapter 9 is a primer on designing and developing your own ACL scripts. It includes instruction, guidance, and exercises. Chapter 10 describes the utility scripts that are also included on the CD-ROM.
Script Code Conventions
The script code presented in this book is formatted in a fixed-width font in a way that allows the reader to easily recognize key elements:
UPPERCASE    Denotes commands, functions, and reserved keywords such as command parameters and field modifiers.
Title Case   Denotes file names and field names, including computed fields.
lowercase    Denotes variables.
bold         Denotes items of special interest.
Note: Script code is not actually case-sensitive. For example, the results of a command are the same regardless of whether it is entered as “GAPS” or “gaps.” However, consistent formatting conventions make code easier to write and interpret.

Flowcharts
There are several types of diagrams in this book, the most common being flowcharts. These charts are specifically designed to show operations on data, as opposed to interactions with the user, or program flow in general. The key below shows a generic example of an operation that starts with one file, operates on it using an ACL command, and produces another file.

Typical Data Flow Diagram: Input File -> Command -> Output File

Input File
CARD NO           REFNUM       AMOUNT
63211011101111    700757556    3644.20
63211011101121    700757556    2703.93
63211011101131    700757556    4191.31
63211011101141    700757556    4635.92
63211011101151    700757556    2540.42
63211011101161    700757556    4703.93

Output File
CARD NO           REFNUM       AMOUNT
63211011101129    677540       35645.44
63211014927153    270090507    25744.51
63211014927154    270702270    46160.00
63211014927155    270777767    36455.00
63211264927156    270722577    45113.00
63211014927157    270726074    80000.00
63211014927157    270728200    25645.33
Modifications/Updates
Any modifications or updates to the scripts provided with the Toolkit will be posted at www.caats.ca. You are encouraged to check this site from time to time.

Contacting the Author
I can be reached at this e-mail address: [email protected]. Comments, questions, and suggestions about this book or my other books are most welcome.

Acknowledgments
Many people contributed to this project. I want to offer special thanks to Jean-Guy Pitre for proposing major improvements to several of my scripts and Porter Broyles for his ongoing support.
Thanks to my family, Anne, Jennifer, and Lindsay, for not resenting the time I spent working on this project.
Introduction

Using Data Analysis to Detect Fraud
To cope with thousands, even millions, of transactions, and pick out the few that may be fraudulent, auditors and fraud investigators need powerful data analysis tools. However, the data analysis techniques that comprise state-of-the-art fraud detection tools can sometimes be perplexing. I hope that by shedding light on these techniques and providing easy-to-use fraud tests, Fraud Analysis Techniques Using ACL will help auditors and fraud investigators discover fraud and take measures to prevent it.
Fraud: Risks and Costs
What is the risk or likelihood of a fraud occurring in the organizations you audit? Though it is not the easiest question to answer, many studies have tried. A 1997 survey by the audit firm of KPMG states that 63 percent of organizations had at least one fraud in the past two years. In a 1999 KPMG survey, 57 percent of respondents reported fraud in their company. It seems clear that the risk of fraud is high.
What about the cost of fraud? A 2008 study by the Association of Certified Fraud Examiners (ACFE) states that, in the United States, financial losses due to fraud are a staggering $994 billion a year, and some studies quote even higher figures. The KPMG Forensic Integrity Survey 2005–2006 found that 74 percent of employees reported that they had observed misconduct in the prior 12 months; and 50 percent reported that what they had observed would result in “a significant loss of public trust if discovered.” When you add to this the intangible costs, such as negative publicity, reduced morale and shareholder confidence, and loss of goodwill, it is easy to see why organizations are concerned about fraud.
In his 1997 book, Occupational Fraud and Abuse, Joseph Wells offers statistics on the median loss due to fraud. As you can see, the median losses are significant, and, if you include fraudulent financial statements, the figures are much higher. The median loss for fraudulent financial statements is $4 million. Although this represents the size of the misstatement rather than an actual loss, it is still considered fraud.

Fraud Type                      Median Loss
False voids (cash registers)    $50,000
Ghost employees                 $275,000
Commissions                     $200,000
Skimming receivables            $52,000
Kickbacks                       $250,000
Why Do People Commit Fraud?
Interviews with people who have committed fraud show that most did not set out with the intent to do so. Often, they simply took advantage of an opportunity. Many times the first fraudulent act is an accident. For example, they may process the same invoice twice. But when they realize that nobody has noticed, the fraudulent acts become deliberate and more frequent.
The “10-80-10” law, popular among fraud investigators, states that 10 percent of people will never commit fraud, 80 percent of people will commit fraud under the right circumstances, and 10 percent actively seek out opportunities for fraud. So we need to be vigilant about the 10 percent who are out to get us, and we should try to stop the 80 percent from making a mistake that could ruin their lives.
Usually, fraud occurs through a combination of opportunity, rationalization, and pressure: An opportunity arises, the person feels that the act is not entirely wrong, and there is some sort of pressure (financial or otherwise) to commit the fraud.
Why Use Data Analysis Software?
Most frauds are still discovered by outside sources such as police, anonymous letters, and customers. Others are discovered only by accident. This raises questions about the methods auditors are applying to seek out and investigate fraud. What’s more, the amount of undetected fraud invites another question: Are auditors making effective use of data analysis software to detect fraud?
As more businesses store information on computers, more fraud is committed with the help of computers. Fortunately, the same technology gives auditors and forensic investigators a new set of weapons to use in their fight against fraud.
In fact, fraud detection is a task ideally suited for computer-assisted audit tools and techniques (CAATTs). A 1982 study of hundreds of audit working papers found that such techniques identified almost 30 percent of all reported financial errors, more than any other audit method. So, for more than two decades, the use of CAATTs has been the most powerful audit tool for detecting financial errors.
In recent years, CAATTs have become more powerful and more widely used by auditors and fraud investigators. During the last decade, the use of computer-assisted tools and techniques has become standard practice. The degree of automation in business has led to an increase in the complexity of internal controls. Distributed processing, worldwide networking, and remote access to corporate systems have made auditing a more dynamic, difficult, and challenging profession. The tried-and-true techniques and practices of auditing do not always work. Often a paper trail does not exist and the risks are completely different from those in manual or paper-based systems.
For many years auditors and fraud investigators relied upon a manual review of the transactions. Auditors were often depicted with their shirtsleeves rolled up, poring over mountains of paper. For example, it was once standard practice to open the file cabinet and select every fifth file for examination, a technique known as interval sampling.
Today, however, more and more fraud auditors are making automated tools and techniques a part of their investigations. Data analysis software allows investigators to gain a quick overview of the business operations and easily drill down into the details of specific areas. As a result, examinations are much more detailed and comprehensive than a manual review of a sample of files.
Computer-assisted audit techniques can be used to develop an understanding of the relationships between various data elements, both financial and nonfinancial, and to examine the data for trends. Analytical procedures can be used to perform substantive tests and conduct investigations. Computers not only provide analytical opportunities, but also aid in the understanding of the business area and its associated risks.
Identifying risks and measuring losses electronically can improve the overall quality of a fraud investigation. Results can help fraud investigators focus their efforts and address areas of abuse and fraud, rather than waste time reviewing valid transactions.
Identifying Fraud
To discover fraud, you must know what fraud looks like. You must be able to identify symptoms in the data that point to fraud. The saying “it takes one to know one” does not mean that auditors and investigators need to commit fraud. However, if they wish to prevent fraud from happening, they must know who could be involved, what is possible, and how fraud could occur.
Often the people who commit fraud are simply opportunists, taking advantage of a weakness or absence of control. Auditors must identify the opportunities before fraud takes place, and address any weaknesses in the controls if they hope to prevent fraud. But they also must be able to think like a perpetrator of fraud in order to detect the fraud.
Who could benefit from the identified weaknesses? Identified control weaknesses must be examined from the point of view of who can benefit. Without a clear understanding of the control weakness, and an assessment of who could take advantage of the weakness, auditors are still somewhat in the dark. Assessing the degree to which people could benefit from the weakness gives you a measure of “opportunity.” The fraud triangle (opportunity, rationalization, and pressure) is what drives people to commit fraud. The understanding of who could exploit the identified control weakness can focus the search for fraud on the persons with the greatest opportunity to commit the fraud.
Since fraud is often largely a crime of opportunity, control gaps and weaknesses must be found and, if possible, eliminated or reduced. Widely distributed audit guides and standards address such exposure concerns directly. For example, the IIA standard on proficiency (1210.A2) requires auditors to have sufficient knowledge of possible frauds to be able to identify their symptoms.
Auditors must be aware of what can go wrong, how it can go wrong, and who could be involved. Statement on Auditing Standards (SAS) #99 from the American Institute of Certified Public Accountants (AICPA), “Consideration of Fraud in a Financial Statement Audit,” was also developed to assist auditors in the detection of fraud. It goes further than its predecessor, SAS #82, to incorporate new provisions that include:
Brainstorming for the risks of fraud
Emphasizing increased professional skepticism
Ensuring managers are aware of fraud
Using a variety of analytic tests
Detecting cases where management overrides controls
It also defines risk factors for fraudulent financial reporting and theft, and can be used as a basic model for assessing the risk of fraudulent financial reporting. The risks outlined in SAS #99 include factors such as management conditions, the competitive and business environment, and operational and financial stability.
Given that many businesses amass vast numbers of transactions, manually reviewing all documents is both costly and time consuming. Audit software can assist by highlighting transactions that contain characteristics often associated with fraudulent activity. Millions of transactions, including data from previous years or other locations, can be reviewed. Modern audit software continues to evolve, providing increasingly powerful tools and techniques. Although the software is designed primarily for auditors, others can also use it effectively to combat fraud. Best of all, the range of data interrogation these tools and techniques allow is almost unlimited.
Examples of these techniques include:
Comparing employee addresses with vendor addresses to identify employees posing as vendors.
Searching for duplicate check numbers to find photocopies of company checks.
Searching the list of vendors to identify those with post office boxes for addresses. These can be easily extracted from the vendor file for further follow-up.
Analyzing the sequence of all transactions to identify missing checks or invoices.
Identifying all vendors with more than one vendor code or more than one mailing address.
Finding several vendors with the same mailing address.
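Several of the tests listed above can be sketched in a few lines of general-purpose code. The following Python illustration is not part of the Toolkit and is not ACL script; the record layout (dictionaries with "name" and "address" keys) and all function names are assumptions made for the example, and real address matching usually needs more careful normalization than shown here.

```python
import re
from collections import Counter, defaultdict


def normalize_address(address):
    """Lowercase and drop punctuation so near-identical addresses compare equal."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", address.lower()).split())


def employees_posing_as_vendors(employees, vendors):
    """Return (employee name, vendor name) pairs that share a normalized address."""
    by_address = defaultdict(list)
    for vendor in vendors:
        by_address[normalize_address(vendor["address"])].append(vendor["name"])
    return [
        (emp["name"], vendor_name)
        for emp in employees
        for vendor_name in by_address.get(normalize_address(emp["address"]), [])
    ]


def duplicate_values(values):
    """Return values that occur more than once, e.g. repeated check numbers."""
    return sorted(v for v, n in Counter(values).items() if n > 1)


def sequence_gaps(numbers):
    """Return the numbers missing between the lowest and highest value seen."""
    present = set(numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def po_box_vendors(vendors):
    """Return names of vendors whose address looks like a post office box."""
    pattern = re.compile(r"\bp\.?\s*o\.?\s*box\b", re.IGNORECASE)
    return [v["name"] for v in vendors if pattern.search(v["address"])]
```

Each function returns candidates for follow-up, not proof of fraud; an employee and a vendor may share an address for legitimate reasons.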
Proactive Fraud Investigation
The computer, while faster at many tasks, such as indexing, totaling, matching records, and recalculating values, is not the answer to all of the fraud investigator’s problems. Data analysis software can use relationships between data items and comparisons across years or locations to identify unusual transactions. However, verifying and interpreting the results will always demand judgment. Therefore, auditors and fraud investigators should not worry about being replaced by computers. Computers are not a substitute for human judgment and know-how; they merely help auditors apply their knowledge. The main role of computers is to give investigators an enhanced ability to carry out queries and analyses based upon the results of their initial concerns, speculations, or identified control weaknesses.
Fraud investigators and auditors can interrogate a company’s data and develop a detailed understanding of it. This understanding can help them recognize and identify data patterns that may be associated with fraud. Patterns, such as negative entries in an inventory-received field, voided transactions followed by a “no sale,” or a high percentage of returned items, may indicate fraudulent activity. Patterns can also serve as auditor-specified criteria. Transactions meeting the criteria can trigger automatic investigations.
Ideally, the data patterns are used to develop a “fraud profile” that gives fraud investigators a way to detect fraud early in its life cycle. An understanding of how the fraud occurs, and what it looks like in the data, allows investigators to search the data for the symptoms of fraud. Depending on the degree of risk, analyzing data for fraud symptoms can be a monthly, weekly, or even daily event. Systems can also be built for the continuous auditing of transactions to compare them with the fraud profile.
One example is the use of cardholder profiles by major credit card companies. Each purchase by every cardholder is compared to the cardholder’s normal pattern of purchases. The analysis uses known symptoms of possible frauds, such as two purchases on the same card within hours of each other from stores that are thousands of miles apart. Continuous monitoring also tracks and compares patterns in the data, that is, typical dollar value of purchases, types of purchases, and even the timing of the purchases. Thus, cardholders can be called to verify purchases on their cards minutes after the purchases are made. The aim? To prevent the use of stolen cards, even when the cardholders have not yet reported their loss.
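The credit card symptom described above (two purchases on one card, close together in time but far apart in distance) can be expressed as a simple rule. The Python sketch below is only an illustration of that idea, not how any card issuer actually does it; the field names ("card", "time", "lat", "lon") and the thresholds of 3 hours and 1,000 miles are assumptions chosen for the example.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(a))  # 3958.8 = mean Earth radius in miles


def impossible_travel(purchases, max_hours=3, min_miles=1000):
    """Flag consecutive purchases on one card that are close in time but far apart."""
    flagged = []
    last_seen = {}  # card -> most recent purchase processed so far
    for p in sorted(purchases, key=lambda p: (p["card"], p["time"])):
        prev = last_seen.get(p["card"])
        if prev is not None:
            hours = (p["time"] - prev["time"]).total_seconds() / 3600
            miles = distance_miles(prev["lat"], prev["lon"], p["lat"], p["lon"])
            if hours <= max_hours and miles >= min_miles:
                flagged.append((prev, p))
        last_seen[p["card"]] = p
    return flagged


if __name__ == "__main__":
    sample = [
        {"card": "A", "time": datetime(2009, 1, 5, 10, 0), "lat": 40.71, "lon": -74.01},
        {"card": "A", "time": datetime(2009, 1, 5, 11, 0), "lat": 34.05, "lon": -118.24},
    ]
    for first, second in impossible_travel(sample):
        print("Review card", first["card"], first["time"], "->", second["time"])
```

A rule like this is one element of a fraud profile; in practice it would sit alongside checks on typical purchase amounts, types, and timing.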
Benefits of Data Analysis with CAATTs
The main benefit of audit software is that it increases the ability of auditors and fraud investigators to probe the data, turning facts and figures into vital information.
Audit software also makes extracting information from several files with different database management systems quick and efficient. Combining information from different sources can highlight relationships in the data. For example, reviewing data from an accounts payable file may identify a trend in the expenditures to a particular vendor. But combining the accounts payable data with information from the contracting database may reveal that all contracts with the vendor in question were raised by one contracting officer. This may prompt concerns about possible kickbacks.
The interactive nature of audit software allows auditors to conduct a “what if” analysis. They can form an initial hypothesis, test it, and then revise it based on the results. Automated routines that monitor key symptoms and track trends can be a major deterrent for fraud, identifying it almost as soon as it occurs. This kind of fraud auditing can be instrumental in reducing the number of, and losses attributed to, fraudulent acts. Finally, the sharing of automated techniques and jobs to continuously monitor transactions is easy and effective.
Auditors and forensic investigators face increasing challenges brought on by the changing business environment and increased use of technology. But these same factors supply new auditing tools and techniques. To combat fraud, all investigators should consider adding audit software to their arsenal of fraud detection tools.
About Scripts
The included scripts give you the flexibility to apply them in a variety of ways. The easiest way is to run them as they are, without any changes to the existing code. If you opt to use the scripts “as is,” you’ll find them efficient and effective. If you choose to modify the scripts for a specific purpose, you’ll see that they’re also highly adaptable. The section “Customizing Scripts” explains how to modify an ACL script.
What Is a Script?
In computing and accounting, the term script has multiple meanings. In ACL, the term script means something very specific: a set of commands stored under an assigned name within a particular ACL project. The Toolkit is built around an application, which is a collection of scripts that work together. In this Toolkit there are 36 distinct scripts forming one fraud-detection application, and 10 Utility scripts that were written to perform tasks often required but not available through the normal ACL command set.
One script can call another script, or many other scripts. The commands recorded in each script are performed in the order the scripts are called. In the Fraud Toolkit, the Start script calls the Fraud Menu script, and the Fraud Menu script then calls some or all of the other scripts, depending on the choices you make.
Benefits of Scripts
Scripts allow you to save analyses that have proven their value, so that you can rerun them whenever you like. This can save you time and ensure that periodic analysis is rigorous and consistent. Furthermore, scripts allow auditors or fraud investigators to share techniques, increasing efficiency and effectiveness and ensuring consistency. As this Toolkit demonstrates, scripts can also be interactive, meaning that you are prompted for the input required to complete the processing of the script. Even if you have limited knowledge of analysis software, and no knowledge of how to write or modify scripts, you can still learn how to perform useful tests.
Scripts, through the use of a group, also allow the software to process many commands in one read of a data file, improving processing efficiency. The Group command, used within a script, allows you to create programs that:
Perform complex operations on a file.
Locate and use information from previous records.
Perform calculations involving a number of records.
Process complex file structures.
With all that they have to offer, it’s easy to see why scripts are a boon to auditors and fraud investigators alike.
Preparing Scripts for Use
Before you can use the scripts included in Fraud Analysis Techniques Using ACL, you must install the ACL Project that contains the scripts on your hard disk. You cannot open an ACL Project directly from a read-only CD-ROM.
To install the scripts on your hard disk:
1. Close all Windows programs.
2. Insert the CD-ROM into the computer’s CD-ROM drive.
3. Click Setup.exe and follow the prompts.
Once the scripts have been transferred to your hard disk, you can use the scripts either by copying them into an existing ACL project or by copying the table layouts, for the data files you want to analyze, into the project that comes with this Toolkit.
Note: The scripts will work only if they are in the same ACL project as the table layouts that are linked to the data to be tested. Also, the Fraud Toolkit project’s properties must not be “Read Only.”
Note: The standard version of the scripts (Fraud Toolkit.ACL) is intended for ACL version 7.2 or higher. There is also a version for international use (Fraud Toolkit f.ACL).
Copying Scripts
To copy scripts into an existing ACL Project:
1. Open your ACL Project.
2. In the Project Navigator window, right-click on the Scripts subfolder or project folder.
3. Select Copy from another Project.
4. Select Script.
5. Double-click on either the Fraud Toolkit or Utility script ACL project, selecting it from the proper directory.
6. Click [Add All] or select scripts individually. ACL copies the selected scripts into your project.
Copying Table Layouts
To copy table layouts from your project into the Fraud Toolkit or Utility scripts project:
1. Open the Fraud Toolkit or Utility script project.
2. In the Project Navigator window, right-click on the Tables subfolder.
3. Select Copy from another Project.
4. Select Table.
5. Double-click on your ACL project, selecting it from the appropriate directory.
6. Click [Add All] or select the tables individually. ACL copies the selected tables into the ACL project.
Working with Scripts
Once the Fraud scripts are installed, you double-click the Start script to activate the Fraud Menu and proceed through a series of dialog boxes that prompt you for required information. The scripts work on any file that has been defined to ACL. See Chapter 10 for instructions on using the scripts in the Utility.ACL project.
Periodically, the scripts prompt the user for basic information, such as the names of the input files (tables) or numeric fields and required parameters. Once they have run, many of the scripts write the results to both the log file and to the script dialog boxes. Where applicable, scripts may also send the output to a file.
Only one script can be run at a time. The script that is running can include instructions to call one or more additional scripts, but at any given time, only the most recently called script will actually be running. For example, the Start script contains the instruction DO Fraud Menu, which causes the Fraud Menu script to run. While the Fraud Menu script is running, the Start script is waiting to resume. When the Fraud Menu script has completed its execution, the next command in the Start script that follows DO Fraud Menu executes.
The user cannot continue working in the current ACL session while a script is running. A script can be interrupted by clicking the Cancel button in any of the dialog boxes, or by pressing the Esc key on your keyboard. Keep in mind that interrupting a script may prevent proper deletion of some temporary variables, fields, or files at the end of the run. For example, the variable error_test is used by several scripts in the Toolkit to flag whether the user has provided correct information. This variable is created when the application starts and is deleted just before the application ends. If a variable or file is not deleted, the user may encounter problems later in the session if another script uses the same variable name or file name. All of the active variables can be viewed at any time by choosing Variables from the Edit menu. The user can delete any stray variables by executing the command DELETE ALL OK from the command line.
Launching the Fraud Toolkit Application
The Start script initiates a selection process for the scripts to be run, ensures that all parameters are defined, and creates and deletes temporary variables as required. Use the following method to run the Toolkit’s Start script:
1. In the Project Navigator window, right-click on the Start script. ACL displays a box with options to Edit, Run, Cut, Copy, Delete, Rename, Export to OS/390 file, and Properties.
2. Click [Run] to run the Start script. The Fraud Toolkit for ACL dialog box appears, displaying copyright information.
3. Click [OK] to continue. The Fraud Menu dialog box appears. The user can now select the tests to be run.
Note: The application can also be launched by using the Run Script command on the ACL Tools menu.
Running the Scripts
The following are the main steps involved in running the Fraud Toolkit scripts:
1. Test selection. From the Fraud Menu, select the check box for each group of tests to be run.
2. File selection (Input/Output). From the Select input file drop-down list, choose the file to be tested. Some of the scripts also allow the user to name the output file for the results the script will produce. Notice that when ACL produces a data file as output, it automatically creates a table layout for it. The user can view output files in the View window by selecting them from the Project Navigator window. The output files can also be used as input files for further processing and analysis in ACL.
3. Apply filter (optional). All scripts except the Complete script prompt the user for an optional filter. The filter can be used to eliminate specified records from the fraud test. To focus the test on one type of transaction, or one particular business unit, select the Apply filter (optional) check box. For more information, see “Filtering” below.
Note: If a table has a default filter identified for it, that filter is automatically applied whenever the file is opened. However, the optional filter will override the default filter. So, take care when working with files that have default filters.
4. Field/Criteria selection. Each script prompts the user to enter parameters in one or more drop-down lists.
[Figure: Field Selection dialog box for the Gaps test]
For comprehensive instructions about running a specific script, please refer to the appropriate chapter.
Filtering
Creative use of filters can speed up the audit and offer more precise results. The Filter Selection script can be applied prior to any test, by selecting the appropriate check box.
There are two ways to specify a filter:
1. Select a filter that has already been created and saved for the specified input file. A drop-down menu displays all named, saved filters for the selected table layout.
2. Enter the filter expression (e.g., Amount > 0). The Select Fields dialog box (displayed using the F2 key) can be used to select a field for the expression. The Date Selector (displayed using the F8 key) can be used to select a date for your expression.
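The choice between a saved filter and a typed expression, and the way an optional filter takes precedence over a table's default filter, can be pictured with a short sketch. This is plain Python, not ACL, and it models filters as functions rather than expression strings; the names select_filter and apply_filter and the filt_type codes 1 and 2 are assumptions that loosely mirror the two radio-button choices.

```python
def select_filter(filt_type, saved_filters, saved_name=None, typed=None, default=None):
    """Pick the filter to apply.

    filt_type 1 = use a named, saved filter; filt_type 2 = use a typed filter.
    If the user supplies nothing, the table's default filter (if any) remains.
    """
    if filt_type == 1:
        if saved_name is not None and saved_name not in saved_filters:
            raise KeyError(f"no saved filter named {saved_name!r}")
        chosen = saved_filters.get(saved_name)
    elif filt_type == 2:
        chosen = typed
    else:
        raise ValueError("filt_type must be 1 or 2")
    # An optional filter, when given, overrides the default filter.
    return chosen if chosen is not None else default


def apply_filter(records, predicate):
    """Return the records that pass the filter; with no filter, return them all."""
    if predicate is None:
        return list(records)
    return [r for r in records if predicate(r)]
```

For example, a saved filter equivalent to Amount > 0 could be registered as `{"Positive": lambda r: r["Amount"] > 0}` and selected with `select_filter(1, saved, saved_name="Positive")`.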
Note: Users can also specify a filter by typing the expression they want to use directly into the Enter filter expression text box. However, if the user makes an error and creates an invalid expression, ACL will terminate the script. Selecting an existing filter or using the Select Fields dialog box and the Date Selector to help create expressions makes it less likely that an invalid expression will be entered.
Filter_Selection
filt1=''
DIALOG (DIALOG TITLE "Specify Filter" WIDTH 540 HEIGHT 400) (BUTTONSET TITLE "&OK;&Cancel" AT 432 24 DEFAULT 1) (RADIOBUTTON TITLE "Select named filter (default);Enter filter expression" TO "filt_type" AT 60 108 WIDTH 201 HEIGHT 140 DEFAULT 1) (ITEM TITLE "L" TO "filt1a" AT 288 132 WIDTH 225) (EDIT TO "filt1b" AT 288 204 WIDTH 225) (TEXT TITLE "Click in box above and press F2 to select a field for the filter expression." AT 288 244 WIDTH 234 HEIGHT 47) (TEXT TITLE "Select one of the two choices below - then enter filter" AT 24 64)
IF filt_type=1 filt1=filt1a
IF filt_type=2 filt1=filt1b
SET FILTER %filt1%
DELETE filt1a OK
DELETE filt1b OK
Customizing Scripts
The scripts can be edited and fully customized to the user’s environment. The degree of customization depends on the requirements and the user’s level of expertise. Options include:
Using the Toolkit as is, without any customization.
Copying only the required scripts to your own ACL project.
Copying and modifying the scripts so that they run in your environment with limited or no prompting of the user.
Using the scripts to create your own scripts by editing the code to change the program logic.
The possibilities for customization are practically endless. If the Toolkit is used to work with the same files repeatedly, perhaps on a monthly or even weekly basis, the user will benefit from reducing the number of dialog boxes and interactive steps by setting specific file names or input values as defaults. Other examples of customization include adding code that automatically sets a particular filter or saves results to a particular directory. It is also possible to make use of the Notify command, which allows the results to be exported in the form of an e-mail message. This may be particularly helpful if there are plans to fully automate these tests. Users faced with new table layouts and business environments may want to maintain and even expand on the interactive aspects of the Toolkit to maximize their options. For example, it is possible to increase the number of
fields tested for blank values in the Completeness and Integrity test to accommodate unusually large or complex data sets. In many cases, customization may only require the removal of the script lines that create dialog boxes prompting users for input, and the adding of lines of code to create the necessary variables, such as input and output files, or specified field names. Although the Toolkit covers a wide range of fraud tests, it is far from exhaustive. More experienced users are encouraged to experiment with tests of their own. ACL is a powerful tool that allows users to perform many types of analyses. All scripts have been designed to work with any file defined in ACL. Note: Any time a script is going to be modified a backup copy of the original scripts should be created, and the modified script should be fully tested.
Customizing: An Example
Suppose that the ratio analysis performed on the accounts payable data has proven successful. Consequently, it is decided to customize the Ratio Analysis1 script so that it runs automatically against the accounts payable file; that is, without requiring the user to provide any input other than requesting ACL to run the script. As delivered, Ratio Analysis1 prompts the user for the input file, output file, and the optional filter selection. It also calls Ratio1 Dialog1, prompting for the character and numeric fields, and Ratio1 Dialog2, prompting for the max/max2 and max/min ratios and the minimum number of records. For the purpose of the following example, assume that:
The input file is Accounts Payable.
The results should be stored in AP Vendor Ratio.
No filter will be applied.
The character field is Vendor Name.
The numeric field is Amount.
The max/max2 ratio is 5.
The max/min ratio is 15.
The minimum count is 10.
The Ratio Analysis1 script would be customized as follows:
1. Copy the Ratio Analysis1 script and rename it AP Vendor Ratio.
2. Delete the line of code that creates a dialog box prompting the user for the input file, output file, and filter selection option.
3. Delete the DO Ratio1_Dialog1 and DO Ratio1_Dialog2 commands that prompt for the character and numeric fields, the max/max2 and max/min ratios, and the minimum count.
4. Create the appropriately named variables for the input file (Accounts Payable), output file (AP Vendor Ratio), character field (Vendor Name), numeric field (Amount), Maxrat (5), Minrat (15), and Count (10).
Once the user launches the script AP Vendor Ratio, no additional input will be required. The Fraud Menu script can even be modified to include the AP Vendor Ratio script as one of the options.
The original and customized code compare as follows.
Original Ratio_Analysis1 code
DIALOG (DIALOG TITLE "Ratio Analysis - Max/Min File Selection" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "OK;Cancel" AT 612 24 DEFAULT 1 ) (ITEM TITLE "f" TO "infile" AT 240 168 WIDTH 288 HEIGHT 104 ) (TEXT TITLE "Select input file" AT 72 172 ) (TEXT TITLE "Specify output file (no spaces)" AT 72 280 WIDTH 122 HEIGHT 43 ) (EDIT TO "useroutfile" AT 240 288 WIDTH 287 DEFAULT "Ratio1_Results") (TEXT TITLE "Test determines the highest (max), second highest (max2), and lowest (min) values for a selected numeric field." AT 48 64 WIDTH 509 HEIGHT 40 ) (TEXT TITLE "Ratios: max/max2 and max/min are calculated for each value of a selected character field." AT 48 112 WIDTH 483 HEIGHT 39 ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 240 348 )
New AP_Vendor_ratio code
The code for the dialog box is deleted.
Original Ratio_Analysis1 code
outfile=REPLACE('%useroutfile%',' ','_')
OPEN %infile%
IF filt_sel DO Filter_Selection
DO Ratio1_Dialog1 WHILE error_test
error_test=T
DO Ratio1_Dialog2 WHILE error_test
New AP_Vendor_ratio code
infile='Accounts_Payable'
outfile='AP_Vendor_Ratio'
sumfield='Vendor_Name'
numfield='Amount'
maxrat='5'
minrat='15'
cnt='10'
OPEN %infile%
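To make the parameters in this example concrete, the following Python sketch computes the max/max2 and max/min ratios described in the dialog text (highest, second highest, and lowest value of a numeric field for each value of a character field). It is an illustration outside ACL, not the Toolkit's code. In particular, the selection rule used here (report a group when it has at least cnt records and either ratio meets its threshold) is an assumption; the Toolkit's own rule may differ.

```python
from collections import defaultdict


def max_ratio_analysis(records, key_field, amount_field, maxrat=5, minrat=15, cnt=10):
    """Compute max/max2 and max/min per key and return the groups worth reviewing."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_field]].append(record[amount_field])

    results = []
    for key, amounts in groups.items():
        # Need at least two values to have a second-highest amount.
        if len(amounts) < max(cnt, 2):
            continue
        ordered = sorted(amounts, reverse=True)
        highest, second, lowest = ordered[0], ordered[1], ordered[-1]
        max_max2 = highest / second if second else None  # skip division by zero
        max_min = highest / lowest if lowest else None
        # Assumed rule: flag the group if either ratio reaches its threshold.
        if (max_max2 is not None and max_max2 >= maxrat) or (
            max_min is not None and max_min >= minrat
        ):
            results.append(
                {"key": key, "count": len(amounts), "max": highest, "max2": second,
                 "min": lowest, "max_max2": max_max2, "max_min": max_min}
            )
    return results
```

Called as `max_ratio_analysis(rows, "Vendor_Name", "Amount")`, it uses the same field names and thresholds (5, 15, and 10) as the customized script in the example.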
Creating Your Own Fraud Application
Experienced ACL users can use the scripts provided in the Toolkit to augment their existing scripts. In some cases, the user may want only to select certain scripts from the Toolkit and add them to their fraud application. Copy the desired scripts into your ACL Project and modify your menu selection script, or the Fraud Menu script, to include the copied scripts as selection options.
When copying specific scripts into a project, remember that all related scripts must also be copied in order for the scripts to run properly. Scripts such as the Data Profile script call several subscripts, and all of the scripts call at least one other script: a Dialog script that prompts for and tests user input. For example, the Gaps script calls the Gaps Dialog script. The following lists the scripts that are called by each of the main scripts and, therefore, should be copied when customizing one of these scripts:
Completeness and Integrity: script Complete calls only Complete Dialog.
Cross Tabs: script Cross Tabs calls only the Cross Tabs Dialog script.
Duplicates: script Dup Multiple Keys calls scripts Dup Dialog, Dup Multiple Keys1, and Keychange. The Duplicates test can be performed by copying all four scripts and running script Dup Multiple Keys, which calls the other scripts as required.
Data Profile: script Data Profile, depending on the tests selected by the user, calls scripts Data Profile Dialog1 and Data Profile Dialog2, Stat, Strat1, Strat2, Round, Exactmult, Exactsum, Freqnum, and Freqchar. Data profiling can be performed by copying all 11 scripts and running the Data Profile script, which calls the other scripts as required. None of the individual scripts called by Data Profile prompt for the file to be opened or the numeric field to be tested. Therefore, Data Profile must be run first, prompting for these parameters and calling the other scripts to be run.
Ratio Analysis: script Ratio Analysis, depending on the test selected by the user, calls script Ratio Analysis1 for max/max2 and max/min ratios or Ratio Analysis2 for two-field ratios. Each of these scripts calls its respective subscripts: Ratio1 Dialog1 and Ratio1 Dialog2, or Ratio2 Dialog1 and Ratio2 Dialog2. Ratio analysis can be performed by copying all seven scripts and running script Ratio Analysis, which calls the other scripts.
Benford: The Benford’s Law tests employ five scripts. The script entitled Benford prompts you to select either Benford Analysis or Benford Custom Analysis. Benford Analysis calls Benford Dialog, and the Benford Custom Analysis script calls Benford Custom Dialog. A standard Benford analysis can be performed by copying Benford Analysis and Benford Dialog, and then running Benford Analysis. The customized Benford analysis requires that you copy both the two standard Benford scripts and the two Benford Custom scripts.
Note: The Filter Selection script is called by all scripts, except the Complete script, so it should always be copied when customizing the fraud application.
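Working out which scripts must travel together is a small dependency problem. The Python sketch below is an illustration outside ACL: it encodes the call relationships listed above as a dictionary and walks them to collect every script a chosen main script needs. Script names are written as they appear in the text; attaching Filter Selection to particular scripts in each chain, and treating Benford as the caller of both Benford analysis scripts, are assumptions made for the example.

```python
# Which scripts each script calls, as described in the list above.
CALLS = {
    "Complete": ["Complete Dialog"],
    "Cross Tabs": ["Cross Tabs Dialog", "Filter Selection"],
    "Dup Multiple Keys": ["Dup Dialog", "Dup Multiple Keys1", "Keychange", "Filter Selection"],
    "Data Profile": [
        "Data Profile Dialog1", "Data Profile Dialog2", "Stat", "Strat1", "Strat2",
        "Round", "Exactmult", "Exactsum", "Freqnum", "Freqchar", "Filter Selection",
    ],
    "Ratio Analysis": ["Ratio Analysis1", "Ratio Analysis2"],
    "Ratio Analysis1": ["Ratio1 Dialog1", "Ratio1 Dialog2", "Filter Selection"],
    "Ratio Analysis2": ["Ratio2 Dialog1", "Ratio2 Dialog2", "Filter Selection"],
    "Benford": ["Benford Analysis", "Benford Custom Analysis"],
    "Benford Analysis": ["Benford Dialog", "Filter Selection"],
    "Benford Custom Analysis": ["Benford Custom Dialog", "Filter Selection"],
}


def scripts_to_copy(main_script, calls=CALLS):
    """Return the main script plus every script it calls, directly or indirectly."""
    needed, stack = set(), [main_script]
    while stack:
        script = stack.pop()
        if script not in needed:
            needed.add(script)
            stack.extend(calls.get(script, []))
    return sorted(needed)
```

For instance, `scripts_to_copy("Ratio Analysis")` returns the seven ratio scripts plus Filter Selection, which matches the note above that Filter Selection should always be copied.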
Further Reading
David Coderre. Internal Audit: Efficiency through Automation. Hoboken, NJ: John Wiley & Sons, 2009.
David Coderre. Computer-Aided Fraud Prevention and Detection. Hoboken, NJ: John Wiley & Sons, 2009.
Joseph Wells. Occupational Fraud and Abuse. Austin, TX: Obsidian, 1997.
1
Start and Menu
Launching the Fraud Toolkit Tests
Launching the __Start script leads to the Fraud Menu, which, in turn, is the gateway to the 36 ACL scripts that make up the Fraud Toolkit. The scripts are linked through a hierarchical menu structure. The __Start and Fraud Menu scripts allow the user to easily perform a variety of fraud tests. After the user selects tests from the Fraud Menu, the scripts that make up the tests run, prompting the user for the required input through a series of easy-to-use dialog boxes. The results of these tests either appear in a dialog box or are written to an output file. In addition, the results of all tests are written to the command log.
The scripts are part of an ACL project file named “Fraud Toolkit.” To begin using the scripts, the user must first open the Fraud Toolkit project, or the working project to which the scripts have been copied. See “Preparing Scripts for Use” on page 7 for detailed instructions on installing and using the Fraud Toolkit scripts. Next, run the __Start script to open the Fraud Menu and view the available tests.
Starting the Fraud Toolkit Application
To launch the __Start script and display the Fraud Menu:
1. In the Project Navigator window, click the plus sign (+) next to the Scripts subfolder, or click the Scripts icon, to expand the list of scripts.
2. Right-click __Start. ACL will ask if you want to run or edit the script. Click [Run]. The Welcome dialog box appears:
3. Click [OK] to display the Fraud Menu dialog box; then select the check box for each group of tests to be run. The tests will run in the order in which they appear on the menu. To return to the menu after the tests are complete, check the appropriate box:
__Start
SET ECHO NONE
ASSIGN fraud=T
ASSIGN exact_off=F
IF 'A'='ABC' exact_off=T
SET EXACT ON
DIALOG (DIALOG TITLE "Fraud Toolkit for ACL" WIDTH 444 HEIGHT 325 ) (BUTTONSET TITLE "OK" AT 168 228 DEFAULT 1 HORZ ) (TEXT TITLE "Press Enter or click OK to continue" AT 120 196 WIDTH 198 ) (TEXT TITLE "WELCOME" AT 180 40) (TEXT TITLE "Fraud Toolkit for ACL" AT 156 88 HEIGHT 18 ) (TEXT TITLE "Copyright 2009 David G. Coderre" AT 300 244 WIDTH 94 HEIGHT 33 ) (TEXT TITLE "Fraud Analysis Techniques Using ACL" AT 120 124 WIDTH 195 ) (TEXT TITLE "Ver 1.0" AT 348 280 HEIGHT 23 ) (TEXT TITLE "John Wiley and Sons" AT 156 160 WIDTH 110 )
DO _Fraud_Menu WHILE fraud
IF exact_off SET EXACT OFF
DELETE ALL OK
SET ECHO ON
Note: By default, a log file with the same name as the ACL Project file will be created when the Project is created. An alternate log file can be specified by typing the log file name, for example, Payroll Fraud Tests, in the Specify log file text box. This creates a log file called Payroll Fraud Tests.log, or, if that file already exists, appends the new results to the existing log file.
Placement of Start and Fraud Menu Scripts
The Start script is preceded by a double underscore (__Start), and the Fraud Menu script is preceded by a single underscore (_Fraud_Menu). This ensures that these two key scripts are easily found at the top of the Project Navigator window, and that __Start appears before _Fraud_Menu:
How the Scripts Work
A menu-based application such as the Fraud Toolkit has a number of advantages. It conveniently organizes all the available tests and allows the user to run several tests consecutively. If used correctly, the menu also ensures that, once all testing is complete, any temporary variables, files, or settings are properly deleted or restored.
Start
The two most important functions of the __Start script are activating the system condition Exact Character Comparisons (SET EXACT ON) and creating a variable, fraud, that tests whether the Fraud Menu dialog box is still needed. Before it turns the option on, the script checks whether it should reset the ACL system condition to SET EXACT OFF when done. If the user decides to exit the Fraud Menu, the fraud variable is set to False, which ends the Fraud Menu script and returns processing to the __Start script. As evident in the code, the __Start script then deletes the temporary variables that are no longer needed and, if necessary, resets system conditions. This efficient programming device is employed several times in the Toolkit scripts.
The __Start script also includes a dialog box that introduces the application and displays important source and copyright information.
Note: Be sure to read the copyright page before modifying these scripts for use in your own organization.
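The probe IF 'A'='ABC' exact_off=T in the Start script depends on how ACL compares character strings of unequal length. The Python sketch below models that behavior so the logic of the probe can be followed outside ACL. It rests on two assumptions that the text does not spell out: with exact comparisons off, only the leading characters up to the length of the shorter string are compared; with exact comparisons on, the shorter string is padded with trailing blanks first. The function names are hypothetical.

```python
def acl_equal(left: str, right: str, exact: bool) -> bool:
    """Illustrative model of an ACL character comparison (assumed semantics)."""
    if exact:
        # Assumed EXACT ON behavior: pad the shorter string with blanks,
        # then compare every position.
        width = max(len(left), len(right))
        return left.ljust(width) == right.ljust(width)
    # Assumed EXACT OFF behavior: compare only as many characters
    # as the shorter string contains.
    width = min(len(left), len(right))
    return left[:width] == right[:width]


def exact_was_off(exact_setting: bool) -> bool:
    """Mirror of the probe: 'A'='ABC' is true only when exact comparisons are off."""
    return acl_equal("A", "ABC", exact_setting)


if __name__ == "__main__":
    print(exact_was_off(False))
    print(exact_was_off(True))
    print(acl_equal("Amount", "Amount2", False))
```

Under these assumptions the probe is true only when the option is off, which is why the script can use its result to decide whether to issue SET EXACT OFF at the end. The last call shows the same effect on two similar field names.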
Fraud Menu
The Fraud Menu script gives the user the option of selecting, from a series of check boxes, one or more tests to run. Notice that the default is to exit this menu once the tests have run. To return to the Fraud Menu, select the appropriate check box.
Modifying the Fraud Menu is easy. Adding or removing menu choices is as simple as changing two lines in the script. For example, suppose the user wanted to add an option to run a script to generate confirmation letters or review user authorizations. Simply modify the dialog box in the Fraud Menu script by adding a check box for this purpose, and then add another IF ... DO statement to launch the new test whenever that check box is selected:
_Fraud_Menu
SET ECHO NONE
DIALOG (DIALOG TITLE "Fraud Menu" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 612 24 DEFAULT 1 ) (TEXT TITLE "Select one or more fraud tests:" AT 72 40 WIDTH 202 HEIGHT 22 ) (TEXT TITLE "Specify log file (optional)" AT 72 352 ) (EDIT TO "log_file" AT 240 348 WIDTH 317 ) (CHECKBOX TITLE "Completeness and Integrity" TO "menu1" AT 132 72 WIDTH 230 ) (CHECKBOX TITLE "CrossTabulation" TO "menu2" AT 132 96 ) (CHECKBOX TITLE "Duplicates" TO "menu3" AT 132 120 ) (CHECKBOX TITLE "Gaps" TO "menu4" AT 132 144 ) (CHECKBOX TITLE "Data Profile" TO "menu5" AT 132 168 ) (CHECKBOX TITLE "Ratio Analysis" TO "menu6" AT 132 192 ) (CHECKBOX TITLE "Benford Analysis" TO "menu7" AT 132 216 ) (CHECKBOX TITLE "Return to this menu when done" TO "menu8" AT 132 264 )
logout=REPLACE('%log_file%',' ','_')
IF LEN(EXCLUDE(logout,'_'))>0 SET LOG TO %logout%
SET ECHO NONE
IF menu1 DO Complete
SET ECHO NONE
IF menu2 DO Cross_Tabs
SET ECHO NONE
IF menu3 DO Dup_Multiple_Keys
SET ECHO NONE
IF menu4 DO Gaps
SET ECHO NONE
IF menu5 DO Data_Profile
SET ECHO NONE
IF menu6 DO Ratio_Analysis
SET ECHO NONE
IF menu7 DO Benford
SET ECHO NONE
IF NOT menu8 fraud=F
SET ECHO NONE
DELETE log_file OK
DELETE logout OK
DELETE menu1 OK
DELETE menu2 OK
DELETE menu3 OK
DELETE menu4 OK
DELETE menu5 OK
DELETE menu6 OK
DELETE menu7 OK
DELETE menu8 OK
SET LOG
SET ECHO ON

Log Files
The Fraud Menu script can also be used to create a different log file to store the contents of the command log. This optional feature is extremely useful when it is important to separate the fraud results from other, non-fraud-related analyses that were performed. For example, the user might want to keep the results of an audit-related analysis separate from the results of a fraud investigation. By default, the results of all analyses are written to a log file with the same name as the ACL project. By entering a different file name when running the Fraud Toolkit, the user can direct the results to a file name of their choice. In cases where the person committing the fraud is prosecuted and you are required to provide evidence, having a separate log file with only the results of the fraud analyses can be highly beneficial.
To create an alternate log file, the script performs these steps:
The Dialog command prompts the user for a command log name, and then captures and assigns the user input to a character variable named log_file.
The script modifies the user-supplied name to eliminate any spaces (ACL file names cannot have spaces). The REPLACE() function is used to change all spaces to underscores in whatever character string was typed in the dialog box:
logout=REPLACE('%log_file%',' ','_')
The script tests to see if the length of the modified name, ignoring underscores, is greater than zero characters. If the test result is true, the command log results are temporarily redirected to the specified alternate log file.
When the fraud tests have finished running and you have exited the Fraud Menu, any subsequent results are sent to the default log file. To review the results, open the alternate log file by using the Set Log command:
SET LOG TO log_file name
Then, to set the log back to the default log file, specify:
SET LOG OFF
If the application will be run several times during an investigation, the user can specify the alternate log name each time he or she selects tests from the Fraud Menu. The alternate log will then be opened and the results will be appended. Alternatively, the log can be set to an alternate file before running the fraud application, in which case there is no need for further action. The new file is then the default log file throughout the session.
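The log-name handling described above can be traced in a few lines of Python. This is only an illustration of the two script lines that build and test logout; the function name alternate_log_name is hypothetical, and the sketch does not model how ACL itself stores the contents of the dialog edit box.

```python
from typing import Optional


def alternate_log_name(user_input: str) -> Optional[str]:
    """Return the log name the menu would switch to, or None to keep the default."""
    # Step 1: spaces become underscores, as REPLACE('%log_file%',' ','_') does.
    candidate = user_input.replace(" ", "_")
    # Step 2: drop the underscores and test the remaining length, mirroring
    # LEN(EXCLUDE(logout,'_'))>0. An entry of nothing but spaces fails the test.
    if len(candidate.replace("_", "")) > 0:
        return candidate
    return None


if __name__ == "__main__":
    print(alternate_log_name("Payroll Fraud Tests"))
    print(alternate_log_name("     "))
```

Removing the underscores before measuring the length is what lets the test treat an all-blank entry as "no log file specified", since the blanks were just converted to underscores in the previous step.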
Exiting a Script
It is possible to exit a script that was launched from the menu by clicking either the Cancel button or the Close button (top, right corner of every dialog box). This will be necessary if, for example, a script is launched by mistake and you do not wish to run it. However, exiting by this method is not recommended because it interrupts the normal processing of the script application. When a script is interrupted, temporary variables, fields, or files may not be properly deleted, and other system conditions like ECHO ON may not be restored.
Note: If a script is interrupted, check the project folder for added files, check the table layouts for computed fields that were not deleted, and remove temporary variables by typing DELETE ALL OK on the command line.
Working without the Fraud Menu
All instructions given in this book assume that the user will run tests using the Fraud Menu. However, experienced users may choose to bypass the menu and run certain scripts separately by selecting them from the Project Navigator window. The scripts for the main tests will work if run individually; they each contain commands to delete any temporary variables, fields, and files that they create. The supporting scripts, called by the main test scripts, can also be run individually, but most do not perform tests when run alone. For example, Complete Dialog collects input from the user, but if the Complete script is not run first, the collected information will not be processed.
When working without the Fraud Menu, the user must ensure that the Exact Character Comparisons option is selected in ACL Preferences. If this option is not selected, tests that depend on an exact comparison may produce different results or may not run at all. If, for example, two fields, Amount and Amount2, are selected with the preference turned off, there is a risk that ACL will treat them both as the same field and only analyze the field named Amount. If the Fraud Toolkit is run from the Start script, the user doesn't have to be concerned about this option; the Start script automatically activates Exact Character Comparisons. However, if the individual scripts are run by double-clicking them, or if the user builds his or her own application, this option should be selected. Set this preference as follows:
1. From the Tools menu, select Options.
2. On the Table tab, select the Exact Character Comparisons check box:
The following chart depicts the relationships of the various scripts.
__Start
  _Fraud_Menu
    Complete
      Complete_Dialog
    Cross_Tabs
      Cross_Tabs_Dialog
    Dup_Multiple_Keys
      Dup_Dialog
      Dup_Multiple_Keys1
      KeyChange
    Gaps
      Gaps_Dialog
    Data_Profile
      Data_Profile_Dialog1
      Data_Profile_Dialog2
      Stat
      Strat1
      Strat2
      Round
      Exactmult
      Exactsum
      Freqnum
      Freqchar
    Ratio_Analysis
      Ratio_Analysis1
        Ratio1_Dialog1
        Ratio1_Dialog2
      Ratio_Analysis2
        Ratio2_Dialog1
        Ratio2_Dialog2
    Benford
      Benford_Analysis
        Benford_Dialog
      Benford_Custom_Analysis
        Benford_Custom_Dialog
Filter_Selection (optional in all major batches except Complete)
2
Completeness and Integrity
Checking for Blanks and Data Type Mismatches
The completeness and integrity of a data file is of paramount importance when dealing with potential fraud. Absent records and fields that are blank could falsely indicate fraud or cause a potential fraud to go unnoticed. For this reason, the completeness and integrity of the data analyzed should be ensured.
Key factors that determine whether the data can be relied on or whether more data integrity testing is required include the user's familiarity with the data source, the strength of general and application controls, the reliance to be placed on the data, and the existence of corroborating evidence.
Integrity errors may indicate that the table layout does not match the original data. Check for this kind of error by comparing the layout of the input file with the layout of the file as printed from the source application. You may also need to talk to your system administrator.
The Completeness and Integrity tests are typically run first on each data file you intend to investigate. The Completeness test allows the user to search for blank values in character, numeric, or date fields. When running this test, up to four key fields can be included in the search. During this test, the specified fields are surveyed for blank values, and a separate count of blanks is shown for each selected field.
The Integrity test verifies the integrity of all data in the file. It uses the Verify command to compare the contents of every field, for every record, against the field type as defined in the table layout. If the user chooses the Verify option, the entire file is searched for data type mismatches (the Integrity test). Date fields are checked for invalid dates, numeric fields are checked for invalid (nonnumeric) values, and character fields are checked for nonprintable, nonreadable data.
The count of records with mismatches is shown. If a particular record has more than one field with data integrity problems, the Verify command still only counts the record once.
There are many ways in which data can be invalid, but the two most common data problems are blanks and type mismatches. These are particularly important to identify because there can be a number of reasons why they exist. They can occur because of:
Data entry error. For example, an operator types a name in a date field and doesn’t notice.
Data definition error. For example, the field positions are set incorrectly, causing a name field to be cut in half and mislabeled as a date field.
Transmission error. For example, problems with a portion of a downloaded file result in scrambled bytes.
Data structure maintenance error. For example, the new data format adds fields not identified in the “legacy” format, but the documentation does not make this clear.
There are many other causes. The frequency of the blanks or type mismatches will vary, depending on the kind of error. For example, almost all files will contain data entry problems such as the occasional blank field. However, if the data set was defined incorrectly, there is a good chance that all of the data in a particular field will display as blanks or invalid characters.
Running Completeness and Integrity
1. From the Fraud Menu, select Completeness and Integrity. Click [OK]. The File Selection dialog box appears:
2. From the Select input file drop-down list, choose the file that you want to analyze. Click [OK]. The Parameters dialog box appears.
3. Select the field(s) to be tested. The user must select at least the Key 1 field. Optionally, the user can select the Verify check box. Click [OK]. The Results dialog box appears:
Note: The Unicode version of ACL should test for '2000' and '4000' instead of '20' and '40'. Modify the four lines of the form
COUNT IF MATCH(HEX(%keyx%),REPEAT('20',%len_keyx%),REPEAT('40',%len_keyx%))
(where x = 1, 2, 3, 4), replacing '20' with '2000' and '40' with '4000'.
4. After viewing the results, click [OK]. Results can be viewed in both the View and Command Log windows.
How the Scripts Work
The main script, Complete, checks the data for blanks and, optionally, for data integrity errors. It accomplishes the first task by converting the values of the selected character, numeric, or date fields to hexadecimal format. It then compares the field contents to the hexadecimal equivalent of a blank (hex '20'). If you are using data in EBCDIC format, it also compares the field contents to the EBCDIC equivalent of a blank (hex '40'). Using the Group command, the script counts the number of records with blank values in the selected fields.
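The hexadecimal blank test can be illustrated with a short Python sketch. It is a model of the comparison only, not of ACL itself: the raw field contents are assumed to be available as bytes, and the names is_blank and count_blank_fields are hypothetical.

```python
def is_blank(raw: bytes) -> bool:
    """True when every byte of the field is an ASCII (20) or EBCDIC (40) blank."""
    hex_value = raw.hex().upper()      # e.g. b"  " -> "2020"
    width = len(hex_value) // 2        # field length in bytes, as LEN(HEX(field))/2
    # Compare against a run of '20' or a run of '40' of the same length,
    # which is what MATCH(HEX(field), REPEAT('20', n), REPEAT('40', n)) checks.
    return hex_value in ("20" * width, "40" * width)


def count_blank_fields(records, field):
    """Count the records whose named field is entirely blank."""
    return sum(1 for record in records if is_blank(record[field]))


if __name__ == "__main__":
    sample = [
        {"gl_account": b"4010  "},
        {"gl_account": b"      "},             # ASCII blanks
        {"gl_account": bytes([0x40] * 6)},     # EBCDIC blanks
    ]
    print(count_blank_fields(sample, "gl_account"))
```

Working on the hexadecimal form is what allows one test to cover character, numeric, and date fields alike, because the comparison looks at the stored bytes rather than at the interpreted value.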
Carriage Returns
When writing to the command log, or creating other formatted output, a script often requires a carriage return or a line feed character, or both. In this toolkit, this character has been created by naming a variable “crlf” and setting it equal to CHR(10), the character whose ASCII code (10) is a line feed. It is a good idea to follow this script's example of setting variables at the beginning of the script. For one thing, variables must be initialized before they can be put to use. Having the variables all in one place also makes for easier maintenance.
Understanding the Verify Command
Notice that the script code specifies VERIFY ALL, which means that the Verify command examines all fields. You can see that an error limit of 1,000 has also been set. Consequently, the Verify command will terminate once it has encountered one thousand records with a data integrity problem, regardless of how many records are left to examine. The requirement to set a limit is a safety feature in ACL. That way, if a file has integrity problems in every field, ACL will stop processing after reaching the specified number of integrity errors rather than examining millions of damaged records for no worthwhile reason. The default error limit for Verify is 10 integrity errors. You can adjust the limit by modifying this line of the script. You can also modify the command to examine only selected fields.

Complete
SET ECHO NONE
crlf=CHR(10)
error_test=T
filt1=''
DIALOG (DIALOG TITLE "Completeness and Integrity - File Selection" WIDTH 510 HEIGHT 326 ) (BUTTONSET TITLE "&OK;&Cancel" AT 396 12 DEFAULT 1 ) (TEXT TITLE "Select input file" AT 72 220 WIDTH 103 HEIGHT 15 ) (ITEM TITLE "f" TO "infile" AT 192 216 WIDTH 250 ) (TEXT TITLE "Select an input file to test for completeness. 'Completeness' means that the selected fields are not blank." AT 72 76 WIDTH 353 HEIGHT 52 ) (TEXT TITLE "You also have the option of checking all fields for integrity (agreement of data with field type, e.g., no February 31 in a date field)." AT 72 136 WIDTH 345 HEIGHT 50 )
OPEN %infile%
DO Complete_Dialog WHILE error_test
COMMENT ******* Integrity Check *******
IF verify1 VERIFY ALL ERRORLIMIT 1000
IF verify1 writea=STR(write1,8)
COMMENT ******* Check for Blank Values *******
IF LEN(key2)=0 key2=key1
IF LEN(key3)=0 key3=key1
IF LEN(key4)=0 key4=key1
len_key1=STR(LEN(HEX(%key1%))/2,10)
len_key2=STR(LEN(HEX(%key2%))/2,10)
len_key3=STR(LEN(HEX(%key3%))/2,10)
len_key4=STR(LEN(HEX(%key4%))/2,10)
GROUP
COUNT
COUNT IF MATCH(HEX(%key1%),REPEAT('20',%len_key1%),REPEAT('40',%len_key1%))
COUNT IF MATCH(HEX(%key2%),REPEAT('20',%len_key2%),REPEAT('40',%len_key2%))
COUNT IF MATCH(HEX(%key3%),REPEAT('20',%len_key3%),REPEAT('40',%len_key3%))
COUNT IF MATCH(HEX(%key4%),REPEAT('20',%len_key4%),REPEAT('40',%len_key4%))
END
counta=STR(count2,8)
count1a=STR(count3,8)
count2a=' '
count3a=' '
count4a=' '
IF key2<>key1 count2a=STR(count4,8)
IF key3<>key1 count3a=STR(count5,8)
IF key4<>key1 count4a=STR(count6,8)
IF key2=key1 key2=' '
IF key3=key1 key3=' '
IF key4=key1 key4=' '
COMMENT ***** Results display if Integrity test run ****
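The effect of an error limit, and of counting a damaged record only once, can be sketched in Python. This is a simplified stand-in for the Verify command, not a description of its internals: the three field types, the ISO date format, the validity rules, and the names is_valid and verify_all are all assumptions made for the illustration.

```python
from datetime import datetime


def is_valid(value: str, field_type: str) -> bool:
    """Simplified validity rules, one per assumed field type."""
    if field_type == "numeric":
        try:
            float(value)
            return True
        except ValueError:
            return False
    if field_type == "date":
        try:
            datetime.strptime(value, "%Y-%m-%d")   # rejects e.g. February 31
            return True
        except ValueError:
            return False
    return value.isprintable()                      # character fields


def verify_all(records, layout, error_limit=10):
    """Count records with at least one invalid field, stopping at the limit."""
    bad_records = 0
    for record in records:
        # A record with several bad fields is still counted only once.
        if any(not is_valid(record[name], ftype) for name, ftype in layout.items()):
            bad_records += 1
            if bad_records >= error_limit:
                break                               # stop instead of reading on
    return bad_records


if __name__ == "__main__":
    layout = {"amount": "numeric", "invoice_date": "date", "vendor": "character"}
    records = [
        {"amount": "100.50", "invoice_date": "2009-02-14", "vendor": "Acme"},
        {"amount": "12x", "invoice_date": "2009-02-31", "vendor": "Bolt"},
        {"amount": "75", "invoice_date": "Smith", "vendor": "Core"},
    ]
    print(verify_all(records, layout, error_limit=1000))
```

With a low limit the loop ends early, so the reported count is a floor rather than an exact measure of the damage; raising the limit trades processing time for a complete count.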
Understanding the Group Command
To fully understand how the Complete script operates, it is necessary to understand how it uses the Group command, and how ACL assigns results to grouped variables. The Group command is essential for performing several tasks on one record before proceeding to the next record.
The Count command appears five times within the group. Inside the group, variable names are assigned according to the line number of each command in the sequence. The Group command is on line number one; therefore, the first Count command on line number two outputs to variable count2, the second to count3, and so on. Because each Count command applies only under certain conditions, the result can consist of as many as five unique counts, depending on how many fields you selected, with the first count being the number of records in the data file.
For example, suppose there are 1,000 records and the user checks two fields for blanks. The result is three unique counts. If Field1 is blank in 10 cases and Field2 is blank in 50 cases, then the result will be:
count2 = 1000
count3 = 10
count4 = 50
The script then organizes and displays these results using one of two dialog boxes, depending on whether the Verify check box has been selected. The information is also written to the command log to provide a more permanent record.
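The single-pass counting and the line-number naming of the results can be mimicked in Python. The sketch below reproduces the worked example of 1,000 records with two fields checked; it is an analogy, not ACL code, and the function name group_counts and the sample data are made up for the illustration.

```python
def group_counts(records, fields):
    """One pass over the records, several counts per record, as in a GROUP block.

    Results are named after the line each COUNT occupies in the group:
    the unconditional COUNT on line 2 gives count2, the first conditional
    COUNT on line 3 gives count3, and so on.
    """
    counts = {"count2": 0}
    for position in range(len(fields)):
        counts[f"count{position + 3}"] = 0

    for record in records:
        counts["count2"] += 1                        # every record
        for position, field in enumerate(fields):
            if record[field].strip() == "":          # blank value in this field
                counts[f"count{position + 3}"] += 1
    return counts


if __name__ == "__main__":
    # 1,000 records: Field1 blank in 10 of them, Field2 blank in 50.
    records = [
        {"Field1": "" if n < 10 else "x", "Field2": "" if n >= 950 else "y"}
        for n in range(1000)
    ]
    print(group_counts(records, ["Field1", "Field2"]))
```

The point of the analogy is that all counts are accumulated while each record is read once, which is what the Group command provides compared with running a separate Count command per field.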
Figure: Flow of Data for the Complete Script

Complete (continued)
IF verify1 DIALOG (DIALOG TITLE "Completeness and Integrity - Results" WIDTH 514 HEIGHT 392 ) (BUTTONSET TITLE "&OK;&Cancel" AT 408 12 DEFAULT 1 ) (TEXT TITLE "Completeness Test - Results" AT 72 172 ) (EDIT TO "count1a" AT 300 228 ) (EDIT TO "count2a" AT 300 264 ) (EDIT TO "count3a" AT 300 300 ) (EDIT TO "count4a" AT 300 336 ) (EDIT TO "key1" AT 72 228 WIDTH 164 ) (EDIT TO "key2" AT 72 264 WIDTH 165 ) (EDIT TO "key3" AT 72 300 WIDTH 163 ) (EDIT TO "key4" AT 72 336 WIDTH 163 ) (TEXT TITLE "Field name" AT 108 208 ) (TEXT TITLE "# of blank records" AT 300 208 WIDTH 128 ) (TEXT TITLE "Total number of records" AT 48 64 ) (EDIT TO "counta" AT 240 60 WIDTH 107 ) (TEXT TITLE "# of records with integrity problems" AT 48 112 WIDTH 186 HEIGHT 37 ) (EDIT TO "writea" AT 240 120 WIDTH 107 ) (TEXT TITLE "Results for file" AT 24 16 ) (EDIT TO "infile" AT 144 12 WIDTH 202 )
COMMENT ***** Results display if Integrity test NOT run ****
IF NOT verify1 DIALOG (DIALOG TITLE "Integrity and Completeness Test" WIDTH 509 HEIGHT 371 ) (BUTTONSET TITLE "&OK;&Cancel" AT 408 12 DEFAULT 1 ) (TEXT TITLE "Completeness Test - Results" AT 72 136 ) (EDIT TO "count1a" AT 300 192 ) (EDIT TO "count2a" AT 300 228 ) (EDIT TO "count3a" AT 300 264 ) (EDIT TO "count4a" AT 300 300 ) (EDIT TO "key1" AT 72 192 WIDTH 164 ) (EDIT TO "key2" AT 72 228 WIDTH 165 ) (EDIT TO "key3" AT 72 264 WIDTH 163 ) (EDIT TO "key4" AT 72 300 WIDTH 163 ) (TEXT TITLE "Field name" AT 108 172 ) (TEXT TITLE "# of blank records" AT 300 172 WIDTH 126 ) (TEXT TITLE "Total number of records" AT 72 76 ) (EDIT TO "counta" AT 252 72 ) (TEXT TITLE "Results for file" AT 24 16 ) (EDIT TO "infile" AT 132 12 WIDTH 179 )
COMMENT *********************Completeness and Integrity Results ***********
SET ECHO ON
IF verify1 DISPLAY crlf+crlf+crlf+'*******************************************************'+crlf+'******* Completeness and Integrity results for '+infile+crlf+'******* total records ='+counta+crlf+'******* for field '+key1+' '+count1a+' blank records'+crlf+'******* for field '+key2+' '+count2a+' blank records'+crlf+'******* for field '+key3+' '+count3a+' blank records'+crlf+'******* for field '+key4+' '+count4a+' blank records'+crlf+'*******'+crlf+'******* '+writea+' records with integrity problems'+crlf+'*******************************************************'+crlf+crlf+crlf
IF NOT verify1 DISPLAY crlf+crlf+crlf+'*******************************************************'+crlf+'******* Completeness and Integrity results for '+infile+crlf+'******* total records ='+counta+crlf+'******* for field '+key1+' '+count1a+' blank records'+crlf+'******* for field '+key2+' '+count2a+' blank records'+crlf+'******* for field '+key3+' '+count3a+' blank records'+crlf+'******* for field '+key4+' '+count4a+' blank records'+crlf+'******* '+crlf+'*******************************************************'+crlf+crlf+crlf
SET ECHO NONE
COMMENT ****** Delete Variables *******
DELETE write1 OK
DELETE writea OK
DELETE count1 OK
DELETE count2 OK
DELETE count3 OK
DELETE count4 OK
DELETE count5 OK
DELETE count6 OK
DELETE counta OK
DELETE count1a OK
DELETE count2a OK
DELETE count3a OK
DELETE count4a OK
DELETE key1 OK
DELETE key2 OK
DELETE key3 OK
DELETE key4 OK
DELETE len_key1 OK
DELETE len_key2 OK
DELETE len_key3 OK
DELETE len_key4 OK
DELETE verify1 OK
DELETE crlf OK
DELETE error_test OK
DELETE filt1 OK
SET ECHO ON

Deleting Temporary Variables, Fields, and Files
The final step is cleanup. As noted in Chapter 1, deleting variables at the end of a script can be very important, particularly in a multiscript application like the Fraud Toolkit. The following is a list of the temporary variables, fields, and files that the Complete script automatically deletes.
Variables for Complete
count1–6
count1a–4a
counta
crlf
error_test
filt1
infile
key1–key4
len_key1–4
verify1
write1
writea

Review and Analysis
Once the script has run, the user must review the specific errors and decide how to proceed. Since the completeness and integrity of the data is critical to obtaining valid results, the user must judge whether the data is sufficiently valid before continuing the analysis.
In some cases, it may be enough to create filters that set aside the damaged records and continue with your analysis. For example, to remove records with invalid dates in the Invoice_Date field, create a filter, VERIFY(Invoice_Date). In other cases, it may be necessary to investigate why the records are damaged. Examine the results and determine whether the missing data is essential to the fraud analysis. This also is a good time to compare the data to source documents, printouts, or other electronic sources. The user should also verify the control totals, that is, the number of records and totals for key numeric fields.
Finally, it is important to review the results of the Integrity test to determine if the data is sufficiently free of invalid entries to permit additional analysis. If there are large quantities of bad data and the user wants to get an exact measure of the problems in each field, the Verify command can be run on one or more specific fields, verifying each separately to obtain the exact error count. The VERIFY() function can also be used to create a filter that removes records with data integrity problems.
Depending on the situation, it is often possible to overcome data integrity problems yourself. However, if a significant number of records in the file have integrity problems and they do not seem to be the result of simple data entry error, the first step is normally to go to the source: the system administrator or whoever is responsible for maintaining the original data. Make certain you have an up-to-date copy of the data dictionary, that is, the intended layout of the data file in terms of field and record lengths, data types, and so on. The next step is to examine the table layout and compare its content to the original data dictionary. Often the problem will become obvious at this point, and the user can simply change the table layout to fit the actual physical layout of the data. The ACL for Windows User Guide contains more than 60 pages of step-by-step guidance on how to define a file and edit the table layout, including how to correct problems encountered during the process.
If one must start over, the Data Definition Wizard is very adaptable and can significantly reduce the effort involved in defining the data files.
Complete_Dialog
DIALOG (DIALOG TITLE "Completeness and Integrity Parameters" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 624 12 DEFAULT 1 ) (TEXT TITLE "Select fields to test for blank values (Key 1 must always be selected)" AT 24 88 WIDTH 509 HEIGHT 25 ) (TEXT TITLE "Completeness Test - checks fields for blank values" AT 24 40) (TEXT TITLE "Key 1" AT 72 124 ) (TEXT TITLE "Key 2" AT 240 124 ) (TEXT TITLE "Key 3" AT 420 124 ) (TEXT TITLE "Key 4" AT 612 124 ) (ITEM TITLE "CND" TO "key1" AT 12 156 WIDTH 166) (ITEM TITLE "CND" TO "key2" AT 192 156 WIDTH 166 ) (ITEM TITLE "CND" TO "key3" AT 372 156 WIDTH 166 ) (ITEM TITLE "CND" TO "key4" AT 552 156 WIDTH 170 ) (TEXT TITLE "Integrity Test - verifies the integrity of fields in data file" AT 24 268 ) (CHECKBOX TITLE "Verify" TO "verify1" AT 24 300 )
IF LEN(key1)=0 PAUSE 'You must always select Key 1'
IF LEN(key1)>0 error_test=F

Case Study: General Ledger Accounts Unaccounted For
Automated controls over the use of certain general ledger (GL) accounts were very tight because the accounts were not subject to management review. The system only allowed users to code transactions against specific accounts and no others. However, the system did not include an edit check for blank GL account values. One of the accounts receivable clerks had been using this to his advantage. He was entering transactions against blank GL accounts, bypassing the controls. He created credit memos for a customer in exchange for kickbacks, and coded the offsetting entry to the blank GL account. The scheme was working perfectly. That is, until the auditors ran the Completeness and Integrity tests and checked for blank values. The results were surprising, but 100 percent accurate, since every transaction was reviewed. A filter was applied to show only the records with blank GL accounts, and a summary of accounts processed by accounts receivable clerk was also created. The test revealed that the same clerk had entered all the records with blank GL accounts. The fraud investigation was now in full gear.
3
Cross-Tabulation
Organizing Your Data to Find Trends
Making sense of the data often means finding the best way to look at it. Cross-tabulation is a method that makes data easier to view. Data is often more understandable when presented in a two-dimensional layout, such as a table. Displaying a cross-tabulated data file in the ACL View window is particularly useful because the user can apply ACL commands to the fields of the view, making it easy to ask additional questions about the data.
To be suitable for cross-tabulation, a data file must include at least two character fields. The distinct values from one field will provide headings for the columns of the table (the x-axis), while the values from the second field will provide headings for the rows (the y-axis). For each cell (row-column combination), a selected numeric field will be totaled. The more distinct values there are, the more rows and columns in the resulting table. It is possible to cross-tabulate more than two fields at once, creating a table in three or more dimensions. In this toolkit, however, the number of dimensions has been limited to two.
There are two ways to cross-tabulate two fields. Consider a payroll file containing 1,000 employees who are paid in three categories: regular, overtime, and holiday pay. The total pay for each employee in each category could be tabulated with three rows (regular, overtime, and holiday), each with 1,000 columns (one per employee), or with 1,000 rows (one per employee), each with three columns (regular, overtime, and holiday). Most people prefer to read data in a series of short rows instead of a few long rows. So, if you know which one of the fields has fewer categories, it is usually better to use that field as the x-axis (your column headings). In the payroll example, most users would choose 1,000 rows, each with three columns. When necessary, it is possible to obtain an exact count of the categories present in a given field beforehand by using the Classify command to count them.
After selecting the character fields that will serve as the x- and y-axes, the user must also select a third field containing the specific value to analyze, such as units sold, sales revenue, costs, labor hours, or other numeric data. This value will be accumulated in a total.
Running Cross-Tabulation
1. From the Fraud Menu, select Cross-Tabulation. Click [OK]. The File Selection dialog box appears:
2. From the Select input file drop-down list, choose the file to analyze. In the Specify output file text box, enter the name of the output file. To apply a filter, select the Apply filter check box. Click [OK]. The Field Selection dialog box appears:
3. Select the desired fields from the X-axis (horizontal), Y-axis (vertical), and Sum fields drop-down lists. The user must select a field from each of the three drop-down lists. Click [OK].

The cross-tabulated output is saved as a new data file with its own table layout. Open it in the View window to begin your additional analysis.

As the following diagram illustrates, there are three steps to creating a cross-tabulation. The user selects the fields to be used for the x- and y-axes. The actual data values of the field selected for the x-axis become the horizontal field names in the cross-tabulation. The values for the y-axis are derived from the data in the field selected for the y-axis. Finally, the selected numeric field is totaled for each row-column combination.

[Figure: The three steps of cross-tabulation, starting from two character fields and one numeric field. Step 1. Creating the X-Axis. Step 2. Creating the Y-Axis. Step 3. Populating the Table.]
Cross_Tabs
SET ECHO NONE
SET SAFETY OFF
DELETE Xtab.wsp OK
DELETE Xtab_Temp.fil OK
DELETE FORMAT Xtab_Temp OK
error_test=T
DIALOG (DIALOG TITLE "Cross-Tabulation - File Selection" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (ITEM TITLE "f" TO "infile" AT 192 120 WIDTH 310 HEIGHT 84 ) (TEXT TITLE "Select input file" AT 72 124 ) (TEXT TITLE "Specify output file (no spaces)" AT 72 208 WIDTH 108 HEIGHT 40) (EDIT TO "useroutfile" AT 192 216 WIDTH 310 DEFAULT "Xtab_File" ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 192 312 )
outfile=REPLACE(useroutfile,' ','_')
OPEN %infile%
IF filt_sel DO Filter_Selection
DO Cross_Tabs_Dialog WHILE error_test=T
EXTRACT %y_axis% %x_axis% %num_field% TO Xtab_Temp OPEN
DEFINE FIELD Xtab_Field COMPUTED SUBSTR(REPLACE(TRIM(INCLUDE(UPPER(%X_axis%), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0987654321_ ')), ' ','_'),1,30)
SUMMARIZE ON Xtab_Field AS '%x_axis%' ACC 1 TO Xtab_Temp1 PRESORT OPEN
DEFINE FIELD X_Names COMPUTED 'X_'+SUBSTR(%x_axis%,1,30)
GROUP
  EXPORT ASCII to xtab.wsp X_NAMES + ' comp'
  EXPORT ASCII to xtab.wsp NUM_FIELD + ' if Xtab_Field ="' + SUBSTR(REPLACE(TRIM(INCLUDE(UPPER(%X_axis%),' ABCDEFGHIJKLMNOPQRSTUVWXYZ0987654321_ ')), ' ','_'),1,30) + '"'
  EXPORT ASCII to xtab.wsp '0.00'
END
OPEN Xtab_Temp
ACTIVATE Xtab
SUMMARIZE ON %y_axis% ACC ALL TO %outfile% PRESORT OPEN
DELETE count OK
DELETE error_test OK
DELETE Xtab_Temp.fil OK
DELETE FORMAT Xtab_Temp OK
DELETE Xtab_Temp1.fil OK
DELETE FORMAT Xtab_Temp1 OK
DELETE filt_sel OK
DELETE filt1 OK
DELETE useroutfile OK
DELETE outfile OK
DELETE infile OK
DELETE x_axis OK
DELETE y_axis OK
DELETE num_field OK
DELETE Xtab.wsp OK
DELETE Xtab_Field OK
DELETE X_Names OK
SET ECHO ON

Benefits of Cross-Tabulation
Cross-tabulation can lead to a huge variety of findings, depending on what patterns appear in the resulting table. Examples include:
Expense categories by department
Product sales categories by region or store or salesperson
Inventory categories by warehouse or division

The following example is taken from a payroll data file in which employees are paid regular, overtime, or holiday pay. It demonstrates how cross-tabulation can aid data analysis.

Employee     Pay Type    Amount
Jones        Regular     $1,213.00
Jones        Overtime    $251.97
Blackwell    Regular     $568.32
Blackwell    Overtime    $325.00
Blackwell    Overtime    $300.00
Summers      Overtime    $100.10
Summers      Overtime    $289.23
Wilson       Holiday     $534.21

The Cross_Tabs script produces a table with the amounts for each pay type by employee:

Employee     Regular      Overtime    Holiday
Blackwell    $568.32      $625.00     $0.00
Jones        $1,213.00    $251.97     $0.00
Summers      $0.00        $389.33     $0.00
Wilson       $0.00        $0.00       $534.21

Normally, no employee would receive either overtime or holiday pay without also receiving regular pay. It is clear that both Wilson and Summers violate the rule about regular pay. The next step is to identify all employees who violated the rule, and determine why. All employees with questionable payments of this kind can be isolated simply by applying the filter:
Regular = 0.00 and (Overtime > 0 or Holiday > 0)
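For readers who want to check the logic outside ACL, the payroll example can be reproduced with a short Python sketch. This is only an illustration of the technique, not the toolkit's script: the function name cross_tabulate and the variable names are hypothetical, and the records are the eight payroll rows listed in the example.

```python
from collections import defaultdict

# Payroll records from the example: (employee, pay type, amount).
records = [
    ("Jones", "Regular", 1213.00),
    ("Jones", "Overtime", 251.97),
    ("Blackwell", "Regular", 568.32),
    ("Blackwell", "Overtime", 325.00),
    ("Blackwell", "Overtime", 300.00),
    ("Summers", "Overtime", 100.10),
    ("Summers", "Overtime", 289.23),
    ("Wilson", "Holiday", 534.21),
]


def cross_tabulate(rows):
    """Total the amount for each (y value, x value) pair; empty cells stay 0.00."""
    # Step 1: the distinct x-axis values become the column headings.
    columns = sorted({x for _, x, _ in rows})
    # Step 2: each distinct y-axis value gets a row with one cell per column.
    table = defaultdict(lambda: dict.fromkeys(columns, 0.0))
    # Step 3: accumulate the numeric field into the matching cell.
    for y, x, amount in rows:
        table[y][x] += amount
    return {y: {x: round(v, 2) for x, v in cells.items()}
            for y, cells in table.items()}


table = cross_tabulate(records)

# Python equivalent of the filter: Regular = 0.00 and (Overtime > 0 or Holiday > 0)
suspects = sorted(
    name for name, pay in table.items()
    if pay["Regular"] == 0 and (pay["Overtime"] > 0 or pay["Holiday"] > 0)
)
print(suspects)
```

The suspects list contains Summers and Wilson, the two employees who received overtime or holiday pay without any regular pay.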
How the Scripts Work
The Cross_Tabs script is one of the more complex scripts in the toolkit. In order to produce a cross-tabulation, ACL must create a computed field for each x-axis value. For the names of the computed fields, the script uses the actual data values contained in the field selected as the x-axis. In the payroll example, there are three types of pay (regular, overtime, and holiday) and three corresponding field columns. The script uses a workspace, a storage file for computed fields used by any number of table layouts, to dynamically assign field names to each of the columns. The workspace also computes the total of the selected numeric field for each of the newly defined columns. This creates several challenges. Field names in ACL must follow certain rules. Also, there is a maximum record size that ACL can manage.
Challenges of Cross-Tabulation
The two main challenges of creating a cross-tabulation script in ACL are:
Generating appropriate x-axis category labels to serve as field names.
Correctly handling the ACL workspace where the table fields are created.
Cross_Tabs_Dialog
DIALOG (DIALOG TITLE "Cross-Tabulation Field Selection" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 624 24 DEFAULT 1 ) (TEXT TITLE "Select a field from each of the three drop-down lists:" AT 36 64 WIDTH 378 ) (TEXT TITLE "X-Axis (horizontal)" AT 72 136 ) (TEXT TITLE "Y-Axis (vertical)" AT 300 136 ) (TEXT TITLE "Sum field" AT 576 136 ) (ITEM TITLE "C" TO "X_Axis" AT 36 180 WIDTH 195 HEIGHT 187 ) (ITEM TITLE "C" TO "Y_Axis" AT 264 180 WIDTH 202 HEIGHT 190 ) (ITEM TITLE "N" TO "Num_field" AT 504 180 WIDTH 199 HEIGHT 187 )
IF LEN(x_axis)>0 AND LEN(y_axis)>0 AND LEN(num_field)>0 error_test=F
IF MATCH(0,LEN(x_axis),LEN(y_axis),LEN(num_field)) PAUSE 'You must select X-Axis, Y-Axis, and Sum field'

X-Axis Labels
In creating a table, the Cross_Tabs script sets up as many new temporary fields as there are unique values in the x-axis field. These field names are the column names in the cross-tabulation table. In the pay example, the field names created are Regular, Overtime, and Holiday. The workspace not only defines the names of the fields, but also supplies the information that allows ACL to determine the values for each record. In the example introduced earlier in this chapter, the values for the field Regular are calculated as follows. The second statement applies if the first does not:
Regular = Amount if Pay type = 'Regular'
Regular = 0.00

However, since ACL has a maximum length for field names and forbids certain characters, you cannot simply use the contents of a field in their current format. The Cross_Tabs script uses “X_” followed by the adjusted value of the character field as the name of the field for the x-axis. In the payroll example, the types of pay are used as field names on the x-axis, for example, Regular, Overtime, and Holiday. In many cases, however, the field will contain commas, colons, dollar signs, or other inadmissible characters. The script converts each x-axis value it finds using this expression:
SUBSTR(REPLACE(TRIM(INCLUDE(UPPER(%X_axis%), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0987654321_ ')), ' ','_'),1,30)
The expression converts the field value to uppercase, keeps only the permitted characters (letters, digits, underscores, and spaces), and trims any spaces to the right of the last character. It then replaces any remaining spaces between words with underscores. Finally, it limits the string to 30 characters. If the original field width is greater than 30 characters, multiple x-axis fields with the same name may be created, even though the original values are different. For example, suppose the x-axis is Supplier and two of the supplier names are:
Alliston-Williamson Consolidated Professional Services (length = 54)
Alliston-Williamson Consolidated Prof Services (length = 46)

When the field names are created for the x-axis, both supplier names will be shortened to 32 characters (the X_ prefix plus 30 characters), and the two created fields would have the same name:
X_ALLISTONWILLIAMSON_CONSOLIDATE
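The conversion can be mimicked step by step in Python to see why the two supplier names collide. The sketch below is an approximation of the ACL expression, not a substitute for it; the function name x_axis_label is hypothetical, and each line is annotated with the ACL function it is meant to imitate.

```python
# Characters the ACL expression keeps: letters, digits, underscore, and space.
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_ ")


def x_axis_label(value: str, max_len: int = 30) -> str:
    """Approximate the toolkit's field-name conversion for one x-axis value."""
    upper = value.upper()                                 # UPPER()
    kept = "".join(ch for ch in upper if ch in ALLOWED)   # INCLUDE()
    trimmed = kept.rstrip(" ")                            # TRIM()
    underscored = trimmed.replace(" ", "_")               # REPLACE()
    return "X_" + underscored[:max_len]                   # SUBSTR() plus the X_ prefix


long_name = "Alliston-Williamson Consolidated Professional Services"
short_name = "Alliston-Williamson Consolidated Prof Services"

print(x_axis_label(long_name))
print(x_axis_label(short_name))
```

Both calls print X_ALLISTONWILLIAMSON_CONSOLIDATE, which confirms that distinct values longer than 30 characters can map to the same field name.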
ACL does not allow two fields to have the same name. To avoid this problem, the script summarizes the first 30 characters of the x-axis field. You can check the width of your fields by choosing Table Layout from the Edit menu in ACL.

Another point to keep in mind when cross-tabulating is that the resulting tables can be enormous. If you cross-tabulated 100,000 customer names with their addresses, the resulting table would have 10 billion distinct entries, with 99.999 percent of the entries having nothing in them. These results are not only uninformative, but also can pose problems for the computer’s memory. Ten billion zeros (stored by ACL at 12 bytes apiece, or about 120 GB in total) are more than enough to fill a 100 GB hard drive. Also, while ACL can deal with unlimited numbers of records, the size of each record is limited to 32,767 characters, so cross-tabulations involving thousands of categories in the x-axis are likely to fail.
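The arithmetic behind these limits is easy to check before running a large cross-tabulation. The following Python sketch uses only the figures quoted above (12 bytes per cell and a 32,767-character record limit); the function names are hypothetical, and the column estimate assumes, as a simplification, that every computed column occupies 12 characters and ignores the width of the y-axis field.

```python
BYTES_PER_CELL = 12         # storage per cell, as quoted in the text
MAX_RECORD_LENGTH = 32_767  # record-size limit, as quoted in the text


def crosstab_cells(x_categories: int, y_categories: int) -> int:
    """Number of cells in the cross-tabulated table."""
    return x_categories * y_categories


def estimated_bytes(x_categories: int, y_categories: int) -> int:
    """Rough storage needed if every cell is stored at BYTES_PER_CELL bytes."""
    return crosstab_cells(x_categories, y_categories) * BYTES_PER_CELL


def max_columns(bytes_per_cell: int = BYTES_PER_CELL) -> int:
    """Upper bound on x-axis categories under the simplifying assumption above."""
    return MAX_RECORD_LENGTH // bytes_per_cell


print(crosstab_cells(100_000, 100_000))   # cells for 100,000 names by 100,000 addresses
print(estimated_bytes(100_000, 100_000))  # bytes needed to store them
print(max_columns())                      # rough ceiling on x-axis categories
```

Under these assumptions the record limit is reached at roughly 2,700 x-axis categories, which is consistent with the warning about thousands of categories.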
Workspaces
A workspace is a special file in an ACL Project that stores field definitions and expressions. A workspace is not associated with any specific table layout. A field definition that is stored in a workspace can be reused with many different files and table layouts. These field definitions can be computed using field names from a table layout, or they can be created directly from the source file. The workspace can be used with other table layouts and source files if:
All computed fields (expressions) in the workspace refer only to field names that exist in the new table layout.
All fields in the workspace that are defined from a source file correspond exactly to the record layout in the new source file.
For example, consider a table layout containing two numeric fields, Quantity and Unit Price. A workspace that is based on this table layout might include a computed field (expression) with the name Total, defined as Quantity * Unit Price. This workspace could then also be activated to work with other table layouts as long as they have numeric fields named Quantity and Unit Price. When a workspace is activated with a table layout, the fields it defines are available for use as if they formed part of that table layout. The default view can be modified to display all the fields defined in the table layout and all of the fields defined in the workspace together.

Workspaces save time. The user does not have to create a whole new set of field definitions for each new data file encountered. For example, a workspace might include a computed field that converts abbreviations for states into the full name for each. When used to store regular field definitions, as opposed to computed fields, a workspace can also assist the user in dealing with multiple-record-type files. The definition for each record type can be stored in a separate workspace; when the user wants to process records of a specific type, they can activate the related workspace.
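The idea of a workspace, computed-field definitions stored apart from any one table and reusable wherever the required field names exist, can be sketched in Python. This is a conceptual analogy only and does not reflect how ACL stores .wsp files; the names workspace and activate and the field name Unit_Price are hypothetical.

```python
# A "workspace": computed-field definitions kept apart from any one table.
# Each entry maps a new field name to (required input fields, expression).
workspace = {
    "Total": (("Quantity", "Unit_Price"),
              lambda rec: rec["Quantity"] * rec["Unit_Price"]),
}


def activate(workspace, table):
    """Return the table's records extended with the workspace's computed fields.

    Raises KeyError if a record lacks a field that an expression refers to,
    mirroring the rule that a workspace only works with layouts that contain
    the field names it uses.
    """
    extended_table = []
    for rec in table:
        extended = dict(rec)
        for name, (required, expression) in workspace.items():
            missing = [field for field in required if field not in rec]
            if missing:
                raise KeyError(f"{name} needs missing field(s): {missing}")
            extended[name] = expression(rec)
        extended_table.append(extended)
    return extended_table


orders = [{"Item": "A", "Quantity": 3, "Unit_Price": 2.5}]
invoices = [{"Invoice": "I-1", "Quantity": 10, "Unit_Price": 1.2}]

# The same definition is reused with two different tables.
print(activate(workspace, orders))
print(activate(workspace, invoices))
```

A table without a Quantity or Unit_Price field would be rejected, which corresponds to the first of the two conditions listed above.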
Deleting Temporary Variables, Fields, and Files
As part of its cleanup, the Cross_Tabs script deletes the following temporary variables, fields, and files:

Variables for Cross_Tabs
crlf
error_test
filt
filt_sel
infile
num_field
outfile
useroutfile
x_axis
X_Names
Xtab_Field
y_axis
Xtab_Temp.fil
Xtab_Temp1.fil
Xtab.wsp
[Figure: Using a workspace. Step 1. Creating computed fields. Step 2. Collecting computed fields into workspace (.wsp) file. Step 3. Applying workspace to other files.]

Review and Analysis
Using a cross-tabulated view, the user can run ACL commands and other tests from the Fraud Toolkit to investigate the data in many different ways. Here are some examples:
Examine the data file to look for unusual records. Sometimes you do not expect every cell to contain values for the numeric field. Other times, every cell should contain a value. Unexplained “holes” (empty cells) in the table or rows with too many filled cells may indicate that fraud has been committed.
Use the Statistics and Stratify options from the Test Selection dialog box in Data Profile to determine which location has the highest expense in each category.
Join your cross-tabulated file to a file containing total revenue figures for each location, and then use the Ratio Analysis Num1/Num2 test to assess location efficiency: Which location has the highest rent expense as a proportion of revenue? Which locations have a high proportion of discretionary expenses? Are any locations unusually low in their expenses as a percentage of sales?
Like many kinds of fraud analysis, cross-tabulation becomes even more effective when carried out consistently over longer periods of time. A larger time frame increases the likelihood that a trend or correlation is not just a coincidence. You can ask: Are the locations with relatively high or low expenses the same from year to year, or do they vary?
Case Study: Not Enough Clients
All help desk analysts were expected to respond to at least 75 calls per day from among the 50 client areas; bonuses depended on it. The calls could be incoming (a client phoning the help desk) or outgoing (returning a client’s call).

Every month, the auditors would check the total call volume for each analyst to see if volume quotas had been met. However, there was no analysis to see how many client areas were served. This year, by running a cross-tabulation, the auditors produced a table with the volume of client calls by client area (horizontal axis) by analyst (vertical axis). The results showed that most analysts had answered calls from 30 to 40 client areas. However, one analyst had responded only to calls from a few client areas for the entire month. A check of the detailed phone logs revealed that more than 59 percent of the calls were outgoing, and that most of them were to one phone number. When confronted with the report, the analyst admitted to speaking on the phone with a friend, sometimes for as much as three hours per day.

Case Study: Calling Cards
Salespeople were given corporate calling cards to make long-distance calls while travelling. Each week a cross-tabulation analysis was produced to show, for each salesperson (vertical axis), the total number of calls by state (horizontal axis). Since the salespeople tended to travel only within their sales district, usually within a single state, the auditors were alarmed to find that one salesperson’s card had been used in 23 states the previous week. Further analysis proved that he had given his calling card number out to friends who were using it to make long-distance calls.
4
Duplicates
Finding Higher-Risk Items
It is essential to have a clear understanding of the data to properly assess the results of the Duplicates test, as there may be a number of valid reasons for duplicate records to exist. For example, duplicate employee numbers may show up in a payroll file if overtime checks are paid separately from regular paychecks. Or the accounts payable system may include partial payments that look deceptively like duplicate transactions. Adjusting entries, payment stoppages, and credit transactions are other common examples of legitimate duplicate transactions.

The Duplicates scripts enhance the power of the Duplicates command to find higher-risk duplicates, that is, records more likely to be fraudulent. The Duplicates test can be run on one to four key fields. It also gives the option to select an additional “unique” key. Ordinarily, a search for duplicates returns all records in which the key field is not unique. These scripts go one step further by allowing you to specify an additional criterion using a technique commonly known as “same same different.” The main script, Dup_Multiple_Keys, presorts the file, and then checks for two or more records with the same value in the key fields. A subscript, Dup_Multiple_Keys1, checks for cases where two or more records are otherwise identical (keys one to four match) and the contents of the fifth field are different. Examples of tests for possible duplicates include:

Payroll
Same direct deposit number, but different employee number.
Same employee number, but different work department.

Accounts Payable
Same purchase order number, but different vendor number.
Same vendor number, date, and amount.
Same contract number and date.
Same invoice number, amount, and date, but different vendor number.
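The “same same different” idea can be illustrated with a few lines of Python. This is a hedged sketch of the technique rather than the toolkit's implementation: the function name same_same_different, the field names, and the sample invoices are all hypothetical.

```python
from collections import defaultdict


def same_same_different(records, same_keys, different_key):
    """Return records that match on same_keys but differ on different_key."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in same_keys)].append(rec)

    flagged = []
    for recs in groups.values():
        # Only groups holding more than one distinct value in the
        # "different" field are of interest.
        if len({rec[different_key] for rec in recs}) > 1:
            flagged.extend(recs)
    return flagged


# Hypothetical invoice data for illustration only.
invoices = [
    {"invoice": "INV-100", "amount": 500.00, "vendor": "V1"},
    {"invoice": "INV-100", "amount": 500.00, "vendor": "V2"},  # same, same, different
    {"invoice": "INV-200", "amount": 75.00, "vendor": "V3"},
    {"invoice": "INV-200", "amount": 75.00, "vendor": "V3"},   # plain duplicate, same vendor
    {"invoice": "INV-300", "amount": 20.00, "vendor": "V4"},
]

flagged = same_same_different(invoices, ["invoice", "amount"], "vendor")
print(flagged)
```

Only the two INV-100 records are flagged. The INV-200 pair is an ordinary duplicate with the same vendor, so the additional criterion filters it out.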
Running Duplicates
1. From the Fraud Menu, choose Duplicates. Click [OK]. The File Selection dialog box appears.
2. From the Select input file drop-down list, choose the file to analyze. In the Specify output file text box, enter the name of the output file. If a filter is to be applied, select the Apply filter check box. Click [OK]. The Parameters dialog box appears.
3. Select the fields to test, a minimum of one to a maximum of four. The user also has the option of selecting a field that is expected to contain a different, nonduplicated value from the Key 5 drop-down list, and of choosing whether a summary file should be produced. Click [OK]. The Results dialog box appears.
4. After viewing the results, click [OK]. The count of duplicates found is displayed in the Results dialog box and written to the command log. The duplicate records are written to the specified file. The dialog box presents a summary of what was found. A new view is created, providing a detailed list of your duplicates.
5. If the user selected Produce summary results, then a file containing one record from each set of duplicates is produced. The file will have the same name as the duplicates result file, prefixed with “Sum_”.
How the Scripts Work
The Dup_Multiple_Keys script governs the Duplicates test, calling upon various subscripts as needed. The first task is to determine which test will be performed, and on what file. Once an input file is chosen, there are two ways that duplicates testing can proceed. If the optional unique key is left blank, the Dup_Multiple_Keys script executes the Duplicates command, using the key fields identified by the user. However, if the unique key is entered, then the script calls the subscript named Dup_Multiple_Keys1. The inclusion of the unique key changes both the nature of the test and the results. By requiring that this field contain a different value, the user can eliminate trivial, routine duplicates such as overtime versus regular pay, and highlight transactions that have a higher risk of being fraudulent.

Note: A simple test is used to determine which key fields are active. If the length of the contents of the key field box is 0, meaning the box is empty, that key field is ignored.
The Role of Subscripts
The main script, Dup_Multiple_Keys, relies on three subscripts: Dup_Multiple_Keys1, Dup_Dialog, and KeyChange. A subscript is any script called by another script. There are a number of advantages to using subscripts. The first is economy. For example, the Filter_Selection subscript, which can be called by every main script in this toolkit except the Complete script, automates a repetitive and simple task. Each time this seven-line script is called by another script, it saves the script designer the trouble of typing additional commands. In the Duplicates test, the main reason for using subscripts is to reduce the complexity of the code and break the processing flow into discrete parts. The use of subscripts simplifies the coding required to handle the variations in the commands in these situations.

[Figure: Flow of Data in Dup_Multiple_Keys]

Dup_Multiple_Keys
SET ECHO NONE
SET SAFETY OFF
filt1=''
crlf=CHR(10)
CLOSE SECONDARY
error_test=T
DIALOG (DIALOG TITLE "Duplicates - File Selection" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 24 DEFAULT 1 ) (TEXT TITLE "Select input file" AT 36 112 ) (ITEM TITLE "f" TO "infile" AT 168 108 WIDTH 336 HEIGHT 120 ) (TEXT TITLE "Specify output file (no spaces)" AT 36 220 WIDTH 123 HEIGHT 36 ) (EDIT TO "dupout1" AT 168 228 WIDTH 338 DEFAULT "Dup_Results" ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 168 360 ) (CHECKBOX TITLE "Produce summary results" TO "dupsum" AT 168 324 )
dupout=REPLACE('%dupout1%',' ','_')+'.fil'
OPEN %infile%
IF filt_sel DO Filter_Selection
DO Dup_Dialog WHILE error_test=T
IF LEN(key5)<1 DUP ON %key1% %key2% %key3% %key4% OTHER ALL TO %dupout% PRESORT OPEN
IF LEN(key5)>0 DO Dup_Multiple_Keys1
COUNT
dupnum=STR(count1,12)
DIALOG (DIALOG TITLE "Duplicates Test Results" WIDTH 440 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 336 24 DEFAULT 1 ) (EDIT TO "dupnum" AT 132 120 ) (TEXT TITLE "Number of duplicate records" AT 108 88 ) (TEXT TITLE "See file %dupout% for more details." AT 72 184 )
DISPLAY crlf+crlf+crlf+'*******************************************************'+crlf+'******* Duplicates results for '+infile+crlf+'******* Output File '+dupout+' created'+crlf+'*******'+crlf+'******* '+dupnum+' duplicates detected. '+crlf+'**************************************'+crlf+crlf
SET ECHO NONE
OPEN %infile%
DELETE hex_%key1% OK
DELETE hex_%key2% OK
DELETE hex_%key3% OK
DELETE hex_%key4% OK
DELETE hex_%key5% OK
OPEN %dupout%
dup_key='hex(%key1%)'
if len(key2)>0 dup_key=dup_key+' '+'hex(%key2%)'
if len(key3)>0 dup_key=dup_key+' '+'hex(%key3%)'
if len(key4)>0 dup_key=dup_key+' '+'hex(%key4%)'
if len(key5)>0 dup_key=dup_key+' '+'hex(%key5%)'
dup_key_sum=' '
dup_key_sum=dup_key_sum+'%key1% '
if len(key2)>0 dup_key_sum=dup_key_sum+'%key2% '
if len(key3)>0 dup_key_sum=dup_key_sum+'%key3% '
if len(key4)>0 dup_key_sum=dup_key_sum+'%key4% '
if len(key5)>0 dup_key_sum=dup_key_sum+key5
if dupsum SUMMARIZE on %dup_key% other %dup_key_sum% acc 1 to ztemp2 PRESORT OPEN
if dupsum EXTRACT %dup_key_sum% count to Sum_%dupout% open
if dupsum COUNT
dupnumsum=str(count1,12)
if dupsum DIALOG (DIALOG TITLE "Duplicates Test - Summary Results" WIDTH 440 HEIGHT 300) (BUTTONSET TITLE "&OK;&Cancel" AT 336 24 DEFAULT 1) (EDIT TO "dupnumsum" AT 132 120) (TEXT TITLE "Number of summary duplicate records" AT 72 88) (TEXT TITLE "See file Sum_%dupout% for more details." AT 72 184)
if dupsum DISPLAY crlf+crlf+crlf+'******************************************************'+crlf+'******* Duplicates - Summary results for '+infile+crlf+'******* Output File Sum_'+dupout+' created'+crlf+'*******'+crlf+'******* '+dupnumsum+' duplicates detected.'+crlf+'**************************************'+crlf+crlf
SET ECHO NONE
CLOSE SECONDARY
DELETE dupout OK
DELETE dupout1 OK
DELETE count1 OK
DELETE dupnum OK
DELETE dupnumsum OK
DELETE keyn OK
DELETE key1 OK
DELETE key2 OK
DELETE key3 OK
DELETE key4 OK
DELETE key5 OK
DELETE filt1 OK
DELETE error_test OK
DELETE filt_sel OK
DELETE Zdup_Temp.fil OK
DELETE Ztemp1.fil OK
DELETE Ztemp2.fil OK
DELETE FORMAT Zdup_Temp OK
DELETE FORMAT Ztemp1 OK
DELETE FORMAT Ztemp2 OK
SET SAFETY ON
SET ECHO ON

Case Study: Duplicate Payments
The accounts payable clerk of ABC Limited, one of XYZ’s large vendors, called XYZ’s audit manager with a concern: An accounts receivable clerk at XYZ had called to say that he had accidentally processed an invoice twice and requested that ABC send him a refund. The A/P clerk was concerned because it was the fifth time this had happened in the last two months. The audit manager thanked him for the information and immediately launched a fraud investigation.

The invoice-processing clerk for the accounts payable section at XYZ was well aware that the control over the payment of duplicate invoices relied on two fields. The accounts payable application would reject and flag any transaction in which the combination of vendor number and invoice number was not unique. However, the clerk also knew that it was quite easy to get around this control because there was no control over the vendor table, which assigned the vendor numbers. The clerk simply added additional vendor numbers for several vendors using slightly different names, such as “ABC Limited,” “ABC Ltd,” and “ABC Ltd.” As the following table demonstrates, a vendor could be assigned more than one vendor number, one for each spelling of the vendor name.

Vendor Name    Vendor Number    Address
ABC Limited    N3450D12         101 Grey Rock
ABC Ltd.       N5478X23         101 Grey Rock
ABC Ltd        N5471C10         101 Grey Rock

This control flaw allowed the clerk to submit the same invoices for payment without the application controls detecting the duplicates. After the duplicate check was sent to ABC Limited, the clerk would call the vendor and say that an error had been made, and that the invoice was paid twice. He would request a refund check, and cash it for himself.

The auditors routinely checked for duplicate payments. However, they were testing the controls, checking to see if the same vendor number/invoice number combination had been paid. The auditors never found any duplicate payments, even though the clerk was paying thousands of dollars in duplicate payments every month. This month, they ran the Duplicates test using the criteria of same invoice number, same payment amount, and different vendor number. In doing so, they identified the duplicate payments and uncovered the clerk’s fraudulent scheme.
Dup_Dialog Script
The Dup_Dialog script is called to collect user input. The Dup_Dialog script continues to run repeatedly as long as the variable error_test remains true. Once the user input is tested and found acceptable, that is, at least one key field name provided by the user is not blank, and no field is selected more than once, the variable error_test is set to false. Setting error_test to false ends this subscript and returns processing to the main script, where the user input can be processed.
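The prompt-and-validate loop described here can be sketched in Python. This is an interpretation of the behavior, not a translation of the ACL script: the function names validate_keys and collect_keys are hypothetical, the error messages are the ones shown in the dialog, and the rule that Key 1 is mandatory is taken from the dialog text.

```python
def validate_keys(key1, key2="", key3="", key4="", key5=""):
    """Return an error message, or None when the selection is acceptable."""
    if not key1:
        return "You must always select Key 1"
    chosen = [key for key in (key1, key2, key3, key4, key5) if key]
    if len(set(chosen)) != len(chosen):
        return "The same field cannot be selected more than once"
    return None


def collect_keys(prompt):
    """Call prompt() repeatedly until the selection passes validation."""
    error_test = True
    while error_test:
        keys = prompt()
        error_test = validate_keys(*keys) is not None
    return keys


# Simulated user input: two rejected selections, then an acceptable one.
answers = iter([
    ("", "Amount"),           # Key 1 is missing
    ("Invoice", "Invoice"),   # same field selected twice
    ("Invoice", "Amount"),    # acceptable
])
selected = collect_keys(lambda: next(answers))
print(selected)
```

The loop variable plays the role of error_test: it stays true while the input is rejected and becomes false once a valid selection is made.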
If Statements
Apart from the lengthy Dialog command, the only other commands in the Dup_Dialog script are IF commands. There are two kinds of If statements in ACL: the IF parameter (or IF condition) and the IF command. Both are used in the Toolkit.

The IF parameter evaluates a condition that determines which records are processed. ACL reads each record and executes the command if the condition is true for that record. In the following example, the command EXTRACT RECORD is executed for each record where the condition is true:
EXTRACT RECORD TO High_Amt IF Amount >1000

Whereas the IF parameter determines which records are processed, the IF command determines whether the command is executed at all. The IF command evaluates a condition once, and if it is true, the command following it is executed. In the following example, Script1 is executed if the variable run_script is true when the IF command is first encountered:
IF run_script DO Script1

An example of the IF command is highlighted in the code for Dup_Dialog, and another can be found in the Dup_Multiple_Keys1 code on page 46. An example of the IF parameter can also be found in the Strat1 script on page 62.
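The distinction between the two kinds of If statement has a close analogue in most programming languages. The Python sketch below is an analogy only; the sample transactions and the function script1 are hypothetical.

```python
# Hypothetical transactions for illustration only.
transactions = [
    {"id": 1, "Amount": 250.00},
    {"id": 2, "Amount": 1500.00},
    {"id": 3, "Amount": 4200.00},
]

# Analogue of the IF parameter: the condition is tested once per record,
# and only records that satisfy it are extracted.
high_amt = [rec for rec in transactions if rec["Amount"] > 1000]


def script1():
    """Stand-in for a subscript that would be run with DO Script1."""
    return "Script1 executed"


# Analogue of the IF command: the condition is tested a single time and
# decides whether the command runs at all.
run_script = True
result = script1() if run_script else None

print([rec["id"] for rec in high_amt])
print(result)
```

The first test filters records; the second decides whether anything happens at all, regardless of how many records the file contains.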
Dup_Dialog
DIALOG (DIALOG TITLE "Duplicates Parameters" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 636 12 DEFAULT 1 ) (TEXT TITLE "Select key fields (minimum of one to maximum of four) used to determine if a record is a duplicate." AT 48 64 WIDTH 503 HEIGHT 35 ) (TEXT TITLE "Key 1" AT 72 148 ) (TEXT TITLE "Key 2" AT 264 148 ) (TEXT TITLE "Key 3" AT 432 148 ) (TEXT TITLE "Key 4" AT 612 148 ) (ITEM TITLE "CND" TO "key1" AT 12 180 WIDTH 172 ) (ITEM TITLE "CND" TO "key2" AT 192 180 WIDTH 171 ) (ITEM TITLE "CND" TO "key3" AT 372 180 WIDTH 171 ) (ITEM TITLE "CND" TO "key4" AT 552 180 WIDTH 171 ) (ITEM TITLE "CND" TO "key5" AT 444 300 WIDTH 180 HEIGHT 83 ) (TEXT TITLE "Fifth field MUST contain different value (e.g., same invoice number, same amount, but different vendor name)" AT 168 304 WIDTH 257 HEIGHT 67 ) (TEXT TITLE "Optional analysis" AT 48 280 ) (TEXT TITLE "Key 1 must always be selected." AT 48 112 ) (TEXT TITLE "Key 5" AT 516 268 )
error_test2=T
IF LEN(key1)=0 PAUSE 'You must always select Key 1'
IF LEN(key1)>0 AND MATCH(key1,key2,key3,key4,key5) PAUSE 'The same field cannot be selected more than once'
IF LEN(key2)>0 AND MATCH(key2,key3,key4,key5) PAUSE 'The same field cannot be selected more than once'
IF LEN(key3)>0 AND MATCH(key3,key4,key5) PAUSE 'The same field cannot be selected more than once'
IF LEN(key4)>0 AND MATCH(key4,key5) PAUSE 'The same field cannot be selected more than once'
IF LEN(key1)>0 AND NOT MATCH(key1,key2,key3,key4,key5) error_test2=F
IF LEN(key2)>0 AND NOT MATCH(key2,key1,key3,key4,key5) error_test2=F
IF LEN(key3)>0 AND NOT MATCH(key3,key1,key2,key4,key5) error_test2=F
IF LEN(key4)>0 AND NOT MATCH(key4,key1,key2,key3,key5) error_test2=F
IF LEN(key1)>0 AND error_test2=F error_test=F

Dup_Multiple_Keys1
Even a relatively simple script with very few lines of code can demonstrate the power of scripts. The Dup_Multiple_Keys1 script organizes the summarized information needed for each case in which a unique key is used, checks for duplicates, and joins the duplicate results with the sorted file to produce the final output.

Macro Substitution
Macro substitution allows a script command to be written without assigning the exact field name, file name, or character string that will be used when the command is executed. Macro substitution uses the contents of a variable to supply the necessary field name, file name, or string. Variables created with the Accept, Assign, and Dialog commands can be used in macro substitution. The use of macro substitution can be identified by a variable enclosed in percent signs (%). When ACL encounters a macro substitution variable in a command, it substitutes the value of the variable in the command.

Most of the scripts in this toolkit employ macro substitution to let the user specify which file to open. For example, the percent signs (%) in the command OPEN %infile% cause the Open command to ignore the variable name and substitute the contents of the variable. So, if the variable infile contains the value Payroll, the OPEN %infile% command will be transformed into OPEN Payroll. In the highlighted code in Dup_Multiple_Keys, macro substitution allows the assembly of a single character string (dup_key) that can be used in the Sort, Summarize, Duplicates, and Join commands. This is a simple, elegant way to handle varying numbers of keys.
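Macro substitution is, in effect, text replacement performed on a command before it runs. The following Python sketch imitates that behavior under stated assumptions: the function name substitute is hypothetical, variable names are matched without regard to case (an assumption made here because the scripts write both %X_axis% and x_axis), and unknown names are left untouched.

```python
import re


def substitute(command: str, variables: dict) -> str:
    """Replace each %name% in command with the value of the matching variable."""
    lookup = {name.lower(): value for name, value in variables.items()}

    def replace(match):
        name = match.group(1).lower()
        # Leave the text as written if no such variable exists.
        return lookup.get(name, match.group(0))

    return re.sub(r"%(\w+)%", replace, command)


variables = {"infile": "Payroll", "key1": "Invoice_No", "key2": "Amount", "key3": ""}

# A single variable supplies the file name.
print(substitute("OPEN %infile%", variables))

# Assemble one key string from however many keys were selected,
# then reuse it in more than one command.
dup_key = substitute("hex_%key1%", variables)
if len(variables["key2"]) > 0:
    dup_key += " " + substitute("hex_%key2%", variables)
if len(variables["key3"]) > 0:
    dup_key += " " + substitute("hex_%key3%", variables)
variables["dup_key"] = dup_key

print(substitute("SORT ON %dup_key% TO Ztemp1", variables))
print(substitute("DUP ON %dup_key% TO Zdup_Temp", variables))
```

Because the key list is built once and stored in a variable, the same string can be dropped into several commands, which is the design choice the text describes as handling varying numbers of keys.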
KeyChange
The purpose of the KeyChange script is to create character fields for the Join command. This script makes use of macro substitution and demonstrates the use of some powerful ACL features: the LENGTH() and HEX() functions and the Define Field command. The KeyChange subscript is called by the Dup_Multiple_Keys1 script whenever the optional Key 5 field is specified. Since the Dup_Multiple_Keys1 script uses the Join command, and, prior to Version 8.3, the fields on which the join is performed must be character fields, the KeyChange script converts the key fields to their hexadecimal value. For example, when searching for duplicates with the same invoice number, date, and amount, but a different vendor number, the invoice number, date, and amount fields must be converted to character fields for the Join command to work. Rather than forcing the user to modify the input file definition, the KeyChange script creates character fields by converting the field values to their hexadecimal values.
Note: Because of the functioning of the KeyChange script, all selected fields must be physical data—no computed expressions are permitted. If a computed expression must be used, first Extract (fields) to another file. ACL will evaluate computed expressions and write them out as data which can then be used in the Duplicates script.
Define Field ACL allows the user to create new fields or expressions. Typically, this is done in the edit table layout window, which is displayed by clicking the Add a New Expression button. However, a field or expression can be created within a script by issuing the Define Field command. The format of the command is: DEFINE FIELD field-name COMPUTED expression
In the KeyChange script, the command looks like: DEFINE FIELD Hex_%KEY5% COMPUTED HEX(%KEY5%)
LENGTH() and HEX() ACL has many useful functions that allow the user to transform the data or create new fields. The LENGTH() function returns the length of a string or field. The HEX() function changes any field type (numeric, date, or character) into a character string containing the hexadecimal equivalent of the field value.
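The reason for converting every key to a character string can be shown with a small Python analogy. This is a sketch under stated assumptions: hex_key is a hypothetical helper that hex-encodes the text form of a value, which is not the same as ACL's HEX() operating on the stored bytes of a field. The point is only that numeric, date, and character keys all end up as comparable strings.

```python
from datetime import date

def hex_key(value) -> str:
    """Return an uppercase hex string built from the text form of any value."""
    return str(value).encode("utf-8").hex().upper()

# Hypothetical record with a numeric, a date, and a character field.
record = {"invoice_no": 123, "inv_date": date(2001, 8, 29), "vendor": "ABC"}

# Mirror the Hex_<field> naming used by the KeyChange script.
char_keys = {f"Hex_{name}": hex_key(value) for name, value in record.items()}
for name, value in char_keys.items():
    print(name, value)
```

Equal source values always produce equal hex strings, so a join on the converted fields matches the same records as a join on the originals would.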
Deleting Temporary Variables, Fields, and Files The scripts have many temporary variables to control the flow to subscripts and address several user options (keys one to four with or without the fifth key being different). The final step is deleting the temporary variables, fields, and files:

Dup_Multiple_Keys
count1
filt1
keyn
dupnum
filt_sel
Zdup_temp.fil
dupout
key1-key5
Ztemp1.fil
dupout1
infile
Ztemp2.fil
hex_%key1%-%key5%
dup_key

Dup_Multiple_Keys1
DO Keychange
dup_key=' '
dup_key=dup_key+'hex_%key1% '
IF LEN(key2)>0 dup_key=dup_key+'hex_%key2% '
IF LEN(key3)>0 dup_key=dup_key+'hex_%key3% '
IF LEN(key4)>0 dup_key=dup_key+'hex_%key4%'
SORT ON %dup_key% TO Ztemp1.fil OPEN
SUMM on %dup_key% Hex_%key5% to ZTEMP2.fil OPEN
DUP on %dup_key% to ZDUP_TEMP.fil OPEN
OPEN Ztemp1
SET FILTER %filt1%
OPEN Zdup_Temp SECONDARY
JOIN PKEY %dup_key% FIELDS ALL SKEY %dup_key% TO %dupout%
OPEN %infile%
DELETE Hex_%key1% OK
DELETE Hex_%key2% OK
DELETE Hex_%key3% OK
DELETE Hex_%key4% OK
DELETE Hex_%key5% OK
OPEN %dupout%
DELETE Hex_%key1% OK
DELETE Hex_%key2% OK
DELETE Hex_%key3% OK
DELETE Hex_%key4% OK
DELETE Hex_%key5% OK
DELETE dup_key OK

KeyChange
DELETE Hex_%key1% OK
DELETE Hex_%key2% OK
DELETE Hex_%key3% OK
DELETE Hex_%key4% OK
DELETE Hex_%key5% OK
IF LEN(key1)>0 DEFINE FIELD Hex_%key1% COMPUTED HEX(%key1%)
IF LEN(key2)>0 DEFINE FIELD Hex_%key2% COMPUTED HEX(%key2%)
IF LEN(key3)>0 DEFINE FIELD Hex_%key3% COMPUTED HEX(%key3%)
IF LEN(key4)>0 DEFINE FIELD Hex_%key4% COMPUTED HEX(%key4%)
DEFINE FIELD Hex_%key5% COMPUTED HEX(%key5%)
Review and Analysis Knowing that a certain number of records are duplicates is only the first step. Next the user must investigate each duplicate to assess its validity. Open the new input file definition created by the Duplicates test to check for reversing entries, payment stoppages, and other rationales for records that were identified as duplicates. Verify the results using appropriate source documents.
Checking for Duplicates The Duplicates test produces a file containing all of the higher-risk duplicates, based on the auditor’s understanding of the system and its data. Sometimes the mere existence of duplicates is reason enough for concern. For example, it is unlikely that the personnel system would include two or more people with the same employee number. Usually, however, the Duplicates analysis is only the
beginning of your work. In some cases, the number of potential duplicates may be so large that you should rerun the Duplicates test, specifying more restrictive criteria—four keys instead of two, for example. The type of additional analyses varies, depending on the auditor or fraud investigator. The analysis should be based on an assessment of the risks, the potential size of the losses, and the cost of chasing down the duplicates. The following example illustrates the logic behind the Duplicates test and shows some of the additional steps that could be part of a payroll investigation.
Payroll Example It is important to understand the data before running the Duplicates test. In this example, the payroll file contains all the monthly pay transactions for a fiscal year. All overtime payments for the month are recorded separately, even though they are included on the same checks as the regular monthly pay.
Sample Payroll Data
EmpNo  Date  Dept  Type  Amount  CheckNo
123    Jan   10    Reg   $5,000  1009
123    Feb   10    Reg   $5,000  1367
123    Mar   10    Reg   $5,000  1490
123    Apr   10    Reg   $5,000  1551
123    Apr   10    OT    $1,240  1551
123    May   10    Reg   $5,421  1603
123    May   10    OT    $536    1603
123    May   10    OT    $488    1603
123    Jun   10    Reg   $5,421  1800
123    Jul   10    Reg   $5,421  1925
123    Aug   10    Reg   $5,421  2005
123    Sep   10    Reg   $5,421  2161
123    Oct   10    Reg   $5,487  2306
123    Oct   10    Reg   $5,487  2307
123    Nov   10    Reg   $5,487  2498
123    Nov   10    OT    $445    2498
123    Nov   01    OT    $445    3402
123    Dec   10    Reg   $5,900  2699
One can see that, according to the sample data, each person should receive only one check per month, even if it is for both regular and overtime pay.
Testing for duplicates using only the EmpNo and Date fields with the optional fifth key left blank would falsely identify the two payments made in April, since one was regular pay and the other was overtime, as well as the three payments made in May consisting of one regular and two overtime payments. It would, however, correctly identify the two payments made in October for regular pay. It would also identify the three payments made with two different checks in November.

Results
EmpNo  Date  CheckNo  Items
123    Apr   1551     2
123    Aug   2005     1
123    Dec   2699     1
123    Feb   1367     1
123    Jan   1009     1
123    Jul   1925     1
123    Jun   1800     1
123    Mar   1490     1
123    May   1603     3
123    Nov   2498     2
123    Nov   3402     1
123    Oct   2306     1
123    Oct   2307     1
123    Sep   2161     1
Testing for duplicates based on EmpNo, Date, and Type would falsely identify the two overtime payments in May, and correctly identify the duplicates in October and November. Specifying the same EmpNo and Date but a different CheckNo (Key 5) in the Duplicates test would identify the two payments in October as duplicates. It would also identify the payments in November, where the employee received one check for regular pay and overtime, and then another check for a second overtime payment with a different Dept code. The Duplicates scripts perform in the same way as the Duplicates command if Key 5 (the field that must be different) is not specified. However, if you take the same payroll example and specify the same EmpNo and Date but different CheckNo, the script does the following:
1. Sorts the original payroll file on EmpNo and Date and opens the sorted file (Ztemp1).
2. Summarizes on EmpNo and Date, creating a temporary file (Ztemp2) that contains the EmpNo, Date and the CheckNo.
3. Checks for duplicates based on EmpNo and Date, and extracts the EmpNo and Date to a temporary file (Zdup_Temp).

Results
EmpNo  Date
123    Nov
123    Nov
123    Oct
123    Oct

4. Opens Zdup_Temp as the secondary file.
5. Joins Ztemp1 with Zdup_Temp (key fields EmpNo and Date), extracting all matched records to the specified output file.

Results
EmpNo  Date  CheckNo
123    Oct   2306
123    Oct   2307
123    Nov   2498
123    Nov   2498
123    Nov   3402

Accounts Payable Example A typical invoice contains a variety of background information in addition to the actual sale items. In many accounts payable systems, invoices are themselves grouped under a purchase order, or some other type of internal tracking document, which adds still more information. Because of this, most individual transactions have numerous fields associated with them that are duplicated elsewhere. The electronic record of an invoice from a large supplier may have a hundred lines of detail, each consisting of a product description, quantity, and price. All these detail lines share a common document header containing the purchase order number, date, invoice number, payment terms, supplier address, phone and fax numbers, tax ID numbers or vendor ID numbers, and much more.

Here is a typical example. This header information applies to each detail record in the accompanying table:
Document 144801 Purchase Order ABC Corporation
Invoice # 123 Ms. J. Doe 2001/08/29 2% 10 days
401 North Parkway 406-555-1212

Description              Quantity  Price   Item Total
Laser printer paper      1,000     $5.00   $5,000.00
Toner cartridges         100       $50.00  $5,000.00
Pens, one dozen per box  20        $14.00  $280.00
Mouse pads               5         $8.00   $40.00
Total                                      $10,320.00

All of the items purchased are from the same vendor, ABC Corporation. They were purchased at the same time and share the same invoice number. In this case, two items have the same total dollar value ($5,000), so the Duplicates command would identify them as duplicates even if the Item Total field is included as a fourth key. However, the Duplicates test gives the user the option of identifying an extra field with a different value. The challenge in running a Duplicates test against such data is to choose key fields that allow the user to find suspicious duplicates. In this example, the user might specify that the records have the same invoice number, vendor, date, and amount, but a different document number. Requiring that the document number be different ensures that the two $5,000 transactions are not identified as duplicates and isolates only those duplicate records with a higher risk of being fraudulent.
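Both the payroll and the accounts payable examples rely on the same logic: records that agree on the chosen keys but differ in one extra field. The Python sketch below is one possible way to express that logic, using the sample payroll rows from the text. It is not the ACL script; the function name same_keys_different_field is hypothetical, and it replaces the sort, summarize, duplicates, and join steps with a single grouping pass.

```python
from collections import defaultdict

FIELDS = ("EmpNo", "Date", "Dept", "Type", "Amount", "CheckNo")

# Sample payroll data from the text.
PAYROLL = [
    ("123", "Jan", "10", "Reg", 5000, "1009"), ("123", "Feb", "10", "Reg", 5000, "1367"),
    ("123", "Mar", "10", "Reg", 5000, "1490"), ("123", "Apr", "10", "Reg", 5000, "1551"),
    ("123", "Apr", "10", "OT", 1240, "1551"), ("123", "May", "10", "Reg", 5421, "1603"),
    ("123", "May", "10", "OT", 536, "1603"), ("123", "May", "10", "OT", 488, "1603"),
    ("123", "Jun", "10", "Reg", 5421, "1800"), ("123", "Jul", "10", "Reg", 5421, "1925"),
    ("123", "Aug", "10", "Reg", 5421, "2005"), ("123", "Sep", "10", "Reg", 5421, "2161"),
    ("123", "Oct", "10", "Reg", 5487, "2306"), ("123", "Oct", "10", "Reg", 5487, "2307"),
    ("123", "Nov", "10", "Reg", 5487, "2498"), ("123", "Nov", "10", "OT", 445, "2498"),
    ("123", "Nov", "01", "OT", 445, "3402"), ("123", "Dec", "10", "Reg", 5900, "2699"),
]

def same_keys_different_field(rows, same, different):
    """Return rows whose `same` fields match another row but whose
    `different` field takes more than one value within that group."""
    same_idx = [FIELDS.index(f) for f in same]
    diff_idx = FIELDS.index(different)
    values_seen = defaultdict(set)
    for row in rows:
        values_seen[tuple(row[i] for i in same_idx)].add(row[diff_idx])
    suspect = {key for key, vals in values_seen.items() if len(vals) > 1}
    return [row for row in rows if tuple(row[i] for i in same_idx) in suspect]

hits = same_keys_different_field(PAYROLL, ("EmpNo", "Date"), "CheckNo")
for row in hits:
    print(row[0], row[1], row[5])
```

April and May are not reported because all of their payments share one check number, while October and November are reported because each has two distinct check numbers.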
5
Gaps
Identifying Transactions Missing from a Sequence The Gaps test makes it possible to review all transactions to ensure there are no missing items. When one or more of the transactions being analyzed is missing, this technique produces quick results by pointing to specific items for investigation. The entire file can be examined to see if all items are accounted for and properly recorded. For example, if health claims are submitted on a standard form that is preprinted with the claim number, the user can test for potentially fraudulent claims by checking to see if the claim numbers correspond with the expected numbering series. Looking for claim numbers that are out of sequence or missing can help focus the search for false claims. The Gaps test examines the records for continuity and completeness within a specified range of values. When searching for fraud, identifying what is not there can often be as important as identifying what is there. The Gaps test can find fraud symptoms, including:
Missing accounts receivable payments
Purchase orders not recorded
Branch offices not reporting revenues
Receipts missing for a given day
Missing cash register tapes
Water or electricity meter readings not recorded
The Gaps test can also be used to identify records that exist, but have not been recorded; for example, missing check numbers or invoices.
The test sorts the data file using a selected key field. Next, it reads the records sequentially as it searches for gaps. When it finds a gap, it displays the number—for example, the invoice number of the missing record, or the range of missing items.
Running Gaps
1. From the Fraud Menu, choose Gaps or missing items. Click [OK]. The File Selection dialog box appears:
2. From the Select input file drop-down list, choose the file to analyze. If a filter is to be applied, select the Apply filter check box. Click [OK]. The Field Selection dialog box appears:
3. Enter a number in the Maximum number of items to list text box (the default is 5), then select a field from the Test for missing records drop-down list. Click [OK]. The Results dialog box appears.
4. After viewing the results, click [OK].

If the number of consecutive missing items is less than the value entered, the Gaps script writes the values of each of the missing items to the log file. If the number of consecutive missing items is greater than this cutoff value, the script lists the range of missing items in the log file. Assuming a cutoff value of 5, the resulting report in the log file looks like this:
*** Range from 28081 to 28089 missing
Missing number 28123
Missing number 28134
Missing number 28135
Missing number 28136
*** Range from 28178 to 28199 missing
Missing number 28247
*** Range from 28273 to 28288 missing
*** Range from 28401 to 28412 missing
*** Range from 28414 to 28419 missing

If the cutoff had been set to 10, more entries would have read similarly to “Missing number 28123,” and fewer would have read “*** Range from 28178 to 28199 missing.”

How the Scripts Work The Gaps script implements the simplest of the eight major tests. This is not to say that testing for gaps is especially simple; rather, the Gaps command makes it simple. The script code consists mainly of the lines necessary to run the Gaps command and create the associated dialog boxes. Notice that the Gaps script includes a statement that determines how the results will be displayed in the Results dialog box. This statement, which makes use of the STRING() function, gives the variable gap_cnt a format that includes commas and has a maximum length of 13.
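The cutoff behavior of the Gaps test (list short gaps item by item, report long gaps as a range) can be sketched in Python as follows. This is an illustration, not the ACL Gaps command: find_gaps and max_list are hypothetical names, and listing a gap whose length exactly equals the cutoff item by item is an assumption, since the text only covers the "less than" and "greater than" cases.

```python
def find_gaps(numbers, max_list=5):
    """Return report lines for values missing from a numeric sequence."""
    report = []
    ordered = sorted(set(numbers))  # the test presorts on the key field
    for prev, cur in zip(ordered, ordered[1:]):
        missing = cur - prev - 1
        if missing == 0:
            continue
        first, last = prev + 1, cur - 1
        if missing > max_list:
            report.append(f"*** Range from {first} to {last} missing")
        else:
            # Assumption: a gap of exactly max_list items is listed individually.
            report.extend(f"Missing number {n}" for n in range(first, last + 1))
    return report

# Hypothetical prenumbered slips: 102, 105-107, and 109-119 are absent.
slips = [100, 101, 103, 104, 108, 120]
for line in find_gaps(slips, max_list=5):
    print(line)
```

Raising max_list turns more ranges into individual "Missing number" lines, which matches the effect the text describes for a cutoff of 10.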
Deleting Temporary Variables, Fields, and Files The following variables, fields, and files are automatically deleted during the script run:

Gaps Variables
error_test
gap_cnt
gap_field
maxgap
filt1
gapdup1
infile
Review and Analysis When a gap is found in the sequence, the follow-up must be carried out manually. The missing records, by definition, are not in the data file. The first step is to verify that the missing items should not be missing. The next step is to find out why the items are missing and determine if there is a valid explanation for each missing item. The case study that follows illustrates the power of this simple fraud detection technique.
Gaps
SET ECHO NONE
SET SAFETY OFF
error_test=T
DIALOG (DIALOG TITLE "Gaps - File Selection" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (TEXT TITLE "Select input file" AT 36 172 ) (ITEM TITLE "f" TO "infile" AT 144 168 WIDTH 247 HEIGHT 137 ) (TEXT TITLE "Gaps checks the selected input file to ensure all values for the specified field are present, and identifies gaps or missing values for this field." AT 36 100 WIDTH 364 HEIGHT 50 ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 144 300 )
OPEN %infile%
IF filt_sel DO Filter_Selection
DO Gaps_Dialog WHILE error_test
SET ECHO ON
GAPS ON %gap_field% MISSING maxgap ERRORLIMIT 1000 TO SCREEN PRESORT
SET ECHO NONE
gap_cnt=STR(gapdup1,13,'9,999,999,999')
DIALOG (DIALOG TITLE "Gaps Test - Results" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 432 12 DEFAULT 1 ) (TEXT TITLE "There are" AT 36 100 ) (EDIT TO "gap_cnt" AT 108 96 ) (TEXT TITLE "gaps (ranges or missing items)" AT 240 100 ) (TEXT TITLE "See the command log for more details" AT 108 184 )
DELETE gapdup1 OK
DELETE gap_cnt OK
DELETE gap_field OK
DELETE maxgap OK
DELETE filt1 OK
DELETE error_test OK
SET ECHO ON

Gaps_Dialog
DIALOG (DIALOG TITLE "Gaps - Field Selection" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (TEXT TITLE "Select field to test for missing records" AT 36 172 WIDTH 160 HEIGHT 55 ) (ITEM TITLE "CDN" TO "gap_field" AT 204 180 WIDTH 256 ) (TEXT TITLE "Enter maximum number of items to list - a range will be shown if the number of missing items is greater than your entry (no commas)" AT 36 76 WIDTH 264 HEIGHT 64 ) (EDIT TO "maxgap" AT 324 84 WIDTH 88 DEFAULT "5" )
IF LEN(maxgap)=0 OR LEN(gap_field)=0 PAUSE 'Enter required values'
IF LEN(maxgap)>0 AND LEN(gap_field)>0 error_test=F
Case Study: Free Calls Provided that they paid for the charges, employees were permitted to use their office phones to make long-distance calls. It was a good deal for the employees since the company had a special discount rate with the telephone carrier. Those making calls were asked to complete long-distance call slips with details about the date and cost of their calls. Each quarter, the employees returned the long-distance call slips and reimbursed the company for their personal calls. It was a good deal for the company, too. The employees’ calls increased the total call volume, allowing the company to negotiate a better long-distance discount rate. One of the employees, William, lost one of his call slips for a personal call, and as a result, reimbursed the company only for the slips he submitted. Later, he found the missing slip but did not reveal his error. When nothing happened, he deliberately failed to include several long-distance call slips from time to time. At the end of the year, the auditors tested the controls over the reimbursement of long-distance charges. They found that, each week, Caroline in accounts payable properly reviewed the long-distance bill and identified the noncompany numbers that had been called. They also found that she was very efficient at figuring out which were company long-distance calls and which were personal calls. The auditors were equally impressed with the monthly recording and billing procedures. However, the auditors noted that there was no control to ensure that the employees were actually paying for the calls they made. The auditors reviewed the reimbursement records using the Gaps test. Since the long-distance call slips were prenumbered, the test easily identified 26 missing slips. The results were presented to Caroline, who matched the numbers from the missing slips to the carbon copies of the slips. William was identified as the culprit for 25 of the missing slips. 
When approached by the auditors, he admitted that he did not submit all the slips. He was then ordered to reimburse the company and, in accordance with strict company policy on fraud, fired.
6
Data Profile
Establishing Normal Values and Investigating Exceptions Data profiling examines key numeric fields to identify anomalies or values in your data that fall outside the norm. In doing so, it focuses attention on transactions with a higher risk of being fraudulent. For example, a review of corporate credit card transactions might focus on large-dollar transactions, payments to unusual vendors, purchases for exact dollar amounts, or the use of company credit cards for purchases on weekends.

Many data analysis techniques search for specific types of known exceptions in the data. However, advanced analysis techniques, such as data profiling, search for unsuspected symptoms of fraud. If, for example, the user is reviewing an inventory system and does not realize that it is possible to enter a negative amount in the Quantity Received field, he or she might not consider looking for receipt values less than zero. However, the Data Profile script automatically identifies the existence of negative values. What’s more, the Data Profile script highlights instances where a specific number for the quantity received shows up especially frequently. So, for example, a review of credit card transactions can profile the Bill Amount field to identify:
The minimum, maximum, average, highest, and lowest 10 transactions.
Ranges of credit card transaction amounts, such as $0 to $99.99, $100.00 to $299.99, and $300.00 to $499.99.
The most-often-recorded amount for credit card purchases.
The transactions for exact dollar amounts, such as $2,000.00 or $7,000.00.
The most-often-used and least-often-used credit cards or merchants.
Best of all, running Data Profile tests regularly makes it possible to reduce losses by detecting fraud in its early stages.

Running Data Profile
1. From the Fraud Menu, choose Profile numeric field. Click [OK]. The File Selection dialog box appears:
2. From the Select input file drop-down list, choose the file you want to analyze. Click [OK]. The Numeric Field Selection dialog box appears:
3. From the Numeric field to be profiled drop-down list, choose the field you want to analyze. If you want to apply a filter, select the Apply filter check box. Click [OK]. The Test Selection dialog box appears:
4. Select one or more of the Data Profile tests. Click [OK]. The Results dialog box appears:
5. After viewing the results, click [OK]. If you selected multiple tests, continue to click [OK] until all results have been viewed.
You can also review data profile results in the Command Log window (best displayed using the Log File view). For the Exact Multiples, Frequently Used Values, Items with Most Exact Multiples, and Least/Most Used Items tests, you can also review the results by opening the output files with the associated index applied. The index presents the output file records in the order specified by the test.

How the Scripts Work The first three Data Profile tests are based on the Statistics and Stratify commands, which makes writing the related subscripts easy, even if the test itself is quite complicated. On the other hand, the Round Amounts test uses the simple Count command in a complicated way that requires long expressions. When creating scripts, knowing how to make effective use of ACL commands is very important. However, deciding which ranges of a numeric field to select for analysis is even more critical. Here are two examples:
The Stratify tests have been set up to deliver two specific sets of ranges. The first divides the data file from the highest value to the lowest into 10 equal intervals, regardless of how many records fall into each interval. The second totals the selected numeric field in the file, and sets the range limits to show the top 5 percent, middle 90 percent, and bottom 5 percent on either side of 0.
The Round Amounts test counts the number of records with amounts that are a multiple of 5, 10, 25, or 100. In this test, decimal places are very important because $1.25 is not a multiple of 5, whereas $35.00 is.
These kinds of judgments are based on experience. While the specific ranges and values shown have proven useful in fraud detection, they are not guaranteed to be the best possible choice in every situation. It is up to the user to consider whether other ranges or values might be better targets of the investigation.
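The Round Amounts idea can be sketched in Python to show why decimal places matter and how an observed share of round amounts is compared with an expected share. This is an illustration under stated assumptions, not the ACL script: round_amount_profile is a hypothetical name, the expected shares (20, 10, 4, and 1 percent) are the ones that apply to whole-number amounts, and the z-statistic is the usual one-proportion form.

```python
from decimal import Decimal
from math import sqrt

# Expected share of whole-number amounts divisible by each value.
EXPECTED = {5: 0.20, 10: 0.10, 25: 0.04, 100: 0.01}

def round_amount_profile(amounts):
    """For each multiple, return (hits, observed share, z-statistic),
    counting only nonzero amounts."""
    nonzero = [d for d in (Decimal(str(a)) for a in amounts) if d != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("no nonzero amounts to profile")
    results = {}
    for multiple, p0 in EXPECTED.items():
        # Decimal arithmetic keeps $1.25 from being treated as a multiple of 5.
        hits = sum(1 for d in nonzero if d % multiple == 0)
        observed = hits / n
        z = abs(observed - p0) / sqrt(p0 * (1 - p0) / n)
        results[multiple] = (hits, observed, z)
    return results

sample = ["1.25", "35.00", "100.00", "50.00", "0.00", "12.10"]
for multiple, (hits, observed, z) in round_amount_profile(sample).items():
    print(multiple, hits, round(observed * 100, 1), round(z, 2))
```

A large z-statistic only signals that round amounts occur more (or less) often than the assumed baseline; whether that baseline suits a particular file is the kind of judgment the text refers to.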
Data_Profile
SET ECHO NONE
SET SAFETY OFF
error_test=T
filt1=''
DIALOG (DIALOG TITLE "Data Profile - File Selection" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 24 DEFAULT 1 ) (TEXT TITLE "Data Profile - tests a specific numeric field within your selected input file to search for anomalies in the data, such as exact multiples, infrequently used values, or frequently used values." AT 36 112 WIDTH 406 HEIGHT 78 ) (TEXT TITLE "Select input file" AT 36 232 ) (ITEM TITLE "f" TO "infile" AT 156 228 WIDTH 254 )
OPEN %infile%
DO Data_Profile_Dialog1 WHILE error_test=T
IF filt_sel DO Filter_Selection
error_test=T
DO Data_Profile_Dialog2 WHILE error_test=T
IF stat SET ECHO ON
STATISTICS ON %numfield% NUMBER 10 STD
SET ECHO NONE
IF stat DO Stat
IF strat1 DO Strat1
IF strat2 DO Strat2
IF round1 DO Round
IF exact1 DO Exactmult
IF exact2 DO Exactsum
IF freq1 DO Freqnum
IF freq2 DO Freqchar
SET FILTER
SET SAFETY ON
DELETE round1 OK
DELETE exact1 OK
DELETE exact2 OK
DELETE freq1 OK
DELETE freq2 OK
DELETE stat OK
DELETE strat1 OK
DELETE strat2 OK
DELETE infile OK
DELETE filt1 OK
DELETE charfield OK
DELETE numfield OK
DELETE error_test OK
DELETE filt_sel OK
SET ECHO ON
Flow of Data in Data Profile
Data_Profile_Dialog1
DIALOG (DIALOG TITLE "Numeric Field Selection" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (ITEM TITLE "N" TO "numfield" AT 252 120 WIDTH 252 ) (TEXT TITLE "Select the numeric field to be profiled" AT 72 112 WIDTH 165 HEIGHT 40 ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 72 228 )
IF LEN(numfield)=0 PAUSE 'You must specify a numeric field'
IF LEN(numfield)>0 error_test=F
Note: The Stat, Strat1, Strat2, and Round tests display directly to a dialog box or to the screen without generating new data files.
Data Profile Test Parameters One or more of the following Data Profile tests can be chosen from the Test Selection dialog box:
Statistics—Calls script Stat. No additional parameters required. However, this script is always run when Data Profile is selected
to provide some information needed in other tests, such as maximum and minimum values.
Stratify (10 intervals)—Calls script Strat1. No additional parameters required. The range from the maximum to the minimum of the selected numeric field, calculated when Statistics is run, is used to create 10 intervals.
Stratify (ranges small to medium to large)—Calls script Strat2. No additional parameters required. Small, medium, and large intervals are calculated based on the maximum and minimum values of the numeric field.
Round amounts (5, 10, 25, or 100)—Calls script Round. No additional parameters required.
Exact multiples of—Calls script Exactmult. Requires numeric value (default 1,000) against which the numeric field will be tested for exact amounts. Creates the output file Exact Mult.
Frequently used values—Calls script Freqnum. No additional parameters required. Creates the output file Freq Used Num and the index Freqnum.
Items with most exact multiples—Calls script Exactsum. Requires character field (item) on which the numeric field will be summarized. Creates the file Sum Exact Mult and the indexes Exact Pct and Exact Tot.
Least/most used items—Calls script Freqchar. Requires character field (item) on which the numeric field is summarized. Creates the output file Freq Used Character and the index Freqchar.
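Three of the tests listed above (exact multiples, frequently used values, and least/most used items) reduce to simple filtering and counting. The Python sketch below illustrates them on hypothetical credit card charges. It is not the Exactmult, Freqnum, or Freqchar script; the function names and merchant names are invented for illustration, and excluding zero amounts from the exact-multiples test is an assumption.

```python
from collections import Counter
from decimal import Decimal

def exact_multiples(amounts, multiple=1000):
    """Amounts that are exact multiples of `multiple` (zero excluded by assumption)."""
    return [a for a in amounts if a != 0 and a % multiple == 0]

def frequently_used(amounts, top=3):
    """The `top` most frequently recorded amounts with their counts."""
    return Counter(amounts).most_common(top)

def least_and_most_used(items):
    """Return ((least used item, count), (most used item, count))."""
    ranked = Counter(items).most_common()
    return ranked[-1], ranked[0]

# Hypothetical charges: (merchant, bill amount).
charges = [
    ("Merchant A", Decimal("2000.00")), ("Merchant A", Decimal("149.95")),
    ("Merchant B", Decimal("12.50")), ("Merchant A", Decimal("7000.00")),
    ("Merchant B", Decimal("12.50")), ("Merchant C", Decimal("12.50")),
]
amounts = [amount for _, amount in charges]
merchants = [merchant for merchant, _ in charges]

print(exact_multiples(amounts))
print(frequently_used(amounts, top=1))
print(least_and_most_used(merchants))
```

The default multiple of 1,000 mirrors the default shown in the Test Selection dialog box; any other value can be passed in its place.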
Data_Profile_Dialog2
DIALOG (DIALOG TITLE "Data Profile - Test Selection" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 612 12 DEFAULT 1 ) (TEXT TITLE "Select one or more tests to run on the selected numeric field:" AT 60 16 ) (CHECKBOX TITLE "Statistics - min, max, average, etc." TO "stat" AT 180 36 ) (CHECKBOX TITLE "Stratify - 10 even intervals (max to min)" TO "strat1" AT 180 72 ) (CHECKBOX TITLE "Stratify - ranges small to medium to large" TO "strat2" AT 180 96 ) (CHECKBOX TITLE "Frequently used values of numeric field" TO "freq1" AT 180 192) (CHECKBOX TITLE "Exact multiples of" TO "exact1" AT 180 156 ) (CHECKBOX TITLE "Least/most used items" TO "freq2" AT 180 276 ) (TEXT TITLE "Select a character field" AT 264 304 WIDTH 152 HEIGHT 15 ) (ITEM TITLE "C" TO "charfield" AT 432 300 WIDTH 258 ) (EDIT TO "exactmult" AT 324 156 WIDTH 73 DEFAULT "1000" ) (CHECKBOX TITLE "Items with most exact multiples" TO "exact2" AT 180 252 ) (CHECKBOX TITLE "Round amounts (multiples of 5, 10, 25 or 100 )" TO "round1" AT 180 132 ) (TEXT TITLE "(change default if desired)" AT 408 160 ) (TEXT TITLE "For the next two tests, select a character field from the drop-down list:" AT 199 229 )
IF (exact2 OR freq2) AND LEN(charfield)=0 PAUSE 'You must specify a character field'
IF (NOT exact2 AND NOT freq2) OR LEN(charfield)>0 error_test=F

Stat
hinum=STR(max1,20,'9,999,999,999,999.99')
lonum=STR(min1,19,'(9,999,999,999.99)')
avgnum=STR(average1,18,'9,999,999.99')
stdnum=STR(stddev1,18,'9,999,999.99')
high10=STR(high1,20,'9,999,999,999.99')
low10=STR(low1,19,'(9,999,999,999.99)')
cntnum=STR(count1,18,'9,999,999')
totnum=STR(total1,20,'9,999,999,999,999.99')
DIALOG (DIALOG TITLE "Data Profile - Statistics Results" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 624 12 DEFAULT 1 ) (TEXT TITLE "Analysis of" AT 36 28 ) (EDIT TO "lonum" AT 156 216 WIDTH 134 ) (TEXT TITLE "Highest value" AT 48 172 ) (TEXT TITLE "Lowest value" AT 48 220 WIDTH 88 ) (EDIT TO "hinum" AT 156 168 WIDTH 135 ) (TEXT TITLE "10th highest value" AT 324 172 ) (EDIT TO "high10" AT 456 168 WIDTH 132 ) (EDIT TO "low10" AT 456 216 WIDTH 133 ) (TEXT TITLE "10th lowest value" AT 324 220 ) (TEXT TITLE "Standard deviation" AT 324 268 ) (EDIT TO "stdnum" AT 456 264 WIDTH 132 ) (TEXT TITLE "Average value" AT 48 268 ) (EDIT TO "avgnum" AT 156 264 WIDTH 135 ) (EDIT TO "numfield" AT 120 24 WIDTH 208 ) (EDIT TO "cntnum" AT 120 84 WIDTH 137 ) (TEXT TITLE "Records" AT 48 88 ) (EDIT TO "totnum" AT 372 84 WIDTH 219 ) (TEXT TITLE "Total" AT 312 88 ) (TEXT TITLE "field" AT 336 28 WIDTH 80 )
DELETE low10 OK
DELETE high10 OK
DELETE avgnum OK
Deleting Temporary Variables, Fields, and Files These variables are automatically deleted during the script run:

Data Profile: charfield, exact1–2, filt1, freq1–2, infile, numfield, Outfile, round1, stat, strat1–2, useroutfile
Stat*: avgnum, cntnum, high10, hinum, low10, lonum, stdnum, totnum
Strat2: lneg, lpos, mneg, mpos, sneg, spos
Round: count2–6; count5a, 10a, 25a, 100a; countall; exp5, 10, 25, 100; len; z_stat5, 5a, 10, 10a, 25, 25a, 100, 100a
Exactmult, Freqnum, Exactsum, and Freqchar: multcnt, multtotal, cnt_mult, exact_exp, valnum, highnum, highchar, highamtnum, lownun, highamtchar, lowchar, crlf

* Plus all variables created by the Statistics command.
Note: There are no variables deleted for the subscript Strat1.
Review and Analysis Initially, the results for the Data Profile tests are summarized in the Results dialog boxes for easy viewing. The detailed results are written to both the command log and the new data file where applicable. The first step is a careful review of the command log and all files created by the Data Profile tests. The indexes that have been created for the results files should be applied when the new data files are opened. The next step is to open the original data file, the one containing the detailed transactions that were analyzed by the Data Profile tests. Filters can then be used to isolate records of interest, allowing the user to drill down into the data. The Statistics and Stratify tests produce an overview of the data, searching for anomalies that become apparent, even among thousands of records.
Statistics The first test (Stat) delivers a wide range of essential statistical information and can quickly detect anomalies in numeric fields, helping to set a basis for additional audit tests. The Statistics command returns details for any numeric field, such as:
Average value
Absolute value
Highest and lowest values
Number of negative, zero, and positive values
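The kind of overview the Statistics test provides can be approximated with a few lines of Python. This is a sketch, not the ACL Statistics command: the function name profile and the sample quantities are hypothetical, and "absolute" is assumed to mean the total of the absolute values.

```python
from statistics import mean

def profile(values):
    """Summarize a numeric field: counts, totals, extremes, and sign breakdown."""
    if not values:
        raise ValueError("no values to profile")
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "total": sum(ordered),
        "absolute": sum(abs(v) for v in ordered),  # assumed meaning of "Absolute value"
        "average": mean(ordered),
        "lowest": ordered[0],
        "highest": ordered[-1],
        "negative": sum(1 for v in ordered if v < 0),
        "zero": sum(1 for v in ordered if v == 0),
        "positive": sum(1 for v in ordered if v > 0),
    }

# Hypothetical Quantity Received values, including one unexpected negative.
quantities = [120, -40, 0, 300, 20]
for name, value in profile(quantities).items():
    print(name, value)
```

A nonzero "negative" count in a field that should never be negative is exactly the kind of unsuspected anomaly the profile is meant to surface.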
DELETE hinum OK
DELETE lonum OK
DELETE stdnum OK
DELETE cntnum OK
DELETE totnum OK
Strat1
SET ECHO ON
STRATIFY ON %numfield% ACC %numfield%
SET ECHO NONE

Strat2
lneg=STR(min1,19) IF min1<-10
lneg=STR(-10,19) IF min1>=-10
mneg=STR(min1*.95,19) IF min1<-100
sneg=STR(min1*.05,19) IF min1<-100
spos=STR(max1*.05,19)
mpos=STR(max1*.95,19)
lpos=STR(max1,19)
SET ECHO ON
IF min1<-100 STRATIFY ON %numfield% ACC %numfield% FREE %lneg%,%mneg%,%sneg%,0,%spos%,%mpos%,%lpos% TO SCREEN
IF min1>=-100 STRATIFY ON %numfield% ACC %numfield% FREE %lneg%,0,%spos%,%mpos%,%lpos% TO SCREEN
SET ECHO NONE
DELETE lneg OK
DELETE mneg OK
DELETE sneg OK
DELETE lpos OK
DELETE mpos OK
DELETE spos OK
Round OPEN %infile% SET FILTER %filt1% len=LEN(SPLIT(STR(%numfield%,18),'.',2)) GROUP IF %numfield%<>0 COUNT IF MOD(%numfield%,5)=0 COUNT IF MOD(%numfield%,10)=0 COUNT IF MOD(%numfield%,25)=0 COUNT IF MOD(%numfield%,100)=0 COUNT END count5a=STR(1.000000*count2/count6*100,12) count10a=STR(1.000000*count3/count6*100,12) count25a=STR(1.000000*count4/count6*100,12) count100a=STR(1.000000*count5/count6*100,12) countall=STR(count6,12) exp5=STR(20.0000/10ˆlen,8) exp10=STR(10.0000/10ˆlen,8) exp25=STR(4.0000/10ˆlen,8) exp100=STR(1.0000/10ˆlen,8) z_stat5 = DEC((ABS(1.00000000000*count2/ count6-0.2000))/(0.20*0.800000000000/ (1.00000000000*COUNT6))ˆ0.5000,5) z_stat10 = DEC((ABS(1.00000000000*count3/ count6-0.1000))/(0.10*0.900000000000/ (1.00000000000*COUNT6))ˆ0.5000,5) z_stat25 = DEC((ABS(1.00000000000*count4/ count6-0.0400))/(0.04*0.960000000000/ (1.00000000000*COUNT6))ˆ0.5000,5)
Stratify Stratify can be used to examine the possible ranges of key numeric fields. The Stratify command counts the number of records that fall into a specified stratum, or specific intervals of the values in a numeric field. It also lets the user total the value of one or more numeric fields for each of the strata. For instance, stratifying on the Amount field in a contracts file gives a summarized view of the different values of contracts raised. Auditors use this technique to focus their examination on transactions of high materiality, but it can also be used to identify possible indications of fraud, such as transactions exceeding a person’s financial authority limit or a high number of questionable returns. The Data Profile scripts can perform two types of stratification. The first (Strat1) uses 10 intervals between the maximum and minimum values of the selected numeric field. The second (Strat2) uses intervals based on the value of the numeric field above and below zero. The intervals for “amount greater than 0” are the top 5 percent, the middle 90 percent, and the 5 percent just above 0. For “amount less than 0,” the intervals are the 5 percent closest to 0, the middle 90 percent, and the top 5 percent, the largest negative amounts. These two types of stratification offer different advantages and work well together. Running both tests and then comparing the results is an effective way to pinpoint suspect transactions. Depending on the distribution of items, there may be very few items in the top 10 percent of the range from highest to lowest, or there may be many. Also, depending on the distribution, the top 5 percent of item values may be much higher than the middle 90 percent. By stratifying as a function of the value range, and as a percentage of total items, a clearer picture of the data emerges.
In this example, just over 11,000 items, worth more than $23 million, have been stratified. The results reveal 44 contracts with values over $30,000.

Contract Values Analysis
<<< STRATIFY over 0.00 -> 100,000.00 >>>
>>> Minimum encountered was 0.59
>>> Maximum encountered was 95,440.00

Amount                      Count     <--%     %-->           Total
0.00 -> 9,999.99           11,252   97.98%   75.55%   17,475,360.64
10,000.00 -> 19,999.99        124    1.08%    7.65%    1,769,660.28
20,000.00 -> 29,999.99         64    0.56%    6.37%    1,473,722.52
30,000.00 -> 39,999.99         20    0.17%    3.12%      720,991.80
40,000.00 -> 49,999.99          8    0.07%    1.58%      365,092.00
50,000.00 -> 59,999.99          0    0.00%    0.00%            0.00
60,000.00 -> 69,999.99          4    0.03%    1.16%      268,960.00
70,000.00 -> 79,999.99          0    0.00%    0.00%            0.00
80,000.00 -> 89,999.99          8    0.07%    2.93%      676,692.56
90,000.00 -> 100,000.00         4    0.03%    1.65%      381,760.00
                           11,484  100.00%  100.00%   23,132,239.80

z_stat100 = DEC((ABS(1.00000000000*count5/count6-0.0100))/(0.01*0.990000000000/(1.00000000000*COUNT6))^0.5000,5)
z_stat5a=STR(z_stat5,10)
z_stat10a=STR(z_stat10,10)
z_stat25a=STR(z_stat25,10)
z_stat100a=STR(z_stat100,10)
DIALOG (DIALOG TITLE "Data Profile - Round Amounts Results" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 600 12 DEFAULT 1 ) (TEXT TITLE "Multiples of 5" AT 72 136 ) (TEXT TITLE "Multiples of 10" AT 72 184 ) (TEXT TITLE "Multiples of 25" AT 72 232 ) (TEXT TITLE "Multiples of 100" AT 72 280 ) (EDIT TO "count5a" AT 204 132 ) (EDIT TO "count10a" AT 204 180 ) (EDIT TO "count25a" AT 204 228 ) (EDIT TO "count100a" AT 204 276 ) (TEXT TITLE "Actual % of records" AT 216 88 WIDTH 99 HEIGHT 40 ) (TEXT TITLE "Expected % of records" AT 384 88 WIDTH 130 HEIGHT 39 ) (EDIT TO "Exp5" AT 384 132 ) (EDIT TO "Exp10" AT 384 180 ) (EDIT TO "Exp25" AT 384 228 ) (EDIT TO "Exp100" AT 384 276 ) (EDIT TO "Z_stat5a" AT 564 132 WIDTH 120 ) (EDIT TO "z_stat10a" AT 564 180 WIDTH 120 ) (EDIT TO "Z_stat25a" AT 564 228 WIDTH 119 ) (EDIT TO "z_stat100a" AT 564 276 WIDTH 120 ) (TEXT TITLE "Z-Statistic" AT 588 88 ) crlf=CHR(10) SET ECHO ON
DISPLAY crlf+crlf+crlf+'********************* **********************************'+crlf+ '******* Round Amount - results for '+infile+crlf+ '******* total records = ' countall+crlf+ '******* '+crlf+ '******* Multiples of 5 - Actual %'+count5a+' Expected %'+exp5+' Z-Stat' z_stat5 crlf+ '******* Multiples of 10 - Actual %'+count10a+' Expected %'+exp10+' Z-Stat' z_stat10 crlf+ '******* Multiples of 25 - Actual %'+count25a+' Expected %'+exp25+' Z-Stat' z_stat25 crlf+ '******* Multiples of 100 - Actual %'+count100a+' Expected %'+exp100+' Z-Stat' z_stat100 crlf+ '*******'+crlf+'********************** *********************************'+crlf+crlf+ crlf
SET ECHO NONE
COMMENT ******** Delete Temporary Variables ***************
DELETE crlf OK
DELETE count2 OK
DELETE count3 OK
DELETE count4 OK
DELETE count5 OK
DELETE count6 OK
DELETE count5a OK
DELETE count10a OK
DELETE count25a OK
DELETE count100a OK
DELETE countall OK
Given these results, the auditor might decide to review the 16 contracts with values over $60,000. These contracts represent only 0.13 percent of all contracts raised, but account for nearly 6 percent of the total value of all contracts.
Round Amounts, Exact Multiples, and Frequent Values The Statistics and Stratify tests deal with the entire file, breaking it into ranges and reporting on statistical measures. The next three options in the Test Selection dialog box focus on unusual repetitions within a numeric field that have occurred because of excessive rounding, excessive use of a particular value, or excessive multiples of a particular value. Looking for round amounts is a powerful fraud detection technique. Numbers that have been rounded up, such as $200 or $5,000, do not occur very often by chance. Sometimes they are the result of a legitimate pricing decision, but they can also be a telltale sign that fraud has been committed. The existence of exact multiples is another indication of possible fraud and should be examined. For example, in credit card data, exact multiples may indicate that cardholders are obtaining cash advances from vendors for fictitious purchases.
Round Amounts: Multiples of 5, 10, 25, or 100 The Round Amounts test (Round) counts the number of records with amounts that are multiples of 5, 10, 25, or 100. The number of records for each is calculated and compared to the expected frequency. Because the test examines rounding in integer amounts, the expected proportion will vary according to the number of decimal places. For example, in a file specified to the nearest cent, the expected proportion of multiples of 5 dollars is 0.2 percent; for files specified to the nearest dollar, the expected proportion is 20 percent.
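To make the comparison of actual and expected proportions concrete, here is a minimal Python sketch of a round-amounts check. It is an illustration only, not the Round script: the function name and parameters are hypothetical, amounts are assumed to be `Decimal` values, and the expected proportion used in the one-proportion z-statistic is adjusted for the number of decimal places, which is an assumption of this sketch.

```python
from decimal import Decimal
from math import sqrt


def round_amount_test(amounts, multiple, decimals):
    """Compare the share of exact multiples with the share expected by chance.

    Hypothetical helper. `decimals` is the number of decimal places the
    amounts are recorded to (2 for cents, 0 for whole dollars).
    """
    nonzero = [a for a in amounts if a != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("no non-zero amounts to test")
    hits = sum(1 for a in nonzero if a % multiple == 0)
    actual = hits / n
    # Chance that a random amount is a whole multiple, e.g. 1/5 for
    # multiples of 5 in whole dollars, 1/500 when recorded to the cent.
    expected = (1 / multiple) / 10 ** decimals
    z = abs(actual - expected) / sqrt(expected * (1 - expected) / n)
    return {
        "n": n,
        "hits": hits,
        "actual_pct": 100 * actual,
        "expected_pct": 100 * expected,
        "z": z,
    }
```

A large z-statistic signals that round amounts occur far more often than chance would suggest, which is the cue to look at the underlying records.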
Exact Multiples of . . .
This test (Exactmult) lets the user enter a value (the default is 1,000) and extracts all records whose amounts are multiples of that value to a file called Exact_Mult. The script displays the number of extracted records and the total value of the numeric field for those records.
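The extraction step can be pictured with a short Python sketch. It is not the Exact_Mult script; the function name, the record layout (a list of dictionaries), and the field name are assumptions made for illustration. Zero amounts are skipped, as the ACL listing also excludes them.

```python
from decimal import Decimal


def exact_multiples(records, field, multiple=1000):
    """Return the records whose `field` is a non-zero exact multiple, plus their total."""
    matches = [
        r for r in records
        if r[field] != 0 and r[field] % multiple == 0
    ]
    total = sum((r[field] for r in matches), Decimal("0"))
    return matches, total
```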
Frequently Used Values
Unless there are reasons for certain numbers to occur more often than others (e.g., price breaks or multiple purchases of the same item), the frequency distribution of the values of the numeric field, such as the amounts of credit card purchases, should be random. However, people creating fraudulent transactions are often lazy or not very creative when it comes to making up the amounts for the fake transactions. They often use the same amounts over and over again. Thus, frequently used values may be a symptom of fraud.

The Freqnum script determines the most-often-used numeric values for the selected field, which makes for easy comparisons. The script produces a file, Freq_Used_Num, providing information on how often each value of the numeric field has occurred. It also creates an index (Freqnum) on the descending count. For example, 78 transactions in the amount of $3,410 may be suspicious, depending on the circumstances. In this case, 78 transactions for the same amount might be especially significant considering that in a data file of over 127,000 transactions, the next-most-often-used value, $1,296.21, was used only 10 times.

Another helpful measure is the average number of times a specific number is used. The script runs the Statistics command on the results file to provide this information as well as the highest and lowest number of times an amount was used.
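The counting behind this test is simple enough to sketch in a few lines of Python. This is an illustration under assumed names (`frequent_values`, `top`), not the Freqnum script.

```python
from collections import Counter


def frequent_values(amounts, top=5):
    """Rank amounts by how often they occur and report the average usage."""
    counts = Counter(amounts)
    if not counts:
        raise ValueError("no amounts supplied")
    ranked = counts.most_common(top)  # (value, count), highest count first
    average_uses = sum(counts.values()) / len(counts)
    return ranked, average_uses
```

Comparing the top count with the average number of uses gives a quick sense of whether one amount dominates the file.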
DELETE exp5 OK
DELETE exp10 OK
DELETE exp25 OK
DELETE exp100 OK
DELETE z_stat5 OK
DELETE z_stat10 OK
DELETE z_stat25 OK
DELETE z_stat100 OK
DELETE z_stat5a OK
DELETE z_stat10a OK
DELETE z_stat25a OK
DELETE z_stat100a OK
Exact_Mult OPEN %infile% SET FILTER %filt1% count1=0 total1=0.00 EXTRACT RECORD TO Exact_Mult IF MOD(%numfield%, %exactmult%)=0 AND %numfield%<>0 OPEN Exact_Mult STATISTICS ON %numfield% TO SCREEN NUMBER 5 multcnt=STR(count1,18,'9,999,999,999') multtotal=STR(total1,18,'9,999,999,999.99') DIALOG (DIALOG TITLE "Data Profile - Exact Multiples Results" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (TEXT TITLE "Exact multiples of" AT 96 88 ) (TEXT TITLE "A file, Exact_Mult, has been created containing:" AT 36 40 ) (TEXT TITLE "Number of records " AT 96 160 ) (EDIT TO "multcnt" AT 240 156 WIDTH 156 ) (TEXT TITLE "Total value of exact multiples" AT 96 196 WIDTH 129 HEIGHT 39 ) (EDIT TO "multtotal" AT 240 192 WIDTH 156 ) (EDIT TO "numfield" AT 240 120 WIDTH 156 ) (TEXT TITLE "For the field" AT 96 124 ) (EDIT TO "exactmult" AT 240 84 WIDTH 156 )
Profiling with Character Fields To realize the full power of data profiling, the user needs to divide the data into meaningful segments and examine the results for each segment. This can be done by selecting relevant character fields and searching for frequently used values or categories.
Items with the Most Exact Multiples
Selecting this test (Exactsum) identifies the item with the largest total value of exact amounts and writes the results to the file Sum_Exact_Mult. It also identifies the item with the highest percentage of transactions with exact amounts relative to its total number of transactions. For example, one credit card may have 5 transactions that are multiples of 1,000, while another may have 12. However, if the first credit card had a total of only 5 transactions and the second had 24, the percentages of transactions with exact amounts would be 100 percent and 50 percent, respectively. Two indexes are created for the file Sum_Exact_Mult.
The Exact_Pct index arranges the file by the percentage of records with exact amounts, largest to smallest.
The Exact_Tot index arranges the file by the total value of the numeric field for records with exact amounts, largest to smallest.
The Exact_Pct index ranks credit cards with 100 percent of their transactions for exact amounts ahead of credit cards with only 92 percent of their transactions for exact amounts. The Exact_Tot index, by contrast, ranks the records by the total value of the exact amounts.
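The two orderings can be illustrated with a small Python sketch that summarizes exact multiples per item. The function name and the (item, amount) record layout are hypothetical; this is not the ExactSum script.

```python
from collections import defaultdict
from decimal import Decimal


def exact_multiples_by_item(records, multiple=1000):
    """Summarize exact multiples for each item (e.g., each credit card).

    `records` is an iterable of (item, amount) pairs. Returns the summary
    plus the items ordered by exact percentage and by exact total.
    """
    summary = defaultdict(
        lambda: {"count": 0, "exact_count": 0, "exact_total": Decimal("0")}
    )
    for item, amount in records:
        row = summary[item]
        row["count"] += 1
        if amount != 0 and amount % multiple == 0:
            row["exact_count"] += 1
            row["exact_total"] += amount
    for row in summary.values():
        row["exact_pct"] = 100 * row["exact_count"] / row["count"]
    # Analogous to the two indexes: percentage first, then total value.
    by_pct = sorted(
        summary,
        key=lambda k: (summary[k]["exact_pct"], summary[k]["exact_total"]),
        reverse=True,
    )
    by_total = sorted(summary, key=lambda k: summary[k]["exact_total"], reverse=True)
    return dict(summary), by_pct, by_total
```

With the credit card example from the text, the card with 5 of 5 exact transactions leads the percentage ordering, while the card with 12 of 24 leads the total-value ordering.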
Index Exact_Pct Order

Vendor               Count  Exact Count  Exact Percent  Exact Amount  Exact Total
IDEAL PARKING INC        2            1         50,000      4,000.00     4,779.96
ABC COMMUNICATION       28            2          7,100     98,000.00   147,089.79
DGC CENTRE              16            1          6,300     14,000.00   239,790.57

Index Exact_Tot Order

Vendor               Count  Exact Count  Exact Percent  Exact Amount  Exact Total
ABC COMMUNICATION       28            2          7,100     98,000.00   147,089.79
DGC CENTRE              16            1          6,300     14,000.00   239,790.57
IDEAL PARKING INC        2            1         50,000      4,000.00     4,779.96
The records can be listed in the order specified by the index simply by double-clicking the desired index in the View window.
Least/Most Used Items Another data profiling technique is the identification of the least or most used items. For example, in accounts payable data, it is easy to find the largest and smallest vendor accounts. Running this test (Freqchar) just once can point to items worth investigating. In addition, running the test each month allows the user to establish patterns in the data, making the anomalies easier to identify.
crlf=CHR(10) SET ECHO ON DISPLAY crlf+crlf+crlf+'***************************** ******************************************** '+crlf+ '******* Exact Multiples - results for '+infile+crlf+ '*******'+crlf+ '******* Number of records with Exact Multiples of '+exactmult+' is'+multcnt+crlf+ '******* Total of records with Exact Multiples of '+exactmult+' is '+multtotal+crlf+ '*******'+crlf+'************************* ******************************************** *******'+crlf+crlf+crlf SET ECHO NONE DELETE multcnt OK DELETE multtotal OK DELETE crlf OK
FreqNum OPEN %infile% SET FILTER %filt1% CLASSIFY ON STR(%numfield%,20) AS 'Value' ACC %numfield% AS 'Total' TO Freq_Used_Num OPEN Freq_used_Num INDEX ON Count D Total D TO "Freqnum" OPEN STATISTICS ON Count TO SCREEN NUMBER 8 highnum=STR(max1,13,'9,999,999,999') valnum=Value WHILE RECNO()=1 DIALOG (DIALOG TITLE "Data Profile - Frequently Used Values Results" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 408 12 DEFAULT 1 ) (TEXT TITLE "Frequently used values for" AT 96 124 ) (TEXT TITLE "A file, Freq_Used_Num, has been created containing:" AT 48 76 ) (EDIT TO "highnum" AT 324 240 WIDTH 166 ) (TEXT TITLE "Most-often-used value" AT 156 184 ) (TEXT TITLE "Number of times used" AT 156 244 ) (EDIT TO "valnum" AT 324 180 WIDTH 166 ) (EDIT TO "numfield" AT 300 120 WIDTH 190 ) DELETE highnum OK DELETE valnum OK
The test creates an index (Freqchar) on the descending count of items. It also performs the Statistics command on the count, showing the highest, lowest, average, and so on. If examining credit card purchases for office supplies, it would be expected that a vendor with a name like General Office Supplies would generate a lot of transactions for office supplies. For General Office Supplies to rank as most used would not be surprising. However, if the number of transactions with this company is sufficiently large, perhaps your company should be requesting volume discounts. Meanwhile, the least-used vendor might be Alternative Life Styles Shop, with just two transactions. Though not indicative of inefficiency, these transactions might not be for office supplies, and should be reviewed. The Freqchar test would highlight both of these examples (most and least used values).
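As a rough illustration of the least/most used idea, the following Python sketch counts how often each value of a character field appears and returns both ends of the ranking. The function name is hypothetical and ties are broken alphabetically, which is a choice of this sketch rather than something the Freqchar script is known to do.

```python
from collections import Counter


def least_and_most_used(values):
    """Return (value, count) for the least and the most frequently used value."""
    counts = Counter(values)
    if not counts:
        raise ValueError("no values supplied")
    ranked = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return ranked[0], ranked[-1]
```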
Case Study: Receipt of Inventory The manager of the warehouse was being investigated for allegations of theft. Normally, this called for the investigators to conduct a standard inventory review, comparing the manual count of the inventory on the shelf to the totals from the inventory system. In this case, however, the sheer quantity of inventory at most of the retail stores made it unfeasible to do more than look at a sample of the transactions. This, coupled with the fact that inventory was often in transit, rather than in the warehouse or on the sales floor, made the job very difficult. The results of their initial investigations did not identify any problems. There were no unexplained differences between the system totals and the physical inventory. Then one of the auditors suggested that the special challenges posed by this investigation made it an ideal application for data profiling techniques. Despite some reluctance from the IT department, the auditor obtained a copy of inventory transactions from the last month. The first step was to run the Statistics command on the Quantity Received field.
The test results were as follows:
Analysis of the Quantity Received Field

             Number    Total   Average
Positive      1,821   85,587        47
Zeros             0
Negative        180     -470        -3
Totals        2,001   85,117        43
Abs Value             86,057
Range                    106
Std. Dev.              12.07

Highest (5): 98, 98, 96, 95, 92
Lowest (5): -8, -6, -3, -3, -3, -3

ExactSum
OPEN %infile% SET FILTER %filt1% DEFINE FIELD Exact_Exp COMPUTED %numfield% IF MOD(%numfield%,%exactmult%)=0 AND %numfield%<>0 0.00 DEFINE FIELD Cnt_Mult COMPUTED 1 IF MOD(%numfield%,%exactmult%)=0 AND %numfield%<>0 0 CLASSIFY ON %charfield% AS 'Value' ACC %numfield% AS 'Total' Cnt_Mult Exact_Exp AS 'Exact_Tot' TO Sum_Exact_Mult DELETE Exact_Exp OK DELETE Cnt_Mult OK OPEN Sum_Exact_Mult DEFINE FIELD Exact_Pct COMPUTED
Unexpectedly, there were records with negative values in the Quantity Received field. Indeed, the negative quantities accounted for approximately 10 percent of all transactions and a total of nearly 500 items. Upon learning this, the team leader requested further investigation. The results proved that the inventory manager was stealing items from the warehouse and creating receipts with negative amounts. The manager created these negative receipts, when he had taken items from inventory, to ensure that the system totals agreed with the physical inventory. A detailed, computer-based analysis revealed that more than $802,000 had been stolen in the previous year.
1.000*Cnt_Mult/count*100 INDEX ON Exact_Pct D Exact_Tot D TO "Exact_Pct" IF Exact_Pct>0 OPEN STATISTICS ON Exact_Pct TO SCREEN NUMBER 5 highchar=value IF RECNO()=1 highnum=STR(max1,18,'9,999,999,999.99')
INDEX ON Exact_Tot D TO "Exact_Tot" IF Exact_Tot>0 OPEN
STATISTICS ON Exact_Tot TO SCREEN NUMBER 5
highamtchar=Value IF RECNO()=1
highamtnum=STR(max1,18,'9,999,999,999.99')
DIALOG (DIALOG TITLE "Data Profile - Most Exact Multiples Results" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 432 12 DEFAULT 1 ) (TEXT TITLE "Exact multiples for" AT 36 52 ) (TEXT TITLE "A file, Sum_Exact_Mult, has been created containing:" AT 24 16 ) (EDIT TO "highamtnum" AT 108 252 WIDTH 154 ) (TEXT TITLE "Amount" AT 156 136 ) (TEXT TITLE "Total of exact multiples" AT 108 220 ) (EDIT TO "highamtchar" AT 108 168 WIDTH 154 ) (TEXT TITLE "% number of transactions" AT 312 136 ) (EDIT TO "highchar" AT 312 168 WIDTH 156 ) (TEXT TITLE "% of exact multiples" AT 324 220 ) (EDIT TO "highnum" AT 312 252 WIDTH 159 ) (EDIT TO "charfield" AT 168 48 WIDTH 207 ) (TEXT TITLE "Highest based on:" AT 36 100 WIDTH 78 HEIGHT 38 )
DELETE highnum OK
DELETE highchar OK
DELETE highamtnum OK
DELETE highamtchar OK
Case Study: Exact Multiples Travel expenses had always been a concern for the auditors because the controls were weak. Employees had a maximum per diem rate when traveling, but had to submit actual receipts to cover the expenses. Maximums were also established for meals: breakfast $10.00, lunch $20.00, dinner $30.00, and hotel $200.00. The auditors used the Data Profile tests to identify transactions that were multiples of $10.00 or multiples of $100.00. These transactions were compared to the paper receipts to ensure that the amounts expensed were appropriate. The manual review determined that some people were charging the maximum rates for meals and hotels, though the receipts did not justify the amounts.
Filtering and Drilling Down
All of the data profiling techniques discussed in this chapter can be improved by using filters, either prior to running the tests or during the follow-up. Filters offer an easy way to move from a very general view of the data to a very focused view of individual records. This is called “drilling down.”
Filtering before Profiling The objective of creating a data profile is to identify suspect transactions for further investigation. Filtering the data before running any tests can speed up this process. Usually, the filter would be applied to a character or date field that is related to the numeric field that will be profiled. Here are some ways to apply filters beforehand:
For the Statistics and Stratify tests, try dividing the data into categories for which you expect the results to be similar. For example, if business activity is not strongly seasonal, filter by month. If departments, divisions, or regions are similar in size, filter by each one in turn. The advantage of such filtering is that it provides an additional standard of comparison. If the total for the top 5 percent of invoice amounts
computed for the entire file seems high, then filtering by month or region and repeating the test will quickly tell the user whether the unusually high entries belong to one particular time or place.
For the Round Amounts test, try filtering out items that are always going to be round for legitimate reasons, so that the remaining rounded items stand out more. For example, rent payments and many kinds of regular service fees are likely to be rounded to the nearest $10, $50, or $100. If these are identified and set aside, perhaps using a general ledger code or vendor name, then the analysis will focus on the remaining items. The proportion of rounded items remaining will be closer to the expected proportion; therefore, abnormal items will stand out more.
For the Exact Multiples tests, it may be helpful to keep track of items from previous periods that tend to show up frequently for legitimate reasons. The user can then either filter for these values beforehand or simply keep a list handy when he or she is reviewing the detailed results, so these items can be skipped in the investigation.
Filtering after Profiling Once one of the Data Profile tests has been run, the next step is to drill down to the questionable transactions to extract further details. Drilling down can be as simple as picking a narrower range of amounts and examining each record in that range, or it can involve multiple steps. For example:
A Least/Most Used Items test of a payables file shows ABC Corporation as the vendor for a total of 2,610 purchases. First, set the filter Vendor = 'ABC Corporation' to isolate
FreqChar
OPEN %infile%
SET FILTER %filt1%
CLASSIFY ON %charfield% as 'Value' ACC %numfield% AS 'Total' TO Freq_Used_Character
OPEN Freq_Used_Character
INDEX ON Count Total TO "Freqchar" OPEN
STATISTICS ON Count TO SCREEN NUMBER 5
lownum=STR(min1,13,'9,999,999,999')
lowchar=Value IF RECNO()=1
highnum=STR(max1,13,'9,999,999,999')
LOCATE RECORD count1
highchar=Value
DIALOG (DIALOG TITLE "Data Profile - Least/Most Used Results" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (TEXT TITLE "Least/most used" AT 24 76 ) (TEXT TITLE "A file, Freq_Used_Character, has been created containing:" AT 24 16 WIDTH 358 HEIGHT 33 ) (EDIT TO "lownum" AT 156 216 WIDTH 154 ) (TEXT TITLE "Least used" AT 192 112 ) (TEXT TITLE "Number of times used" AT 156 184 ) (EDIT TO "lowchar" AT 156 144 WIDTH 154 ) (TEXT TITLE "Most used" AT 384 112 ) (EDIT TO "highchar" AT 348 144 WIDTH 151 ) (TEXT TITLE "Number of times used" AT 348 184 ) (EDIT TO "highnum" AT 348 216 WIDTH 151 ) (EDIT TO "charfield" AT 156 72 WIDTH 165 )
DELETE lownum OK
DELETE lowchar OK
DELETE highnum OK
DELETE highchar OK
the relevant records. Then run the Stratify test again on this subset of the data to determine what range of amounts was paid to ABC Corporation. Finally, use the Classify command on the Purchasing Agent field, and accumulate the Amount field to determine which purchasing agent or agents handle ABC Corporation, and in what proportions.
The largest amount shown in a Statistics test was $78,300.13, and the 10th largest was $68,934.00. You can apply the filter AMOUNT >= 68934.00 to isolate the records with the top 10 amounts for further review.
A Frequently Used Values test reveals that $1,003.82 appeared 213 times in a payables file. First, set the filter Amount = 1003.82 to isolate these records. Then use the Classify command as needed to focus on where the original transactions occurred. For example, classifying on the vendor code will indicate whether the transactions came from one source or several. Classifying on the Financial Account Manager field will establish who authorized the payments. Classifying on Description or another related field establishes whether the same item was purchased.
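The filter-then-classify pattern in these examples can be sketched in Python as follows. This is only an illustration of the drill-down idea; the function name, the dictionary record layout, and the field names are assumptions, and in ACL the same steps are carried out with a filter and the Classify command.

```python
from collections import defaultdict
from decimal import Decimal


def drill_down(records, amount, classify_on):
    """Isolate records with a given amount, then summarize them by another field."""
    subset = [r for r in records if r["amount"] == amount]
    classes = defaultdict(lambda: {"count": 0, "total": Decimal("0")})
    for r in subset:
        entry = classes[r[classify_on]]
        entry["count"] += 1
        entry["total"] += r["amount"]
    return subset, dict(classes)
```

Calling the function again with a different `classify_on` field (for example, the authorizing manager instead of the vendor code) corresponds to repeating the Classify step on the same filtered records.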
7
Ratio Analysis
Pinpointing Suspect Transactions and Trends
Running Ratio Analysis
When assessing a company’s performance over time, or comparing a company to others like it, a favored strategy is the use of financial ratios such as debt-to-equity, inventory turnover, or return on assets. These give indications about the relative health of a company and any upward or downward trends.
To run the max/max2 and max/min ratios test, follow these steps:
1. From the Fraud Menu, choose Ratio Analysis. Click [OK]. The Type Selection dialog box appears:
A fraud detection ratio is a similar kind of tool. As with financial ratios, it is used to search for sudden changes or disturbing trends. One critical difference is that fraud ratios are often built from a single data source, such as an accounts payable file, rather than from separate data files. Ratio analysis is perhaps the single most powerful analytical technique in fraud detection because it can highlight individual anomalies in files containing millions of records. The key to ratio analysis is the variation in the selected numeric field for the given category. For example, what if an invoice arrives for $45,000 from a vendor that normally never issues invoices for more than $500? One likely explanation is a data entry error. In this case, it could be a missing decimal point. The ratio of largest invoice to second largest for that particular vendor will be 90 to 1—a red-flag warning. Similarly, if a routine monthly expense suddenly doubles, one likely cause is double-billing, either by error or from fraud. Even if you have never run a ratio analysis before, the results will be immediately useful. The ratios for each unique category are reported, so that unusually high values for the max/max2 ratio or the num1/num2 ratio will be apparent. However, ratio analysis becomes increasingly more effective when it is repeated over long periods. Against an established historical background, sudden changes in ratios stand out more clearly.
2. Select Max/max2 and max/min ratios. Click [OK]. The File Selection dialog box appears:
3. From the Select input file drop-down list, choose the file to be analyzed. In the Specify output file text box, enter the name of the output file. To apply a filter, select the Apply filter check box. Click [OK]. The Field Selection dialog box appears:
4. Select both a character field and a numeric field. Click [OK]. The User Criteria dialog box appears.
Note: The Ratio Analysis test takes offsetting entries into account (i.e., eliminates entries where there is a reversing entry for the same amount). However, you should look for reversing entries in the form of negative values when reviewing the detailed results. Also, careful consideration should be given when applying an optional filter.
5. Enter the desired analysis criteria in the text boxes provided. Click [OK]. The Results dialog box appears:
6. After viewing the results, click [OK]. The results can also be viewed in both the View and Command Log windows.
To run the two-fields ratios test, follow these steps:
1. From the Fraud Menu, choose Ratio Analysis. Click [OK]. The Type Selection dialog box appears:
Ratio_Analysis SET SAFETY OFF SET ECHO NONE DIALOG (DIALOG TITLE "Ratio Analysis - Type Selection" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (RADIOBUTTON TITLE "Max/max2 and max/min ratios;Two fields ratios (e.g., yr1/yr2); Exit" TO "ratio_type" AT 108 120 HEIGHT 87 DEFAULT 3 ) (TEXT TITLE "Select the type of ratio analysis:" AT 36 76 ) IF ratio_type=1 DO Ratio_Analysis1 IF ratio_type=2 DO ratio_Analysis2 DELETE ratio_type OK SET ECHO ON SET SAFETY ON
2. Select Two fields ratios. Click [OK]. The File Selection dialog box appears:

Ratio_Analysis1
SET SAFETY OFF
error_test=T
CLOSE
max=0.00
max2=0.00
min=0.00
v_total=0.00
ctr=1
max_cnt=0
prev_numfield=0.00
prev_sumfield=Blanks(32)
DIALOG (DIALOG TITLE "Ratio Analysis - Max/ Min File Selection" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "OK;Cancel" AT 612 24 DEFAULT 1 ) (ITEM TITLE "f" TO "infile" AT 240 168 WIDTH 288 HEIGHT 104 ) (TEXT TITLE "Select input file" AT 72 172 ) (TEXT TITLE "Specify output file (no spaces)" AT 72 280 WIDTH 122 HEIGHT 43 ) (EDIT TO "useroutfile" AT 240 288 WIDTH 287 DEFAULT "Ratio1_ Results") (TEXT TITLE "Test determines the highest (max), second highest (max2), and lowest (min) values for a selected numeric field." AT 48 64 WIDTH 509 HEIGHT 40 ) (TEXT TITLE "Ratios: max/max2 and max/min are calculated for each value of a selected character field." AT 48 112 WIDTH 483 HEIGHT 39 ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 240 348 ) outfile=REPLACE('%useroutfile%',' ','_') OPEN %infile% IF filt_sel DO Filter_Selection DO Ratio1_Dialog1 WHILE error_test error_test=T DO Ratio1_Dialog2 WHILE error_test SORT on %sumfield% abs(%numfield%) D %numfield% D to Zratio_temp2 OPEN DEFINE field Pos_Cnt COMP 1 if %numfield%>0 0
3. From the Select input file drop-down list, choose the file to be analyzed. In the Specify output file text box, enter the name of the output file. To apply a filter, select the Apply filter check box. Click [OK]. The Field Selection dialog box appears:
4. Select a character field and two different numeric fields. Click [OK]. The User Criteria dialog box appears.
5. Enter the desired analysis criteria in the text boxes provided. Click [OK]. The Results dialog box appears.
6. After viewing the results, click [OK]. The results can be viewed in both the View and Command Log windows.
DEFINE field Neg_Cnt COMP 1 if %numfield%<0 0
How the Scripts Work
The Ratio Analysis scripts are structurally complex. Cross-tabulation is structurally more difficult, while the Data Profile test is easier to conceptualize, although it requires more scripts and more lines of code. The Gaps test and the Completeness and Integrity test are both shorter and simpler.

The two main types of ratio analysis are quite different. Therefore, two separate sets of scripts are employed almost from the outset. The first examines ratios based on a single field, such as maximum-to-minimum (max/min) or maximum-to-second-highest (max/max2). The second examines ratios based on two fields, for example, comparing the current year’s results to the previous year’s results, or one division to another.

Notice that the initial Ratio_Analysis script has only one purpose. It allows you to choose one of the two types of analysis, and launches the appropriate script series. It does not ask you for a file name or a field name. All other processing is left to Ratio_Analysis1 or Ratio_Analysis2 and their associated subscripts.
Max/Max2 and Max/Min Ratios
The subscript Ratio_Analysis1 creates a file containing the highest (max), second highest (max2), and lowest (min) values of the selected numeric field for each value of the selected category (character field). It also calculates the total value of the numeric field, the number of records, and the ratios max/max2 and max/min for each unique category. The script also executes the Statistics and Stratify commands on the max/max2 ratio. The results of these commands can be viewed in the command log. Finally, the script creates a filter, Max_Max2_Filter, and a descending index, Max_Max2. Both should be applied when reviewing the detailed results.
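The per-category calculation can be sketched in Python as follows. This is a simplified illustration, not a translation of Ratio_Analysis1: it assumes positive amounts and does not net out offsetting entries, and the function names are hypothetical. The default thresholds (2, 5, and 5) are the defaults shown in the User Criteria dialog listing.

```python
from collections import defaultdict


def max_ratios(records):
    """Compute max, max2, min, and their ratios for each category.

    `records` is an iterable of (category, amount) pairs.
    """
    groups = defaultdict(list)
    for category, amount in records:
        groups[category].append(amount)
    results = {}
    for category, amounts in groups.items():
        ordered = sorted(amounts, reverse=True)
        mx = ordered[0]
        mx2 = ordered[1] if len(ordered) > 1 else ordered[0]
        mn = ordered[-1]
        results[category] = {
            "count": len(amounts),
            "total": sum(amounts),
            "max": mx,
            "max2": mx2,
            "min": mn,
            "max_max2": mx / mx2 if mx2 else None,
            "max_min": mx / mn if mn else None,
        }
    return results


def flagged(results, cnt=5, maxrat=2, minrat=5):
    """List categories exceeding the user criteria, highest max/max2 first."""
    hits = [
        c for c, r in results.items()
        if r["count"] > cnt and r["min"] > 1
        and r["max_max2"] is not None and r["max_max2"] > maxrat
        and r["max_min"] is not None and r["max_min"] > minrat
    ]
    return sorted(hits, key=lambda c: results[c]["max_max2"], reverse=True)
```

For a vendor whose largest invoice is $45,000 and whose second largest is $500, the max/max2 ratio is 90, which places that vendor at the top of the flagged list.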
SUMMARIZE on %sumfield% str(abs(%numfield%),12) acc %numfield% as 'Temp_Total' pos_cnt neg_ cnt OTHER %numfield% to Zratio_temp OPEN SORT on %sumfield% %numfield% d to Zratio_temp2 if Temp_total>0 OPEN GROUP if prev_sumfield=%sumfield% max_max2=1.00 if max_cnt>=2 min=%numfield% ctr=ctr+pos_cnt-neg_cnt max2=%numfield% if max_cnt=1 max_max2 = max/max2 if max_cnt=1 max_min = max/min v_Total=v_Total+Temp_Total max_cnt=0 ELSE EXTRACT prev_sumfield as sumfield wid 10 v_Total as 'Total' wid 14 max wid 11 max2 wid 11 max_max2 wid 8 min wid 9 max_min wid 8 ctr wid 5 to %outfile% if recno()>1 EOF prev_sumfield=%sumfield% max=%numfield% max2=%numfield% v_Total=Temp_Total min=%numfield% max_max2=1.00 max_min=1.00 ctr=pos_cnt-neg_cnt max_cnt=pos_cnt-neg_cnt END OPEN %outfile% SET ECHO ON STATISTICS ON Max_Max2 Max_Min STRATIFY ON Max_Max2 header="Max/Max2 Ratio Analysis" acc 1 as 'Cnt'
SET ECHO OFF INDEX ON Max_Max2 D TO "Max_Max2" OPEN DEFINE FIELD Max_Max2_Filter COMPUTED
Flow of Data in Ratio Analysis
ctr>%cnt% AND max_max2>%maxrat% AND max_min>%minrat% AND min>1 SET FILTER Max_Max2_Filter COUNT ratio_cnt=STR(count1,6,'9,999') DIALOG (DIALOG TITLE "Ratio Analysis - Max/ Min Ratio Results" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 24 DEFAULT 1 ) (TEXT TITLE "Number of records with ratios and number of transactions greater than user-defined values." AT 36 124 WIDTH 257 HEIGHT 56 ) (EDIT TO "ratio_cnt" AT 300 132 WIDTH 136 ) (TEXT TITLE "File" AT 36 76 ) (EDIT TO "outfile" AT 72 72 WIDTH 194 ) (TEXT TITLE "created" AT 276 76 ) (TEXT TITLE "Apply filter and index to the ratio analysis output file to see detailed results." AT 24 232 WIDTH 494 ) SET SAFETY ON DELETE Zratio_Temp.fil OK DELETE FORMAT Zratio_Temp OK DELETE Zratio_Temp2.fil OK DELETE FORMAT Zratio_Temp2 OK DELETE count1 OK DELETE v_total OK DELETE maxrat OK DELETE minrat OK DELETE sumfield OK DELETE numfield OK
Two Fields: Num_field1 / Num_field2 Ratio
The subscript Ratio_Analysis2 creates a file containing the totals of Num_field1 and Num_field2 for each unique value of the selected character field. It also calculates the number of records and the ratio Num1_Num2 (Num_field1 / Num_field2) for each unique category. In addition, this script executes the Statistics and Stratify commands on the Num_field1/Num_field2 ratio. The results of these commands can be viewed in the command log. Finally, the script creates a filter, Num1_Num2_Filter, and a descending index, Num1_Num2. Again, both should be applied when reviewing the detailed results.
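A minimal Python sketch of the two-fields calculation is shown below. The function name and the (category, num1, num2) record layout are assumptions for illustration; it is not the Ratio_Analysis2 script. Categories whose second total is zero get no ratio rather than causing a division error.

```python
from collections import defaultdict


def two_field_ratios(records):
    """Total two numeric fields per category and compute num1/num2.

    `records` is an iterable of (category, num1, num2) tuples.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # num1 total, num2 total, record count
    for category, num1, num2 in records:
        entry = totals[category]
        entry[0] += num1
        entry[1] += num2
        entry[2] += 1
    return {
        category: {
            "num1": t1,
            "num2": t2,
            "count": n,
            "ratio": t1 / t2 if t2 else None,
        }
        for category, (t1, t2, n) in totals.items()
    }
```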
DELETE cnt OK DELETE max_cnt OK DELETE ratio_cnt OK DELETE outfile OK DELETE useroutfile OK DELETE infile OK DELETE filt1 OK DELETE error_test OK DELETE prev_sumfield OK DELETE prev_numfield OK SET ECHO ON
Deleting Temporary Variables, Fields, and Files
The scripts delete the following temporary variables, fields, and files after use:

Ratio Analysis Variables
count1        infile         outfile               useroutfile
cnt           maxrat         prev_numfield         Zratio_Temp.fil
ctr           maxrat2        prev_sumfield
error_test    minrat         ratio_type
filt1         numfield       sumfield
filt_sel      numfield1–2    Temp_Num1_Num2.fil

Ratio1_Dialog1
DIALOG (DIALOG TITLE "Ratio Analysis - Field Selection" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "OK;Cancel" AT 420 12 DEFAULT 1 ) (TEXT TITLE "Select character field - ratios will be calculated for each unique value of this field." AT 48 112 WIDTH 197 HEIGHT 53 ) (TEXT TITLE "Select numeric field - ratios will be calculated for max/max2 and max/min values of this field." AT 288 112 WIDTH 220 HEIGHT 55 ) (ITEM TITLE "C" TO "sumfield" AT 48 180 WIDTH 188 HEIGHT 174 ) (ITEM TITLE "N" TO "numfield" AT 288 180 WIDTH 210 HEIGHT 172 ) (TEXT TITLE "Ratio Analysis - Max/Min" AT 48 40 )
IF MATCH(0,LEN(sumfield),LEN(numfield)) PAUSE 'You must select a character field and a numeric field'
IF NOT MATCH(0,LEN(sumfield),LEN(numfield)) error_test=F

Ratio1_Dialog2
DIALOG (DIALOG TITLE "Ratio Analysis - User Criteria" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 24 DEFAULT 1 ) (TEXT TITLE "Max/max2 ratio" AT 108 172 ) (EDIT TO "Maxrat" AT 312 168 WIDTH 33 DEFAULT "2" ) (TEXT TITLE "Max/min ratio" AT 108 232 ) (EDIT TO "Minrat" AT 312 228 WIDTH 31 DEFAULT "5" ) (TEXT TITLE "Number of transactions" AT 108 292 ) (EDIT TO "Cnt" AT 312 288 WIDTH 31 DEFAULT "5" ) (TEXT TITLE "Records where max/max2 and max/min ratios and the number of transactions are greater than user-defined values:" AT 36 88 WIDTH 388 HEIGHT 54 )
IF MATCH(0,LEN(maxrat),LEN(minrat),LEN(cnt)) PAUSE 'Enter values for max/max2, max/min, and number of transactions'
IF NOT MATCH(0,LEN(maxrat),LEN(minrat),LEN(cnt)) error_test=F
Review and Analysis The key to successful ratio analysis is the expectation of consistency. The ratio of the highest price paid for a standard product to the second highest should not be large if prices over time have not changed. The ratio of the highest to the lowest should also not be large, unless transactions of this kind became larger during a specific week or quarter or year. This Toolkit computes four commonly used ratios:
Ratio of the highest value to the lowest value (maximum/minimum).
Ratio of the highest value to the next highest (maximum/second highest).
Ratio of the current year to the previous year.
Ratio of one operational area to another.
For example, if there is concern about prices paid for a product, the user can calculate the ratio of the maximum unit price to the minimum unit price for each product or stock number. If the ratio is close to 1.00, there is little variance between the highest and lowest prices paid. However, if the ratio is large, this could indicate that the price paid for the product was too high:

Product Line    Max    Min    Max/Min
Product 1       235    127    1.85
Product 2       289    285    1.01
The ratio of maximum to minimum for Product 1 is large at 1.85, whereas Product 2 has a smaller variance in the unit prices with a ratio of 1.01. Open the original input file and apply a filter to view only those records where the category value had an unusually large ratio, for example Product 1. Apply an index on the selected numeric field to display the detailed transactions in increasing or decreasing order of value, then examine the records to determine why the ratio was higher than expected. Paying abnormally high unit prices for products may be a symptom of kickbacks.

The ratio of the maximum to the second highest value can also highlight possible frauds. A large ratio indicates that the maximum value is significantly larger than the second highest value, which points to an anomaly in the data. Auditors and fraud investigators are interested in these unusual transactions because they represent a deviation from the norm. Unexplained deviations can be symptoms of fraud. As the following example suggests, high ratios in accounts payable are often an indication of incorrect payments to a vendor.
Vendor        Count    Total         Max          Max2        Max/Max2
ABC CORP      14       58,634.37     37,522.90    4,495.59    8.34
XYZ TECH      19       75,695.25     31,553.00    4,882.04    6.46
Z-BUSINESS    129      253,411.33    35,645.44    6,700.00    5.32

Filter: Max/Max2 > 5 and Max/Min > 9 and Cnt > 10
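The max/min and max/max2 calculations described above can be sketched outside ACL in a few lines of Python. This is only an illustration of the arithmetic, not the Toolkit's script: the function names, the record layout (category, amount), and the middle price of 180 are assumptions made for the example, and amounts are assumed to be positive.

```python
from collections import defaultdict

def max_ratios(records):
    """Compute count, total, max, max2, min and the two ratios per category.

    `records` is an iterable of (category, amount) pairs.
    Assumption: every amount is positive, so no division by zero occurs.
    """
    groups = defaultdict(list)
    for category, amount in records:
        groups[category].append(amount)

    results = {}
    for category, amounts in groups.items():
        ordered = sorted(amounts, reverse=True)
        largest = ordered[0]
        # With a single transaction, max2 falls back to max (ratio 1.00).
        second = ordered[1] if len(ordered) > 1 else largest
        smallest = ordered[-1]
        results[category] = {
            "count": len(ordered),
            "total": sum(ordered),
            "max": largest,
            "max2": second,
            "min": smallest,
            "max_max2": largest / second,
            "max_min": largest / smallest,
        }
    return results

def flag(results, maxrat, minrat, cnt):
    """Return categories whose ratios and counts all exceed the user criteria."""
    return sorted(
        category
        for category, r in results.items()
        if r["count"] > cnt and r["max_max2"] > maxrat and r["max_min"] > minrat
    )

# Unit prices: max and min come from the product table above;
# the middle value 180 is hypothetical.
sample = [
    ("Product 1", 235), ("Product 1", 180), ("Product 1", 127),
    ("Product 2", 289), ("Product 2", 285),
]
ratios = max_ratios(sample)
for product, r in ratios.items():
    print(product, round(r["max_min"], 2), round(r["max_max2"], 2))
```

Calling `flag(ratios, maxrat=1.2, minrat=1.5, cnt=2)` on this sample returns only Product 1, which mirrors the way the filter line above combines the two ratios with a minimum transaction count.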
Ratio_Analysis2
error_test=T
DIALOG (DIALOG TITLE "Ratio Analysis - Two Fields Ratio File Selection" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "OK;Cancel" AT 432 12 DEFAULT 1 ) (ITEM TITLE "f" TO "infile" AT 192 132 WIDTH 250 HEIGHT 124 ) (TEXT TITLE "Select input file" AT 24 136 ) (TEXT TITLE "Specify output file (no spaces)" AT 24 280 WIDTH 135 HEIGHT 43 ) (EDIT TO "useroutfile" AT 192 276 WIDTH 251 DEFAULT "Ratio2_Results" ) (TEXT TITLE "This test computes the ratio of two selected numeric fields for each value of the selected character field." AT 24 76 WIDTH 390 HEIGHT 37 ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 192 348 )
outfile=REPLACE('%useroutfile%',' ','_')
OPEN %infile%
IF filt_sel DO Filter_Selection
DO Ratio2_Dialog1 WHILE error_test
error_test=T
DO Ratio2_Dialog2 WHILE error_test
CLASSIFY ON %sumfield% ACC %numfield1% %numfield2% TO Temp_Num1_Num2 OPEN
EXTRACT %sumfield% %numfield1% %numfield2% COUNT TO %outfile% OPEN
DEFINE FIELD Num1_Num2 COMPUTED

1.00*%numfield1%/%numfield2% IF %numfield2%<>0
999.99
SET ECHO ON
STATISTICS ON Num1_Num2
STRATIFY ON Num1_Num2 header="Field1/Field2 Ratio Analysis" ACCUM 1 as 'Cnt'
SET ECHO NONE
INDEX ON Num1_Num2 D TO "Num1_Num2" OPEN
maxrat2=EXCLUDE(STR(1.000/%maxrat%,12),' ')
DEFINE FIELD Num1_Num2_Filter COMPUTED COUNT>%cnt% AND (num1_num2>%maxrat% OR num1_num2<%maxrat2%)
SET FILTER Num1_Num2_Filter
COUNT
ratio_cnt=STR(count1,6,'9,999')
DIALOG (DIALOG TITLE "Ratio Analysis - Two Fields Results" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (TEXT TITLE "Number of records with ratios and number of transactions greater than user-defined values." AT 36 136 WIDTH 257 HEIGHT 56 ) (EDIT TO "ratio_cnt" AT 300 144 ) (TEXT TITLE "File" AT 36 76 ) (EDIT TO "outfile" AT 72 72 WIDTH 204 ) (TEXT TITLE "created" AT 288 76 ) (TEXT TITLE "Apply filter and index to ratio analysis output file to see detailed results." AT 24 220 )
SET SAFETY ON
DELETE count1 OK
DELETE ratio_cnt OK
DELETE maxrat OK
DELETE maxrat2 OK

The ratio of one year to another, by general ledger account code, can highlight waste, abuse, or fraud. For example, GL accounts that were once seldom used may suddenly have charges coded to them. This may be a case where fraud has been hidden in dormant accounts.

GL Account      FY 2008    FY 2009    Ratio (FY2008/FY2009)
Bonuses         $12,885    $125       103.0800
Travel          $50,012    $52,190    0.9583
Spec Purpose    $16        $14,080    0.0011

The closer the ratio FY 2008/FY 2009 is to 1.000, the less variance there is between the amounts coded to the GL account in the two years. Accounts showing large or small ratios should be examined to discover the cause. For example, the low ratio of FY 2008/FY 2009 (0.0011) for the GL Special Purpose account indicates that FY 2009 has significantly more charges to that GL account than FY 2008. The high ratio (103.0800) for GL Bonuses indicates that the amount charged to Bonuses was much higher in FY 2008 than in FY 2009.

The ratio of two operational areas can also highlight anomalies that may be symptoms of fraud. For example, the ratios for two manufacturing plants would be of interest if one or more of the ratios were significantly different from the other ratios.
In the following table, the high ratios for prepaid expenses (6.58) and receivables (1.37) should be reviewed by the auditors to determine the cause:

Account              Plant 1    Plant 2    Ratio
Prepaid Exp          127,643    19,407     6.58
Receivables          244,775    178,087    1.37
Inventories          501,417    492,920    1.02
Investment           4,217      4,124      1.02
Cash                 75,062     77,497     0.97
Prop/Plant/Equip     334,849    343,446    0.97
Other Def Charges    1,921      2,088      0.92
The follow-up review for both types of ratio analysis should examine the detailed transaction file, looking at all transactions for any category with a high ratio. When reviewing the details, pay close attention to reversing entries. The user should also look for patterns in the data, such as transactions always entered by the same clerk, or under a userid different from the one on all other transactions.
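The two-field ratio test can also be sketched in Python, outside ACL. The sketch totals two numeric fields per category, divides them, and flags categories whose ratio is above a threshold or below its reciprocal, which is how the script's filter treats field1/field2 and field2/field1 symmetrically. The function names and record layout are assumptions made for illustration; the 999.99 placeholder for a zero denominator is taken from the script's computed field.

```python
from collections import defaultdict

def two_field_ratios(records):
    """Total two numeric fields per category and compute their ratio.

    `records` is an iterable of (category, num1, num2) tuples.
    Returns {category: (sum1, sum2, count, ratio)}.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for category, num1, num2 in records:
        entry = totals[category]
        entry[0] += num1
        entry[1] += num2
        entry[2] += 1

    ratios = {}
    for category, (sum1, sum2, count) in totals.items():
        # 999.99 marks a zero denominator, as in the script's computed field.
        ratio = sum1 / sum2 if sum2 != 0 else 999.99
        ratios[category] = (sum1, sum2, count, ratio)
    return ratios

def flagged(ratios, maxrat, cnt):
    """Categories with more than `cnt` records and a ratio above `maxrat`
    or below 1/maxrat."""
    return [
        category
        for category, (_, _, count, ratio) in ratios.items()
        if count > cnt and (ratio > maxrat or ratio < 1 / maxrat)
    ]

# One record per account, using three rows of the plant table above.
plants = [
    ("Prepaid Exp", 127643, 19407),
    ("Receivables", 244775, 178087),
    ("Cash", 75062, 77497),
]
plant_ratios = two_field_ratios(plants)
print(flagged(plant_ratios, maxrat=2, cnt=0))
```

With a threshold of 2, only Prepaid Exp is flagged; Receivables (1.37) would need a lower threshold, which is why the choice of ratio value in the User Criteria dialog matters.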
DELETE sumfield OK
DELETE numfield1 OK
DELETE numfield2 OK
DELETE cnt OK
DELETE outfile OK
DELETE useroutfile OK
DELETE infile OK
DELETE filt1 OK
DELETE error_test OK
DELETE FORMAT Temp_Num1_Num2 OK
DELETE Temp_Num1_Num2.fil OK
SET ECHO ON

Ratio2_Dialog1
DIALOG (DIALOG TITLE "Ratio Analysis - Field Selection" WIDTH 740 HEIGHT 400 ) (BUTTONSET TITLE "OK;Cancel" AT 624 24 DEFAULT 1 ) (TEXT TITLE "Ratio Analysis - Two Fields" AT 36 52 ) (TEXT TITLE "Select character field - the ratio will be calculated for each unique value of this field." AT 36 112 WIDTH 200 HEIGHT 51 ) (TEXT TITLE "Select the two numeric fields for which the ratio will be calculated." AT 384 124 WIDTH 222 HEIGHT 38 ) (ITEM TITLE "C" TO "sumfield" AT 36 180 WIDTH 194 HEIGHT 162 ) (ITEM TITLE "N" TO "numfield1" AT 276 180 WIDTH 193 HEIGHT 161 ) (ITEM TITLE "N" TO "numfield2" AT 516 180 WIDTH 191 HEIGHT 162 )
IF MATCH(0,LEN(sumfield),LEN(numfield1),LEN(numfield2)) PAUSE 'Select a character field and two numeric fields'
IF NOT MATCH(0,LEN(sumfield),LEN(numfield1),LEN(numfield2)) error_test=F

Ratio2_Dialog2
DIALOG (DIALOG TITLE "Ratio Analysis - User Criteria" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (TEXT TITLE "Num_field ratio" AT 72 208 ) (EDIT TO "maxrat" AT 288 204 WIDTH 50 DEFAULT "2" ) (TEXT TITLE "Number of transactions" AT 72 280 ) (EDIT TO "cnt" AT 288 276 WIDTH 50 DEFAULT "5" ) (TEXT TITLE "Records in which the ratio of num_field1 / num_field2 or num_field2 / num_field1 is greater than the ratio value provided below and the number of transactions are greater than the user-defined value:" AT 36 88 WIDTH 419 HEIGHT 80 )
IF MATCH(0,LEN(maxrat),LEN(cnt)) PAUSE 'Select num_field ratio and number of transactions'
IF NOT MATCH(0,LEN(maxrat),LEN(cnt)) error_test=F

Case Study: Dormant but Not Forgotten
The teller thought no one would notice a few small withdrawals from accounts that had been dormant for years. He figured that the owners of the accounts had either forgotten about the money or died; either way, they wouldn’t miss it. So one week he removed a few hundred dollars from seven previously dormant accounts. What he didn’t realize was that the auditors were monitoring these accounts. Given that the bank had over 12,000 savings accounts, and an even greater number of checking accounts, a manual review of the month’s transactions was out of the question.

One of the audit objectives was to ensure that activity on dormant accounts was valid. In particular, the auditors were looking for accounts that had had no activity in the previous month but suddenly showed activity in the current month. The monthly ratio analysis accomplished this by allowing the auditors to compare the total withdrawals during the current and previous months for all accounts in which the total of the current month’s withdrawals was greater than $0.00. Then, all accounts with a ratio (previous month / current month) equal to 0.00 were reviewed in detail. Truly dormant accounts were filtered out, since the current month’s withdrawals were 0.00. However, since the accounts the teller was stealing from had been dormant, the ratio of previous to current was always 0.00. For example, if he took $125.00 from one account, the ratio would be 0.00/125.00 = 0.00; and if he took $75.00 from another account, the ratio would be 0.00/75.00 = 0.00. Thanks to the vigilance of the auditors, he was caught after just one month.
Case Study: Doctored Bills Auditors from the health care provider performed a yearly financial audit of major medical centers. In particular, they reviewed the patient billing system to determine if the charges assessed to the patient’s health care providers accurately reflected the medical procedure that had been performed. Although there was no fixed price for each procedure, the audit team had established a series of standard price ranges, using past experience as a guide. They used the Ratio Analysis tests to analyze the billing and calculate the ratio of the highest and lowest charges for each procedure. The auditing standards required that procedures with a ratio of highest to lowest greater than 1.30 be noted and that an additional review be performed. This quarter, three procedures had ratios higher than 1.30, the highest being 1.42.
Next, a filter was applied to highlight procedures with costs that deviated from the established price ratio. The auditors quickly determined that one doctor was charging significantly more than the other doctors for the same procedures. A comparison of the charges from the billing system with the payments recorded in the accounts receivable system revealed that the doctor was skimming some of the payments received. The amount recorded in the receivable system was in line with the usual billing amount for the procedures. The doctor was unable to justify the higher prices or explain the difference between the billing and the receivable systems.
8
Benford’s Law
Identifying Anomalous Data
What can make a false transaction stand out? The answer: expectation. One has to be expecting a different result to notice that something is amiss. Benford’s Law, developed by physicist Frank Benford, predicts digit frequencies in large, varied sets of data. Whenever data strays significantly from that prediction, there is reason to investigate. Benford’s Law holds that the first digit in a large list of numbers will be a 1 more often than a 2, and a 2 more often than a 3, and so on through all of the digits, with 9 as the least likely. Benford found that the first digit will be 1 about 30 percent of the time, whereas 9 will be the first digit a little less than 5 percent of the time. The law applies to a huge variety of data, including populations, inventories, financial transactions, and acreages. It typically does not apply when there is an upper limit on the values, such as heights or ages of people, or where the digits are used as symbols rather than quantities, such as credit card numbers. The law applies regardless of the units of measure involved. For example, the same set of transactions will comply with the law if restated in dollars, pounds, euros, or yen, just as a set of distances will comply whether measured in miles or kilometers.
Understanding Benford’s Law Benford published his formulas and research in 1938 in a paper titled “The Law of Anomalous Numbers.” The paper created lasting interest among academics, who still debate the theoretical merits of the law and the reasons for its existence. However, it is only in the last few years that Benford’s discovery has been put to practical use by auditors.
Benford’s formula for digit frequencies is

F(dd) = log10(1 + 1/dd)

In this formula, F is the frequency or probability of a digit combination occurring, expressed as a fraction of one or as a percentage, and log10 is the base-10 logarithm. The expression dd represents a string of one or more digits. Using any of the digits 1–9 in place of dd, you will get the first-digit frequencies shown in the table below. Using the digits 10–99 will produce the first-two-digit frequencies. For example, the expected frequency of 13 is log10(1 + 1/13) = 0.032, or 3.2 percent. The expected first-two-digit frequencies range from 4.1 percent for 10 to 0.4 percent for 99. This formula also works for three-digit and higher combinations. According to Benford’s formula, the expected frequencies for first and second digits, rounded to three decimal places, are as follows:

Digit    1st Digit Frequency    2nd Digit Frequency
0        not applicable         0.120
1        0.301                  0.114
2        0.176                  0.109
3        0.125                  0.104
4        0.097                  0.100
5        0.079                  0.097
6        0.067                  0.093
7        0.058                  0.090
8        0.051                  0.088
9        0.046                  0.085
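The table values can be reproduced directly from the formula. The following Python sketch (outside ACL; the function names are chosen for this example) computes the expected frequency for any leading digit combination, and derives the second-digit frequencies by summing the first-two-digit frequencies over all nine possible first digits.

```python
import math

def expected_frequency(dd):
    """Benford expected frequency for the leading digit combination dd
    (for example 1-9 for first digits, 10-99 for first two digits)."""
    return math.log10(1 + 1 / dd)

def second_digit_frequency(d):
    """Expected frequency of d (0-9) as the second digit: the sum of the
    first-two-digit frequencies 1d, 2d, ..., 9d."""
    return sum(expected_frequency(10 * first + d) for first in range(1, 10))

first_digit = {d: round(expected_frequency(d), 3) for d in range(1, 10)}
second_digit = {d: round(second_digit_frequency(d), 3) for d in range(10)}

print(first_digit)
print(second_digit)
print(round(expected_frequency(13), 3))  # the first-two-digits example for 13
```

Because the nine first-digit frequencies telescope to log10(10), they always sum to exactly 1, which is a quick sanity check on any hand-built table of expected values.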
Identifying Irregularities
Using Benford’s Law to identify unusual variations in digit frequencies is also known as digital analysis. The user selects targets for investigation by comparing the actual proportion of items with a given digit combination to the expected proportion. The combinations that show spikes (combinations significantly above or below the expected value) are then selected for further investigation.
In some situations, the digit frequencies in the data may not conform to Benford’s Law, but will form their own consistent pattern over time. For example, a government department, an insurance company, or a lender may be limited by statute to transactions under $500,000. The frequencies of amounts starting with the digits 1–4 might be consistently higher than normal, and those starting with 5–9 lower than normal. Yet from year to year, the results look the same. Deviations from this kind of “local” digit law can be detected by using the script Benford_Custom_Analysis. Digital analysis is not a complete audit technique in itself. Just because certain digit combinations show up more often than others does not prove that any kind of fraud or error has occurred. However, it can be very useful as a means of selecting targets for investigation, particularly to narrow the number of targets in a large data set.
Running Benford Analysis
It is easy to spot spikes on a graph of first-digit or first-two-digit combinations. On the graph shown above, there are several obvious spikes. However, when working with large amounts of data, or analyzing three digits or more, picking out targets for analysis can be difficult. To determine if a spike is truly significant, or just a random variation, you need a suitable measure. The Z-statistic, a standard tool for assessing probability, provides this measure. The Z-statistic measures how unlikely a given outcome is. Z-statistics of 1.000 or less are very common, and often occur at random. A Z-statistic of 1.96 or more will occur by chance only about 1 time in 20, and a Z-statistic of 2.57 or more only about 1 time in 100. So the higher the Z-statistic, the less likely it is that the outcome occurred by chance, and the stronger the case for investigating whether the data has been manipulated.
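To make the Z-statistic concrete, the following Python sketch uses one common formulation from digital analysis: the difference between the actual and expected proportions, reduced by a continuity correction of 1/(2n) when that correction is smaller than the difference, divided by the standard error of the expected proportion. Whether ACL’s ZStat values are computed in exactly this way is an assumption; the function names are illustrative only.

```python
import math

def z_statistic(actual, expected, n):
    """Z-statistic for an actual versus expected proportion over n records.

    Assumption: standard error is based on the expected proportion, with a
    continuity correction of 1/(2n) applied only when it is smaller than
    the absolute difference.
    """
    difference = abs(actual - expected)
    correction = 1 / (2 * n)
    if correction < difference:
        difference -= correction
    standard_error = math.sqrt(expected * (1 - expected) / n)
    return difference / standard_error

def two_tailed_probability(z):
    """Probability of a Z-statistic at least this large arising by chance,
    under a normal distribution."""
    return math.erfc(z / math.sqrt(2))

# Hypothetical example: first digit 1 appears in 35% of 1,000 records,
# against an expected 30.1%.
z = z_statistic(0.35, 0.301, 1000)
print(round(z, 2), round(two_tailed_probability(z), 4))

print(round(two_tailed_probability(1.96), 3))  # about 1 time in 20
print(round(two_tailed_probability(2.57), 3))  # about 1 time in 100
```

The last two lines reproduce the rules of thumb quoted above, so the same helper can be used to translate any Z-statistic in an output file into an approximate chance probability.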
To run a standard Benford analysis, follow these steps:
1. From the Fraud Menu, choose Benford Analysis. Click [OK]. The Type Selection dialog box appears.
2. Select Analysis using Benford’s Law expected values. Click [OK]. The Parameter Selection dialog box appears.
3. Select the desired number of leading digits. Select the input file definition to analyze from the Select input file drop-down list. In the Specify output file text box, enter the name of the output file. To apply a filter, select the Apply filter check box. Click [OK]. The Numeric Field Selection dialog box appears.
4. Select the numeric field for which the digit frequency will be calculated. Click [OK].

To view results, open the output file created by the Benford analysis and apply the index (same name as output file) to highlight frequencies that are significantly different from the expected Benford frequencies. The results can also be graphed. In the view window, highlight all of the records and the fields; right-click and select the option graph selected data. ACL will display a bar chart with the actual and expected frequencies and the Z-stat values.

Running Benford Custom Analysis
If your organization’s data does not follow Benford’s expected frequency, you can create your own expected frequency distribution and compare the actual frequencies to these values. This is a two-part process. First you must determine your own expected frequency distribution; then you can run the Benford Custom Analysis test on whatever data you wish to compare.
Creating the Custom Distribution
To create your own expected frequency distribution, follow this procedure:
1. Ensure that you have a clean data file (no fraudulent transactions) to use for the calculation of your custom Benford frequency distribution.
2. Determine the data’s standard frequency distribution for the first x digits by running the Benford_Analysis script against your clean data. This will create a custom frequency distribution that can be used as your standard, against which you can compare the frequency distribution of your other data files. For example, run Benford first three digits to create Ben3_AP_Std.
3. Create different standards for each of the first x digits you plan to use in subsequent analyses, and provide meaningful names. For example, Ben4_AR_Std for the first four digits of Accounts Receivable, and Ben2_Payroll_Std for the first two digits of payroll. The naming convention should help you to remember the data source from which the custom distribution was created, and the number of first digits.

Your custom distribution file must be created via the Benford_Analysis script or directly from the ACL Benford command. When you test against the custom distribution, the number of first digits must be the same for both files. For example, if your standard has four digits, you must specify four digits when you run the Benford Custom Analysis.

Benford
SET SAFETY OFF
SET ECHO NONE
DIALOG (DIALOG TITLE "Benford Analysis - Type Selection" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "&OK;&Cancel" AT 420 12 DEFAULT 1 ) (RADIOBUTTON TITLE "Analysis using Benford's Law expected values;Analysis using custom Benford values;Exit" TO "benford_type" AT 84 108 HEIGHT 87 DEFAULT 3 ) (TEXT TITLE "Select the type of Benford analysis:" AT 36 76 )
IF benford_type=1 DO Benford_Analysis
IF benford_type=2 DO Benford_Custom_Analysis
DELETE benford_type OK
SET ECHO ON
SET SAFETY ON
Division by Zero
If the custom distribution data includes digit combinations for which the actual count was zero, the Benford_Custom_Analysis script will substitute 0.001 to prevent a division by zero error. This is acceptable practice in statistical terms because the Z-statistics generated in this way are used only to rank the digit combinations, not to determine significance. (If you are not familiar with the statistical term significance in this context, there is nothing to be concerned about. This note is included to prevent those who may want to apply strict measures of significance from making a wrong assumption about their results.)
Testing against the Custom Distribution
Once the custom distribution has been created, you may test against it. To run a Benford custom analysis:
1. From the Fraud Menu, choose Benford Analysis. Click [OK]. The Type Selection dialog box appears.
2. Select Analysis using custom Benford values. Click [OK]. The Parameter Selection dialog box appears.
3. Select the desired number of leading digits. Select the input file definition to be analyzed from the Select input file drop-down list. Select the file containing the custom Benford frequencies from the Select file containing your expected values drop-down list. In the Specify output file text box, enter the name of the output file. To apply a filter, select the Apply filter check box. Click [OK]. The Numeric Field Selection dialog box appears.
4. Select the numeric field for which the digit frequency will be calculated. Click [OK]. Note: The numeric field must have the same name in both files. If this is not the case, edit the table layout to change the field name in one of the files.
5. The resulting output file can now be examined. The records are indexed on the Z-statistic, decreasing from highest to lowest, so that the most unlikely results (in statistical terms) are at the beginning of the file.

The analysis from this point will be similar to the standard Benford analysis, choosing targets for further investigation wherever there are large and unexplained deviations from the expected digit frequency.

Benford_Analysis
SET ECHO NONE
SET SAFETY OFF
error_test=T
DIALOG (DIALOG TITLE "Benford's Law - Parameter Selection" WIDTH 540 HEIGHT 400 ) (BUTTONSET TITLE "OK;Cancel" AT 432 12 DEFAULT 1 ) (TEXT TITLE "Benford's Law" AT 24 16 ) (TEXT TITLE "Select input file" AT 324 100 ) (ITEM TITLE "f" TO "infile" AT 240 132 WIDTH 273 HEIGHT 135 ) (TEXT TITLE "This test calculates the frequency of the first one to six digits of a specified numeric field for the file selected below and compares the actual and expected frequencies." AT 24 52 WIDTH 398 HEIGHT 34 ) (TEXT TITLE "Specify output file (no spaces)" AT 288 268 WIDTH 153 HEIGHT 21 ) (EDIT TO "useroutfile" AT 240 300 WIDTH 272 DEFAULT "Benford_Results" ) (TEXT TITLE "Number of leading digits" AT 24 100 ) (RADIOBUTTON TITLE "One;Two;Three;Four;Five;Six" TO "lead_digits" AT 24 132 WIDTH 80 HEIGHT 138 DEFAULT 1 ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 240 348 )
outfile=REPLACE('%useroutfile%',' ','_')
OPEN %infile%
IF filt_sel DO Filter_Selection
DO Benford_Dialog WHILE error_test
SET ECHO ON
BENFORD ON %num_field% TO %outfile% LEADING lead_digits OPEN
INDEX ON ZStat D TO %outfile%.inx
SET ECHO NONE
DELETE outfile OK
DELETE useroutfile OK
DELETE num_field OK
DELETE lead_digits OK
DELETE filt_sel OK
DELETE filt1 OK
DELETE infile OK
DELETE error_test OK
SET ECHO ON

Benford_Dialog
DIALOG (DIALOG TITLE "Benford's Law - Numeric Field Selection" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "OK;Cancel" AT 420 12 DEFAULT 1 ) (ITEM TITLE "N" TO "num_field" AT 252 120 WIDTH 240 HEIGHT 126 ) (TEXT TITLE "Select the numeric field for which the digit frequency will be calculated." AT 36 112 WIDTH 202 HEIGHT 52 )
IF LEN(num_field)=0 PAUSE 'You must select a Numeric Field for Benford Test'
IF LEN(num_field)>0 error_test=F
How the Benford Scripts Work

Standard Benford Analysis
The Benford command handles most of the processing in the Benford_Analysis script. It sets aside zero-value items, ignores irrelevant leading characters like dollar signs, computes the expected and actual digit frequencies, and generates a digit-frequencies file, complete with Z-statistics. The Benford_Dialog subscript assists in setting up the processing. The script also prompts the user for the required input and automatically creates an index on the descending value of the Z-statistic. By applying the index (same name as the output file), the user can view the first digit combinations that are the least likely to have occurred naturally. When all of this is done, specific targets for investigation can be selected. A thorough investigation will often start with a one-digit or two-digit Benford analysis, then a first-three-digits analysis, followed by drill-down into specific digit combinations to look at the actual transactions. The analysis should be run several times to give you a complete picture of the data.

The Benford command counts the number of records with each combination of first digits and compares this with the frequency expected under Benford’s Law. The results are written to the specified output file. The file presents the results in the order of the digits analyzed. For example, if the user selects first digit, the first record in the file contains the results for the first digit 1, the second record contains the results for the first digit 2, and so on. The index created on the Z-statistic value (ZStat) can then be applied. The index will rearrange the output file records, presenting the results in descending order, based on the statistical significance of the difference between the expected and actual frequencies.
Figure: Data Flows for the Benford Tests
Input File 1 -> Benford on numeric field -> Benford_Results
Input File 2 -> Benford on numeric field -> Temp1_outputfile -> Join on digits -> Temp2_outputfile -> Extract four fields -> Benford_Custom_Results
Benford_Custom_Analysis
SET ECHO NONE
SET SAFETY OFF
error_test=T
DIALOG (DIALOG TITLE "Benford Custom Analysis Parameter Selection" WIDTH 642 HEIGHT 396 ) (BUTTONSET TITLE "OK;Cancel" AT 540 24 DEFAULT 1 ) (TEXT TITLE "Benford's Law (Custom Values)" AT 24 16 WIDTH 223 ) (TEXT TITLE "Select input file" AT 252 124 ) (ITEM TITLE "f" TO "infile" AT 204 156 WIDTH 228 HEIGHT 135 ) (TEXT TITLE "This test calculates the frequency of the first one to six digits of a specified numeric field for the file selected below and compares this to your expected values." AT 24 52 WIDTH 481 HEIGHT 50 ) (TEXT TITLE "Specify output file (no spaces)" AT 264 292 WIDTH 87 HEIGHT 38 ) (EDIT TO "useroutfile" AT 360 300 WIDTH 267 DEFAULT "Benford_Custom_Results" ) (TEXT TITLE "Number of leading digits" AT 24 124 ) (RADIOBUTTON TITLE "One;Two;Three;Four;Five;Six" TO "lead_digits" AT 24 156 WIDTH 80 HEIGHT 138 DEFAULT 1 ) (CHECKBOX TITLE "Apply filter (optional)" TO "filt_sel" AT 228 348 ) (TEXT TITLE "Select file containing your expected values" AT 444 112 WIDTH 167 HEIGHT 35 ) (ITEM TITLE "f" TO "infile2" AT 444 156 WIDTH 182 HEIGHT 139 )
Benford Custom Analysis
A Benford analysis works by comparing actual to expected counts. When a standard Benford analysis is performed, the expected count can be determined using the Benford formula for each digit frequency. That is, a separate file of expected results is not required, just an equation. In the case of a Benford custom analysis, past results (the custom distribution) must be joined to another file containing the current results, and the Z-statistic computed for each resulting digit combination. The output file is then indexed to rank the Z-statistics in order of increasing likelihood.

[Diagram: Temp1_output file (Digits, Actual_Count) is joined with Benford_Results (Digits, Actual_Count) to create Temp2_output file (Digits, Actual_Count, Expected_Count, ZStat (computed)); records are then extracted to the output file.]
The diagram above shows how the output file is created using a Join command plus a computed field. The highlighted script code includes some additional steps to deal with cases where the expected count is zero. A Z-statistic cannot be computed where the expected count is zero, so the initial values are tested and a value of 0.001 is substituted wherever this occurs.
Deleting Temporary Variables, Fields, and Files
The following variables, fields, and files are cleaned up during the running of the standard and custom Benford scripts:

Benford Variables
benford_type    infile         outfile                 total2
error_test      infile2        Temp1_%outfile%.fil     useroutfile
filt1           lead_digits    Temp2_%outfile%.fil
filt_sel        num_field      total1
Review and Analysis Whether based on the standard Benford analysis or a user-supplied custom set of values, Benford analyses highlight the digits with the most significant difference (expected-actual) in the frequencies. For example, the first three digits may be 102 more often than expected. Suppose the difference between the actual and expected frequencies is 0.05 percent (0.4737 – 0.4237). However, there may be hundreds, even thousands, of records with the first three digits ‘102’. What should you do next? The first step is to open the original input file—the file containing all the detailed transactions that were analyzed to create your Benford results. The next step is to identify only the transactions with the first three digits ‘102’ for the numeric field that was analyzed. You can create and apply a filter to view the original input file as follows: Leading(numeric_field_name, no_digits) = ‘value’
96
outfile=REPLACE('%useroutfile%',' ','_') OPEN %infile% IF filt_sel DO Filter_Selection DO Benford_Custom_Dialog WHILE error_test SET ECHO ON BENFORD ON %num_field% TO Temp1_%outfile% LEADING lead_digits OPEN OPEN %infile2% SECONDARY JOIN PKEY Digits SKEY Digits FIELDS Digits Actual_Count WITH Actual_Count AS 'Initial_Expected_Count' TO Temp2_%outfile% OPEN Temp2_%outfile% DEFINE FIELD Expected_Count COMPUTED 'Takes care of any 0 values in the expected count from the standard Benford file' 0.001 IF Initial_Expected_Count=0 1.000*Initial_Expected_Count TOTAL Actual_Count total2=total1*1.00000000 TOTAL Expected_Count total3=total1*1.00000000 DEFINE FIELD ZStat COMPUTED 'calculates the ZStat' ZSTAT(Actual_Count/total2,Expected_Count/ total3,total2) EXTRACT Digits Actual_Count Expected_Count ZStat TO %outfile% OPEN INDEX ON ZStat D TO %outfile%.inx OPEN SET ECHO NONE CLOSE SECONDARY DELETE Temp1_%outfile%.fil OK DELETE FORMAT Temp1_%outfile% OK
DELETE Temp2_%outfile%.fil OK
DELETE FORMAT Temp2_%outfile% OK
DELETE outfile OK
DELETE useroutfile OK
DELETE num_field OK
DELETE filt1 OK
DELETE infile OK
DELETE infile2 OK
DELETE total1 OK
DELETE total2 OK
DELETE filt_sel OK
DELETE lead_digits OK
DELETE error_test OK
SET ECHO ON
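The zero guard and the Z-statistic step in the script above can be sketched outside ACL. The following Python sketch is an illustration only, not ACL code. It assumes the textbook one-proportion Z-statistic without a continuity correction, which may differ from what ACL's ZSTAT function computes; the function names and the counts are hypothetical.

```python
import math


def guarded_expected(count):
    """Mirror the script's Expected_Count field: a zero count becomes 0.001."""
    return 0.001 if count == 0 else float(count)


def z_statistic(actual_prop, expected_prop, n):
    """Textbook one-proportion Z-statistic (assumption: no continuity correction)."""
    std_err = math.sqrt(expected_prop * (1.0 - expected_prop) / n)
    return abs(actual_prop - expected_prop) / std_err


# Hypothetical counts for one digit combination (not taken from the book's data).
actual_count, actual_total = 60, 100
expected_count, expected_total = guarded_expected(50), 100.0

z = z_statistic(actual_count / actual_total,
                expected_count / expected_total,
                actual_total)
print("Z-statistic:", round(z, 4))
```

The guard matters because an expected proportion of exactly zero would make the standard error zero and the division undefined.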
Benford_Custom_Dialog
DIALOG (DIALOG TITLE "Benford Custom Analysis Numeric Field Selection" WIDTH 540 HEIGHT 300 ) (BUTTONSET TITLE "OK;Cancel" AT 420 12 DEFAULT 1 ) (ITEM TITLE "N" TO "num_field" AT 252 120 WIDTH 240 HEIGHT 126 ) (TEXT TITLE "Select the numeric field for which the digit frequency will be calculated." AT 36 112 WIDTH 202 HEIGHT 52 )
For the numeric field Amount and the first three digits '102', the filter is:
Leading(Amount, 3) = '102'
The above filter highlights all amounts with the first three digits '102', including 1,023.91, 1.02, and 102.34. If the numeric field is called Total and you are interested in the first two digits, '67', the filter is:
Leading(Total, 2) = '67'
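To see why 1,023.91, 1.02, and 102.34 all match the same filter, the leading-digit logic can be mimicked in a few lines. This is a Python illustration, not ACL code. The helper `leading` is hypothetical, and its handling of leading zeros and of values with fewer digits than requested is an assumption rather than documented ACL behavior.

```python
from decimal import Decimal


def leading(value, n_digits):
    """Return the first n_digits significant digits of value as a string.

    The sign and decimal point are ignored. Dropping leading zeros is an
    assumption made for this sketch.
    """
    text = format(Decimal(str(value)), "f")
    digits = "".join(ch for ch in text if ch.isdigit()).lstrip("0")
    return digits[:n_digits]


# Hypothetical amounts; the first three are the examples from the text.
amounts = [1023.91, 1.02, 102.34, 67.10, 210.00]
matches = [a for a in amounts if leading(a, 3) == "102"]
print(matches)
```

Because the comparison is on digits only, the magnitude of the amount plays no role in whether a record is selected.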
With the filter applied, additional analysis can now be performed to determine why the digit combination occurs more often than expected. It may be necessary to perform a variety of additional analyses, depending on the data being analyzed and the fraud risk. The aim is to examine the records with the largest Z-statistics, looking for additional signs that a fraud has occurred. For example, after applying a filter to select only the records with the desired digit combinations, such as '102':
Credit Card Data (credit card numbers and transaction amounts): Run the Classify command to find transaction totals for each card number. The results can point to possible use of the company credit card for personal purchases or cash advances.
Travel Expenses (travel expenses by employee): Use Classify to total expenses by employee number to highlight the employee with the most expenses with the specified first n digits. This can point to possible fraudulent expense claims, such as always claiming the maximum per diem rate.
Contract Amounts (contract officer and contract amount): Again, using Classify to total contract amounts for each contract officer would highlight who had the most contracts or the highest dollar totals. This can signify possible contract splitting to avoid financial limits. Additional analysis of contract amounts by vendor for this contract officer could identify possible kickbacks for directing contracts to certain vendors.
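The follow-up step in each of these examples is the same: count and total the filtered records by a key field. A minimal Python sketch of that idea follows. It is an illustration, not ACL code; the `classify` helper, the field names, and the records are all hypothetical.

```python
from collections import defaultdict


def classify(records, key_field, amount_field):
    """Count records and subtotal amount_field for each value of key_field."""
    summary = defaultdict(lambda: {"count": 0, "total": 0})
    for rec in records:
        bucket = summary[rec[key_field]]
        bucket["count"] += 1
        bucket["total"] += rec[amount_field]
    return dict(summary)


# Hypothetical contracts that already passed a leading-digits filter.
flagged = [
    {"officer": "A", "amount": 49100},
    {"officer": "B", "amount": 49900},
    {"officer": "A", "amount": 49500},
]

by_officer = classify(flagged, "officer", "amount")
# List the officers with the most flagged contracts first.
ranked = sorted(by_officer.items(), key=lambda kv: kv[1]["count"], reverse=True)
print(ranked)
```

Ranking by count or by total puts the key values that deserve a closer look at the top of the list.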
IF LEN(num_field)=0 PAUSE 'You must select a Numeric Field for Benford Test'
IF LEN(num_field)>0 error_test=F
Case Study: Signing Authority
The auditors were investigating possible fraud in the contracting section, where thousands of contracts were raised every month. Benford's Law was used to examine the first two digits of the contract amount. The results of their analysis revealed that the digits 49 showed up in the data more often than expected.
Classifying on the name of the contracting officer for all contracts with 49 as the first two digits determined that the contracting manager was raising most of these contracts. The Stratify command showed that the contracts were primarily for amounts in the range $49,000 to $49,999 to skirt contracting regulations. Contracts under $50,000 could be sole-sourced; contracts $50,000 or higher had to be submitted to the bidding process. Further analysis showed that he was raising contracts just under the financial limit and directing them to a company owned by his wife.
Further Reading
ACL for Windows User Guide (Version 9). Vancouver: ACL Services Ltd., 2007.
David Coderre. Computer-Aided Fraud Prevention and Detection. Hoboken, NJ: John Wiley & Sons, 2009.
Mark Nigrini. Digital Analysis Using Benford's Law. Vancouver: Global Audit Publications, 2000.
9
Developing ACL Scripts
Introduction
The purpose of this chapter is to introduce the world of scripts to experienced ACL users who already have access to the fraud detection and prevention scripts provided with this book but may be interested in developing their own scripts. This chapter helps by explaining some of the basic concepts, such as:
A generic approach to data analysis and the key steps.
ACL commands that can be used in scripts.
ACL system and user-defined variables.
Workspaces.
Readers are then taken through a series of discussions that lead to the development of scripts. Topics include:
The purpose of ACL scripts.
Methods for creating scripts.
Running ACL scripts: user initiated and through a DOS .bat file.
Interactive scripts involving dialog boxes.
Advanced programming techniques: Groups.
This chapter is not a course on ACL script development; for that, users should take appropriate ACL training. However, this chapter can help users to get started in developing their own scripts, modifying the fraud scripts, and understanding the logic and approach used in the fraud scripts. This chapter also assumes that the reader has a good knowledge of ACL commands, is able to create and use conditional expressions, and understands the ACL functions.
Examples and references to the fraud scripts are provided throughout this chapter, and exercises are used to reinforce the theory and concepts. Learning is consolidated by the exercises and solutions. (Data files and ACL project files are provided on the CD-ROM at the back of the book.) Users are encouraged to perform the exercises and review the suggested solutions. ACL is a powerful tool, and there are usually at least three ways to accomplish any particular task. If your solution to an exercise is not the same as the suggested solution, that does not mean it is wrong. However, users are encouraged to review the suggested solution to ensure that they understand alternative approaches. Users are also encouraged to review other people’s scripts. Some sources of ACL scripts are:
ACL Forum, www.acl.com/supportcenter—only accessible by supported ACL users
ACL Quicksteps, www.acl.com—only accessible by premium supported ACL users
Texas ACL User Group, www.texasacl.com—free to all
It is important not only to practice writing scripts, but also to review the approaches and techniques used by other script writers. Examining the logic and approaches used by other ACL script writers is an excellent learning activity.
Data Analysis: Generic Approach
The power of ACL can only be exploited if the user has a clear set of objectives for the analysis to be performed. This is even more important for ACL script development. The following outlines the steps that should be followed for any analysis. This approach is even more important when the application of data analysis is aimed at fraud detection and prevention.
First ensure that the goals and objectives of the audit or fraud investigation are well understood and articulated. Then perform the following steps:
1. Meet with the client and the programmer for the client applications. Identify all available databases, namely those internal to the client organization (i.e., main application systems) and external to the client organization (including benchmarking and standards).
2. List fields in all available databases and the standard reports that are available.
3. Based upon the audit objectives, identify the data sources, the key fields, or data elements required by the audit team.
4. Request the required data and, in doing so, ensure that unnecessary fields are excluded from the request. Prepare a formal request for the required data, specifying:
Data source(s) and key fields.
Timing of the data (for example: as of September 30, 2009).
Data transfer format (floppy, LAN, Internet, CD-ROM, tape, etc.).
Data format (DBF, Delimited, flat file, CSV, ASCII print file, etc.).
5. Request:
Control totals (number of records, key numeric field totals).
Record layout (field name, start position, length, type, description).
A printout of the first 100 records.
Authorization, i.e., obtain client agreement on data (source, timing, integrity, etc.).
6. Create or build the ACL table layout (automatically created by ACL for DBF, ODBC, and delimited files).
7. Verify the data integrity:
Use the Verify command to check data integrity.
Check ACL totals against control totals.
Check the timing of the data to ensure the proper file has been sent.
Compare the ACL view with the printout of the first 100 records.
8. Understand the data: use ACL commands COUNT, STATISTICS, STRATIFY, CLASSIFY, and so on to develop an overview of the data.
9. For each objective:
Formulate hypotheses about field and record relationships.
Use ACL to perform analytical tests for each hypothesis.
Run tests for possible problem records (the output is the so-called "hit list").
Evaluate initial results and refine the tests.
Re-run and refine the tests to produce shorter, more meaningful results (repeat steps 6–8 as needed).
Evaluate the results using record analysis, interview, or other techniques to examine every item on the refined results.
Form an audit opinion on every item in your results. For each item, the auditor should be able to say that the record is okay (with a valid explanation) or that it is a probable improper transaction and more review is needed.
10. Perform quality assurance and documentation—i.e., exceptions to source; confirm analysis and nature of exceptions; and identify reasons for the exceptions.
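The control-total check in the data verification step is simple enough to sketch. The Python below is an illustration of the idea, not ACL code. The function name, field name, tolerance, and sample figures are hypothetical; in practice the control totals come from the client's formal data delivery.

```python
def check_control_totals(records, amount_field, expected_count, expected_total,
                         tolerance=0.005):
    """Compare the received data against client-supplied control totals.

    Returns a list of human-readable discrepancies (empty if all checks pass).
    """
    problems = []
    actual_count = len(records)
    actual_total = sum(rec[amount_field] for rec in records)
    if actual_count != expected_count:
        problems.append(f"record count {actual_count} <> control {expected_count}")
    if abs(actual_total - expected_total) > tolerance:
        problems.append(f"total {actual_total:.2f} <> control {expected_total:.2f}")
    return problems


# Hypothetical received file and control totals.
received = [{"amount": 100.25}, {"amount": 200.50}, {"amount": 49.25}]
print(check_control_totals(received, "amount", 3, 350.00))   # no discrepancies
print(check_control_totals(received, "amount", 4, 400.00))   # two discrepancies
```

Recording the outcome of this check also supports the documentation step, since it shows the analysis started from a complete and correct file.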
Following a structured approach not only helps to ensure accuracy and consistency, it can also be beneficial if the analysis leads to the identification of fraud and the analyst is called to testify in court.
ACL Commands
Before discussing scripts, it is important to review and expand upon ACL's functionality. The user may be familiar with many of the following commands and techniques; however, they are treated somewhat differently in scripts. The ACL software provides the user with the ability to set and control preferences, parameters, system settings, etc. Users wanting to develop scripts must consider the impact of these user-defined settings: how they will be managed and controlled to ensure consistency of processing, particularly when the scripts will be run by other users. The following describes some of the more useful settings and commands that can be included in scripts.
DISPLAY Command
DISPLAY is used to show information about the active ACL environment, such as variables, table history, and relationships. DISPLAY is often a useful command when trying to troubleshoot a script.
DISPLAY OPEN: shows the currently open tables in the ACL project.
DISPLAY PRIMARY: shows the primary file currently open.
DISPLAY SECONDARY: shows the secondary file currently open.
DISPLAY HISTORY: shows the table history for the currently open table.
DISPLAY VARIABLES: lists the value of all variables.
DISPLAY VARIABLE name: shows the value of a specific variable.
DISPLAY VERSION: shows the ACL version currently running.
DISPLAY DATE: system date and time.
DISPLAY TIME: system date and time.
DISPLAY FREE: shows free space in RAM.
DISPLAY SPACE: shows free space in RAM.
SET Command
SET can be used to change certain ACL parameters/options. Typical SET commands include:
SET SAFETY ON: to restore prompting about overwriting an existing field/file.
SET SAFETY OFF: to remove the prompt/warning that the result will write over an existing field/file.
SET ECHO ON: to record script processing and results in the LOG.
SET ECHO NONE: to suppress recording of script processing and results in the LOG.
SET LOG new logname: to create a new LOG file.
SET LOG: to return to the default LOG.
SET DATE "date format": to change the default date display format.
SET EXACT ON/OFF: to change the default Exact value. With EXACT ON, 'ABC' <> 'AB'; with EXACT OFF, ACL compares the values using the length of the shorter string, so 'ABC' = 'AB'.
SET FILTER filter name: to set the filter to an existing filter.
SET FILTER: to turn the filter off (no filter).
SET FILTER TO Condition(s): to define a filter (e.g., SET FILTER Amount >0).
SET SUPPRESSXML ON: turns off XML display of log and screen results; log and results are displayed in text format.
SET SUPPRESSXML OFF: returns to XML display of the log file and results to screen.
DELETE Command
The DELETE command can be used to delete items no longer required, including fields, files, formats (table layouts), and variables. Unless the command is followed by 'OK', ACL will prompt the user to ensure that the user really wants to delete the specified item. For example:
DELETE Inventory.FIL causes ACL to prompt the user with "Are you sure you want to delete Inventory?" The user must click Yes to delete the table, or No to retain the table.
DELETE Inventory.FIL OK deletes the table without prompting the user.
Examples of DELETE commands:
DELETE logname.lix OK: to delete the log index file.
DELETE logname.log OK: to delete the log file.
DELETE FORMAT tablename OK: to delete the table layout.
DELETE tablename.fil OK: to delete the physical data file.
DELETE FIELD Field Name OK: to delete an existing field name.
DELETE Variable Name OK: to delete a variable.
DELETE ALL OK: to delete all ACL and user-defined variables.
OPEN Command
OPEN Table Name: to open the specified table.
OPEN Table Name FORMAT Table Layout: to open the specified table using the specified table layout.
OPEN %v_tablename%: to open a table whose name is stored in a variable called v_tablename (see "Macro Substitution" on page 120).
Commands for Defining Relationships
INDEX ON Field Name TO "Index Name": to create an index to use to relate files (also used to display records in a specified order).
DEFINE RELATION Parent Field Name WITH Child Table Name Index Name: to create a relation with another table (child file).
DEFINE RELATION Table Name.Field Name WITH Table Name Index Name: to create an indirect relationship (grandchild file).
Basic ACL Commands
EXTRACT RECORD TO Table Name IF Condition(s): to extract the current record layout to another table.
EXTRACT FIELDS Field Name1 Field Name2 TO Table Name IF Condition(s): to extract specified fields to another table.
SUMMARIZE ON Char Field SUBTOTAL Num Field OTHER Field Name TO Table Name IF Condition(s) PRESORT: to summarize a table on specified field(s), subtotal specified field(s), and include other field(s); output to data file, printer, or screen.
CLASSIFY ON Char Field SUBTOTAL Num Field TO Table Name IF Condition(s): to classify a table on the specified field and subtotal specified field(s), and include other field(s); output to graph, printer, screen, or file.
CROSSTAB ON Char Field COLUMNS Char Field SUBTOTAL Num Field TO Table Name IF Condition(s): to create a crosstab where the first field name is summarized over the second (e.g., Total by Store by Month); output to data file, printer, or screen.
DUPLICATES ON Field Name OTHER Field Name TO Table Name IF Condition(s) PRESORT: to search for duplicates on specified field name(s) and include other field(s); output to data file, printer, or screen.
IF Command
The IF command allows the user to specify a condition that must be met in order for the command following it to execute (IF test or condition COMMAND). The condition is evaluated once, when ACL first encounters the IF command. If the condition is TRUE, the command following it is processed. If the condition is FALSE, ACL ignores the rest of the command.
Example:
IF Run_Flag = 'T' TOTAL Amount
If Run_Flag = 'T', the Amount from ALL records in the file will be totaled. If Run_Flag <> 'T', the TOTAL command will not be executed (no records will be totaled). This differs from the IF condition at the command level (e.g., TOTAL Amount IF Run_Flag = 'T'), which tests each record of the file and executes the TOTAL command for only the records that meet the test. So, TOTAL Amount IF Run_Flag = 'T' will test each record to determine the value of Run_Flag and will total the Amount for records with Run_Flag = 'T'.
The IF command is most useful for testing a variable to determine whether a script will be processed. This allows the user to build a script that controls the flow to various subscripts. In a script, a series of IF command tests can be used to assess whether or not to execute specific commands.
Example:
IF v_checkbox DO Script Name (Run a script if a checkbox is checked)
IF v_radiobutton = 1 DO Script Name (Run a script if the first radio button is checked)
IF Write1 > 20 TOTAL Amount (Total the Amount if the variable Write1 is greater than 20)
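The difference between the IF command and the IF condition on a command is easy to confuse, so the two behaviors are contrasted below in Python. This is an illustration only, not ACL code; the function names and the sample records are hypothetical.

```python
def total_with_if_command(records, run_flag):
    """Like 'IF Run_Flag = "T" TOTAL Amount': the flag is tested once.

    Either every record is totaled or the command is skipped entirely.
    """
    if run_flag != "T":
        return None  # the TOTAL command is never executed
    return sum(rec["amount"] for rec in records)


def total_with_if_condition(records):
    """Like 'TOTAL Amount IF Run_Flag = "T"': the flag is tested per record."""
    return sum(rec["amount"] for rec in records if rec["run_flag"] == "T")


# Hypothetical records: each carries its own run_flag field.
records = [
    {"amount": 100, "run_flag": "T"},
    {"amount": 250, "run_flag": "F"},
    {"amount": 50, "run_flag": "T"},
]

print(total_with_if_command(records, "T"))  # all records are totaled
print(total_with_if_command(records, "F"))  # nothing is totaled
print(total_with_if_condition(records))     # only matching records are totaled
```

In the first function the flag behaves like a variable that gates the whole command; in the second it behaves like a field that filters individual records.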
Variables
ACL uses variables to store information (results of computations and ACL commands, and user input) for future use. There are two types of variables: ACL system variables and user-defined variables. ACL system variables are named memory spaces that store data: character, numeric, date, or logical values. System variables are created by ACL commands. For example, TOTAL1 contains the sum of the last field that was totaled.
System variables have the format Variable Name + Number and will end in a '1' (e.g., COUNT1), unless they were generated by an ACL command that was executed inside of a Group.* When defined by a command in a Group, the '1' will be replaced by the number of the line which executed the command. For example, outside a Group the commands
TOTAL Amount
COUNT
would generate variables TOTAL1 and COUNT1. However, when executed inside a group:
GROUP - line 1
TOTAL Amount - line 2
COUNT - line 3
END - line 4
These generate variables TOTAL2 and COUNT3 because the TOTAL command was executed by the second line of the Group-End structure and the COUNT command was executed by the third line.
Note: The line numbers (e.g., "- line 1") are not included in the ACL script and must be counted manually by the script developer.
* Group-End is an ACL structure used in scripts that, among other things, allows multiple ACL commands to be processed against the current record before ACL proceeds to the next record. Group-End is only used in scripts (see the section on Groups, pages 125–132, for more information).
The following defines some of the ACL system variables:
Variable | Description
ABSn | Absolute value of a numeric field.
AVERAGEn | The mean value of a numeric or date field.
COUNTn | The number of records counted.
GAPDUPn | The number of gaps detected by the GAPS command, or the number of duplicates detected by the DUPLICATES command.
HIGHn | The xth highest value of a numeric or date field (STATISTICS command), where 'x' is the number of high/low values set by the user.
LOWn | The xth lowest (smallest) value of a numeric or date field (STATISTICS command), where 'x' is the number of high/low values set by the user.
MAXn | The largest value of a numeric or date field (STATISTICS command).
MINn | The smallest value of a numeric or date field (STATISTICS command).
MLEn | Total of the most likely errors in a sample (EVALUATE command).
OUTPUTFOLDER | Path (e.g., C:\ACL\Data Files) of the current folder.
RANGEn | Difference between the largest and smallest value of a numeric or date field.
SAMPINTn | Sample interval (SAMPLE command).
SAMPSIZEn | Sample size (SIZE command).
STDDEVn | Standard deviation (STATISTICS command).
TOTALn | Total value of a numeric or date field.
UELn | Upper error limit in sample (SAMPLE command).
VER | Version of ACL.
WRITEn | Number of records written by a command that produces a new table.
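Because the line numbers inside a Group must be counted by hand, it can help to see the naming rule for Group-generated system variables spelled out mechanically. The Python sketch below is an illustration, not ACL code. The helper name is hypothetical, and it only recognizes the TOTAL and COUNT commands used in the Group example earlier in this section.

```python
def group_variable_names(group_lines):
    """Return the system variable names produced by commands inside a Group.

    Lines are numbered from 1, starting with the GROUP line itself, and the
    variable suffix is the number of the line that executed the command.
    """
    names = []
    for line_no, command in enumerate(group_lines, start=1):
        verb = command.split()[0].upper()
        if verb in ("TOTAL", "COUNT"):  # only the commands from the example
            names.append(f"{verb}{line_no}")
    return names


print(group_variable_names(["GROUP", "TOTAL Amount", "COUNT", "END"]))
```

Inserting or removing a line above a command inside the Group shifts its variable name, which is a common source of errors when a script is edited later.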
User-Defined Variables
User-defined variables are named variables created by the user. User-defined variables can be created in three main ways:
EDIT > Variables - NEW
On the Command line (e.g., v_test=123)
In a script (e.g., v_date1 = Date; v_date2='20100301')
Unless preceded by an underscore (e.g., _v_ctr=123), user-defined variables are temporary and will be deleted at the end of the ACL session. Permanent user-defined variables (those preceded by an underscore) become part of the ACL project and are not deleted at the end of the ACL session.
Using System and User-Defined Variables
ACL system and user-defined variables can be used instead of hard coding the required values. For example, to identify the 10 records with the largest amounts:
STATISTICS on Amount Number 10 (manually noting the tenth highest amount).
Extract All if Amount >=23,241.59 (entering the tenth highest amount on the Extract command).
The same results could be obtained by using the ACL system variable HIGH1:
STATISTICS on AMOUNT Number 10
EXTRACT All if Amount >=HIGH1
HIGH1 will contain the value of the tenth highest Amount (from the STATISTICS command, with the parameter "Number 10").
v_ctr = 0: to initialize a variable.
v_ctr = v_ctr+1: to increment a variable.
v_amount = COUNT1: to store an ACL system variable in a user-defined variable.
DEFINE Field/Expression
ACL is a read-only software package: the user can't change the physical data. However, ACL does allow the user to create expressions (computed fields) either by editing the table layout (click Edit on the menu bar, and then click Table Layout) or in a script (DEFINE FIELD).
The user can create a script to define a new expression or a new physical field by issuing the command DEFINE FIELD.
Define a new data field:
DEFINE FIELD Field Name Field Type Start Length Format
Examples:
DEFINE FIELD Inv_Date DATE 6 10 MM/DD/YYYY
DEFINE FIELD Amount NUMERIC 16 10 2
DEFINE FIELD Name ASCII 32 25 AS 'Last Name'
Unconditional Expression:
DEFINE FIELD Field Name COMPUTED
[Skip line or enter a description]
Value
Example:
DEFINE FIELD Last_Name COMPUTED
"last name of employee"
SPLIT(Name,",",2)
Conditional Expression:
DEFINE FIELD Field Name COMPUTED
[Skip line or enter a description]
Value 1 IF Condition 1
Value 2 IF Condition 2
Default Value
Example:
DEFINE FIELD New_Price COMPUTED
"Calculation of new price"
Price * 1.05 IF Product_Class = '01'
Price * 1.04 IF Product_Class = '02'
Price * 1.02
Workspaces
A workspace is one or more field definitions (expressions or physical data fields) that are saved so that they can be reused by another table within the same ACL project. Workspaces allow the user to maximize the utility of user-defined expressions. For example, if you are working with inventory tables and want to create an expression to compute the extended price (Ext_Price = Unit_Price * Quantity), the expression can be saved in a workspace and then the field (Ext_Price) would be available to any inventory file defined in the same ACL project. Two things must occur for a workspace expression to be shared by other tables:
1. The fields used by the expression must exist in the other tables. In the inventory example, the other inventory files must have fields called Unit_Price and Quantity.
2. The user must ACTIVATE the workspace. This is done by highlighting (right-clicking) the workspace and selecting the option Activate.
For example, while in the Jan_Inventory table, an expression called Ext_Price is created (Unit_Price * Quantity). A workspace called Inv_Expressions is also created, and the Ext_Price field is added to this workspace. After opening Feb_Inventory, the user highlights the workspace Inv_Expressions and selects Activate. The field Ext_Price is added to the table layout for Feb_Inventory and can be used in any ACL command.
Creating a Workspace:
1. Open table.
2. Right-click the ACL Project name folder and select New and option Workspace.
3. Select the fields to be added to the workspace.
4. Click OK to save the workspace.
5. Close and rename (currently called New Workspace).
Activating a Workspace:
1. Open table.
2. Right-click the desired workspace and select Activate.
Activating a workspace inside a script is also possible by issuing the command: ACTIVATE workspace name (e.g., ACTIVATE Inv_Expressions).
Notes:
1. The workspace expression is only temporarily added to the table layout. If the table is closed, the workspace expression will not be included in the table layout, and the next time the table is opened, the user will have to activate the workspace again to use the expression.
2. Workspace fields can be permanently added to the table layout by making any change to the layout and saving the modified layout. For example, adding a field called TEMP with a default value of 1, saving the expression, and closing the layout will cause any activated workspace fields to become permanent, and they can then be used without having to activate the workspace. The table layout can be re-edited and the field TEMP deleted.
Sharing Workspaces
Not only can workspaces be shared between tables, they can also be shared between ACL projects. This can be done in two ways:
1. Open the other project and right-click the ACL Project name folder and select Copy from another project and option Workspace. Then select the project containing the workspace and select the desired workspace.
2. Right-click on the workspace in the current project and select Export a Server file. Give the workspace a name (e.g., Inv_expression.WSP) and save it. Then open the other project; right-click the ACL Project name folder and select Import from Server file and option Workspace, and select the desired workspace file.
Similar methods can be used to share scripts between projects. Scripts are saved as .bat files.
Exercise 1: Workspaces
Using Script Exercises.ACL and data file Inventory, create an expression that is the extended price (Ext_Price = Uncst * Qtyoh). Save this expression in a workspace and rename the workspace Inv_Ext_Price. Open Jan_Inventory and activate the workspace Inv_Ext_Price. Determine the Value of records where the Value <> Ext_Price.
Solution: The total of Value is 14,426.95; 8 of 152 matched the Filter: Ext Value <> Value.
Scripts
What Is a Script?
A script is a file that contains a series of ACL commands, similar to a macro in Microsoft Excel. A script enables the user to:
1. Save and repeat commands faster and automatically.
2. Ensure consistency across audit teams or locations.
3. Share best practices with others.
4. Perform complex file processing.
5. Create interactive jobs that can be run by nonexperts.
Creating Scripts
There are three main ways to create a new script:
1. Create a script from the table history by creating a file and then using Tools on the menu bar and option Create script from table history.
2. Create a script using the recorder: Tools on the menu bar and option Set Script Recorder on. Then run your commands (e.g., OPEN AR, SORT, SUMM) and set the recorder off.
3. Highlight commands in the LOG; right-click and select Save Selected Items and option Script.
1. Create a Script from the Table History
Exercise 2a: Inventory Summary by Location
Using ACL project Script Exercises.ACL and data file Inventory, create file Inv_Loc_Sum by performing the following steps:
1. Open Inventory.
2. Filter: Value > 100.
3. SORT on Product Class to Inv_by_Prodclass and OPEN.
4. SUMMARIZE on Location to Inv_Loc_Sum, totaling the Value, Market Value, and Quantity on Hand.
Now check the Table History (Tools on the menu bar and option Table History). The table history shows all the commands (FILTER, SORT and SUMMARIZE).
11/19/2008 Inventory SET FILTER Value>100
11/19/2008 Inventory SORT ON ProdCls TO "Inv_by_Prodclass"
Input : Records:152 Control Total:0
Output: Records:143 Control Total:0
11/19/2008 Inv_by_Prodclass SUMMARIZE ON Location SUBTOTAL Value MktVal QtyOH TO "Inv_Loc_Sum.FIL" OPEN PRESORT
Input : Records:143 Control Total:0
Output: Records:7 Control Total:0
Note: Not all commands are captured in the log; for example, the creation of an expression is not captured in the log file.
Exercise 2b: Inventory Summary by Location Use the table history to create a script to perform these actions (Tools on the menu bar and option Create script from table history) and save as Inv Loc Sum. Suggested Answer: Script Exercises Answers.ACL—script: Inv Loc Sum
ACL asks for the name of the new script.
Note: Please use the suggested script names for all exercises, as subsequent exercises will require these scripts.
The script contains the commands:
OPEN Inventory
SET FILTER Value>100
SORT ON ProdCls TO "Inv_by_Prodclass" OPEN
OPEN Inv_by_prodclass
SUMMARIZE ON Location SUBTOTAL Value MktVal QtyOH TO "Inv_Loc_Sum.FIL" OPEN PRESORT
OPEN Inv_Loc_Sum
DEFINE REPORT Default_View
All of the commands necessary to create the script Inv_Loc_Sum, starting with opening Inventory, are captured from the table history and copied to the script. This is true even if many days passed and other ACL work was performed between the creation of Inv_by_Prodclass and the summarization to Inv_Loc_Sum. In addition, only the commands directly related to the creation of Inv_Loc_Sum are contained in the table history.
2. Set Script Recorder On
A script can be created by turning the recorder on (Tools on the menu bar and option Set Script Recorder on), and then executing all of the desired commands.
After all the commands have been entered, turn the script recorder off. ACL will prompt the user for the name of the script where the commands will be saved.
Exercise 3: Inventory Analysis
Using Script Exercises.ACL, set the script recorder ON, and execute the following commands:
OPEN Inventory
TOTAL FIELDS QTYOH VALUE
EXTRACT RECORD IF Value >100 TO "Inv_Value_GT_100" OPEN
OPEN "Inv_Value_GT_100"
Set the script recorder OFF and save the script as Inv_Analysis.
Suggested Answer: Script Exercises Answers.ACL, script: Inv_Analysis
Note that commands not directly related to the creation of the final table (Inv_Value_GT_100), such as TOTAL FIELDS Value, are captured in the script, whereas the table history for "Inv_Value_GT_100" would only contain the EXTRACT command.
11/19/2008 Inventory SET FILTER value>100
11/19/2008 Inventory EXTRACT RECORD TO "Inv_Value_GT_100" OPEN
Input : Records: 152 Control Total: 0
Output: Records: 143 Control Total: 0
A Control Total is included in the table history. If a field is defined with the option "Control Total" in the table layout, its value will be calculated and written to the table history. For example, the control total for the inventory file might be $2,300,312.14 initially, and the file Inv_Value_GT_100 might have a value of $2,298,871.10; the table history control totals would contain these amounts.
3. Copy and Paste Commands from Log
A third way to create scripts is to copy the commands from the Log file. Open the Log and left-click the commands to be copied to the script.
Right-click and select Save Selected Items and option Script to create the new script.
Name the script and save it.
Editing Scripts
Scripts can be edited. Right-click the name of the script and select Edit.
Make the required changes:
Add new commands (cut and paste from LOG)
Modify existing commands
Delete commands
Add comments
Commenting Scripts
COMMENT
Comment lines - Purpose; Author; Date
Details on how the script works, etc.
END
The block comment that closes with END should not contain any blank lines, as ACL will interpret a blank line as the end of the comments.
Running Scripts
To run a script, right-click the name of the script and select Run, or select Tools on the menu bar and option Run Script.
Exercise 4a: Top 10 Pay
Using Script Exercises.ACL and data file Pay_Trans, perform the following steps:
1. Sort by Empid to Empid_Sort and open the Empid_Sort file.
2. Summarize on Empid to Empid_Sum to determine the total amount paid to each Empid, listing the Name, Bank, Trans_No, and Bank_Acct information for each Empid.
3. Extract the top 10 records (based on Total Amount Paid) to a file called Pay_Top_10.
Now build a script called Top_10_Pay that will perform all of the above steps (including opening Pay_Trans) without any prompting from the user.
Hint: There are three basic ways to extract the top ten paid employees: SORT on amount descending and extract the first 10 records; INDEX on amount descending and extract the first 10 records; or STATISTICS on Amount with the number of high and low values set to 10. The first two methods will give you 10 records; the STATISTICS method will give more if there is more than one employee with the tenth highest amount.
Suggested Answer: Script Exercises Answers.ACL, script: Top_10_Pay.
RUN ACL Script from DOS
The ability to run an ACL script from a DOS .bat file will allow the user to use the Microsoft Scheduler to run scripts at any time (even when the user is not there).
Create a DOS batch (.bat) file that will:
1. Launch ACL.
2. Load the Workbook project.
3. Set the variable Materiality to 10,000.
4. Run the Find_duplicates script.
5. Create a directory called ProjectName.
6. Copy the file created by Find_duplicates to the new directory.
So a sample DOS batch would be:
"C:\Program Files\ACL Software\ACL Version 9\ACLWin.exe" "C:\ACL Data\Workbook\Workbook.acl" /vMateriality=10000 /bFind_duplicates
MD ProjectName
COPY my_dups.fil C:\ProjectName\my_dups.fil
Note: The Find_duplicates ACL script would have the necessary ACL code to run the DUPLICATES command and should contain the line "QUIT" (no quotes) to end the processing of ACL and return processing to the DOS batch file.
Saving a Script to a .BAT file Scripts can be shared with other ACL users. They can open their ACL project and use the Copy From Another Project option. Alternatively, scripts can be saved to a .bat file and then imported into the user’s project.
The .bat file may be imported into any ACL Project. Open an ACL Project file and Right-click the Script folder and select Import from Server and option Script:
The user must provide a name for the script:
Then select the desired script from the list of .bat files:
The .bat file will be copied into the script folder of the ACL Project.
The .bat file can also be used without importing it into an ACL project. To do so, simply create a script that executes the .bat file, for example, DO Inv Analysis.bat.
Interactive Scripts Scripts can be designed to run with or without user input. Scripts can have all the necessary input parameters—table name, field names, output file name, etc.—hardcoded in the script. However, advanced techniques allow the script developer to design interactive scripts; that is, scripts that accept input from the user. This allows the user to control not only which scripts will be executed, but which tables will be used, user provided threshold values, and much more.
User input can be requested in two ways: the ACCEPT command, and DIALOG boxes.
ACCEPT Command
The basic format of the ACCEPT command is:
ACCEPT 'prompt string' to variable_name
For example, to prompt user for cut-off dollar amount:
ACCEPT "Enter maximum dollar amount" to v_max_Amt
Note: User input is stored as a character field, even if the entry was the number 5. The script developer must use the Value() function to convert to numeric value if this is required.
Rather than requiring the user to type in a response, it is also possible to create drop-down lists of specific items. For example, the following command prompts the user to select from a list of all the tables in the currently opened ACL project:
ACCEPT "Select table" fields "xf" to v_infile
The following provides the most commonly used list items. (The ACL Help file provides a more detailed list of all the options available with the ACCEPT command.)
List Item    Results
C            Drop-down list of all character fields in open table.
N            Drop-down list of all numeric fields in open table.
D            Drop-down list of all date fields in open table.
xf           Drop-down list of all table layouts in currently open project.
Exercise 4b: Top 10 Pay Prompt Use the Accept command and modify the script Top 10 Pay to prompt the user to select the Pay transaction file from a drop-down list of Project tables. Save script as Top 10 Pay Prompt. Suggested Answer: Script Exercises Answers.ACL—script: Top 10 Pay Prompt.
Dialog Boxes
Dialog boxes offer the script developer (and user) a more aesthetically pleasing method of prompting for user input. A dialog box can also prompt for more than one user input at a time. To build a dialog box, edit a script and right-click the Build New Dialog button:
This will create a dialog box where you can build your user prompts. This dialog box can be resized to meet the required number of prompts:
Dialog buttons (on left):
TEXT: to enter text on dialog box, such as instructions for user or titles.
EDIT Box: to allow user to enter text input.
CHECK Box: user can check the box or leave unchecked.
RADIO Button: provides user with a list of options (only one can be selected).
DROP-DOWN List: provides a user-created drop-down list (user can select one).
PROJECT ITEM List: provides a list of ACL Project items (tables, fields, etc).
DELETE: to delete an item from the dialog box.
SNAP to GRID: to place items in the dialog box along grid lines.
Example of a dialog box – during the construction phase:
Running the script, the completed dialog might look like this:
Adding Selections to Drop-Down and Project Item Lists To enter labels and items on a drop-down list, type in a label or item and click Add. To select items for a project item list, select a category from the provided list and click Add. A proper variable name (with no spaces) must also be provided.
Drop-down List
Project Item List
The results of the dialog are stored in user-defined variables and can be used in later processing. In the above case, the DISPLAY VARIABLES command would show that the following variables were created based on the user selections/input:
Variable        Type   Value          Comment
v_audit_type    C      Operational    Selected from user-defined pull-down list.
AR              L      False          Accounts Receivable box was not checked by user.
AP              L      False          Accounts Payable box was not checked by user.
Inv             L      True           Inventory box was checked by user.
Pay             L      False          Payroll box was not checked by user.
v_dept          N      2              User selected 2nd radio button (personnel).
v_infile        C      Inventory      User selected Inventory from list of table layouts.
v_audit_name    C      Dave Coderre   User entered "Dave Coderre" as Name.
Macro Substitution
After requesting user input, the script developer must know how to inform ACL of the user's input. This is done via macro substitution. In the previous example, ACL could be directed to open the selected table, Inventory.fil, by using macro substitution and issuing the command OPEN %v_infile%; the "%variable name%" tells ACL to look at the contents of the variable, in this case, v_infile.
For example, the command:
ACCEPT "Enter Table Name" to v_infile
or
ACCEPT "Select Table" FIELDS "xf" to v_infile
followed by
OPEN %v_infile%
if the user enters (or selects) "Inventory," will result in ACL opening the table Inventory.
Editing Dialog Boxes
A dialog box can be edited by first editing the script, then placing the cursor on the dialog box command line with a single left-click on the line containing the DIALOG command, and right-clicking the Edit Command button (top left):
This will give the user access to the dialog builder screen. New items can be added, and existing items can be deleted or modified. Closing the dialog box (by clicking on the red X in the top right corner) will present the options:
OK: to save changes.
CANCEL: to continue editing dialog box.
DISCARD: to close dialog box without saving changes.
An interactive script gives the user control over which ACL commands are executed. The script evaluates the user input (e.g., Personnel) to determine which commands will be performed. Control is achieved through the use of the IF command.
The IF command only executes an ACL command if the condition is true; it does not test the condition for every record. As previously described, the IF command can be used to determine whether commands (or subscripts) are to be executed or not. In the above case, ACL could use the results of the radio button selection to determine additional subscripts to be executed as follows:
IF v_dept = 1 DO Fin_Tests
IF v_dept = 2 DO Pers_Tests
IF v_dept = 3 DO Defense_Tests
IF v_dept = 4 DO Justice_Tests
In this way, the user is prompted for "department to be audited" and ACL uses the input to determine which tests (additional scripts) will be executed.
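The substitution and dispatch ideas described above can be modeled outside ACL. The following Python sketch is only an illustration of the logic, not ACL code: the function name substitute, the dictionary of variables, and the mapping of radio-button numbers to subscript names are all assumptions made for the example.

```python
import re

def substitute(command: str, variables: dict) -> str:
    """Replace each %name% token with the text stored in variables[name]."""
    return re.sub(r"%(\w+)%", lambda m: str(variables[m.group(1)]), command)

# Hypothetical values, as if collected by ACCEPT or a dialog box.
variables = {"v_infile": "Inventory", "v_dept": 2}

# Hypothetical mapping of radio-button positions to subscript names.
tests_by_dept = {
    1: "Fin_Tests",
    2: "Pers_Tests",
    3: "Defense_Tests",
    4: "Justice_Tests",
}

# Macro substitution: the variable's contents replace the %...% token.
open_command = substitute("OPEN %v_infile%", variables)

# IF-style dispatch: only the subscript matching the selection is chosen.
subscript = tests_by_dept[variables["v_dept"]]

print(open_command)
print("DO", subscript)
```

The point of the sketch is that the command text is completed from the variable's contents before it runs, while the radio-button value selects exactly one subscript.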
Exercise 5: Test menu
Create a new script and build the dialog box shown previously. Rename the variables from the defaults to the suggested names (e.g., rename RADIO1 to v_dept; CHECKBOX1 to AR, CHECKBOX2 to AP, etc.). Save script as Test menu and then run the script. Complete the user selections and run command DISPLAY VARIABLES. Note the variable values and compare with the choices you entered.
Suggested Answer: Script Exercises Answers.ACL, script: Test Menu.
The script builder can simply prompt for a test to be run or can prompt the user for every parameter: input file, fields, output file, etc. This ensures that the proper file and fields are used for the selected analysis; it also ensures maximum flexibility as the script will work with any file defined to ACL.
Exercise 6: Interactive Dialog Box for Top 10 Pay
Modify script Top 10 Pay Prompt by replacing the Accept command with a dialog box that prompts the user to select the Pay transaction table and to provide the Output file name. Save as Top 10 Pay Interactive.
Hint: The variables cannot contain spaces, so you should have a text box with instructions that the user should provide an output file name that does not contain spaces.
Suggested Answer: Script Exercises Answers.ACL, script: Top 10 Pay Interactive.
Dialog boxes can prompt the user for any required input. A script may also contain a number of dialog boxes. For example, the script builder may wish to prompt the user for a table to be opened; open the selected table; and then prompt for the fields to be selected (from the just opened table).
Exercise 7: Generic Classify
The CLASSIFY command has the following parameters:
CLASSIFY on character field SUBTOTAL numeric field to Output file.
Create a script that prompts the user for:
Table to be opened (select from list of tables).
Character field to classify on (select from list of character fields).
Numeric field to subtotal (select from list of numeric fields).
Output file to store results.
Hint: Copy an existing CLASSIFY command to a new script and modify it, replacing the actual field names with variables.
Notes:
1. The script must open the user-specified table before prompting for the character and numeric fields, otherwise the fields listed will be from the currently opened table. Therefore, there must be two separate dialog boxes.
2. Ensure that only character fields can be selected when the user is prompted for a character field (same for numeric fields).
3. The script will only work if the specified file has at least one character and one numeric field, and if the output file name has no spaces. Provide all necessary instructions to the user.
Suggested Answer: Script Exercises Answers.ACL, script: Generic Classify.
Subscripts
A script can execute any number of ACL commands. It can also execute another script, called a subscript. The ability of a script to call a subscript allows the script designer to control the flow of the analyses. Check boxes and radio buttons are often used to prompt the user for choices, with subscripts ensuring that only the selected options are executed. For example, the script designer could build a dialog with a radio button selection asking the user to select an analysis to be performed from a list of possible analyses.
User Dialog
If the user selects AP Analysis, the first radio button, then only the AP Analysis subscript would be executed. This is accomplished by using the IF command to test the value of the radio button variable (e.g., v_menu) and executing the appropriate subscript, such as the following:
IF v_menu = 1 DO AP_Analysis
IF v_menu = 2 DO AR_Analysis
IF v_menu = 3 DO Payroll
In this example, the first radio button was checked, so v_menu has the value 1 and only the AP Analysis script would be executed.
Note: If the script designer wants the user to have the ability to select more than one option, check boxes would be used instead of a radio button. The IF command would be used to test the value (T or F) of each check box. For example, given three check boxes with variable names v_AP, v_AR and v_Pay, the IF commands would be:
IF v_AP DO AP_Analysis
IF v_AR DO AR_Analysis
IF v_Pay DO Payroll
Special Uses for Subscripts
Two special uses of subscripts are repeating a script and error trapping. The first is used to repeatedly execute a script, such as a menu selection script, until the user tells ACL to stop. The error-trapping subscript is used to test the validity of the user input, and if not appropriate, to ask the user to re-input their response.
Repeating a Script
Suppose the previous Main Menu (with the radio button) were to be repeatedly presented to the user, thus allowing the user to select AP Analysis one time, make another selection the next time, and so on until he or she selected the EXIT option. In English terms, we would want ACL to run the menu script until the user said stop. In scripting terms, we would tell ACL to execute the menu script WHILE a condition was TRUE. This requires two scripts: The first sets the condition as TRUE and tells ACL to repeatedly execute the menu subscript (until the condition is FALSE). The second is the menu subscript itself.
Main script, with explanation:
v_run_menu = T
    Sets v_run_menu to TRUE.
DO Menu_script WHILE v_run_menu
    Executes Menu_script until v_run_menu is no longer TRUE. After each time that ACL runs the menu script it returns to the Main script and tests the value of v_run_menu. If the value is still TRUE, it executes Menu_script again. If the value is FALSE it stops executing Menu_script and continues with the execution of the Main script.
Menu_script, with explanation:
DIALOG (DIALOG TITLE "User Dialog" WIDTH 413 HEIGHT 255) (BUTTONSET TITLE "&OK;&Cancel" AT. . . etc
    Dialog box to prompt for user selection. Variable v_menu is used to store the radio button selected by the user.
IF v_menu = 1 DO AP_Analysis
    Executes subscript AP_Analysis if user selected radio button 1.
IF v_menu = 2 DO AR_Analysis
    Executes subscript AR_Analysis if user selected radio button 2.
IF v_menu = 3 DO Payroll
    Executes subscript Payroll if user selected radio button 3.
IF v_menu = 4 v_run_menu = F
    If user selected radio button 4, v_run_menu is set to FALSE and the Menu script will no longer be run (the WHILE condition is no longer TRUE).
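The main-script and menu-subscript pairing can be pictured as an ordinary loop controlled by a flag. The Python sketch below is a model of that control flow only, under the assumption that the user's radio-button choices arrive as a list and that each subscript is a simple function; the names run_menu and handlers are invented for the illustration.

```python
def run_menu(selections, handlers):
    """Run the menu until the exit option is chosen; return the subscripts run."""
    exit_choice = len(handlers) + 1      # e.g. radio button 4 = EXIT
    executed = []
    run_menu_flag = True                 # v_run_menu = T
    pending = iter(selections)
    while run_menu_flag:                 # DO Menu_script WHILE v_run_menu
        # Stands in for the dialog box; treated as EXIT if input runs out.
        choice = next(pending, exit_choice)
        if choice == exit_choice:
            run_menu_flag = False        # IF v_menu = 4 v_run_menu = F
        elif choice in handlers:
            executed.append(handlers[choice]())   # IF v_menu = n DO ...
    return executed

# Hypothetical subscripts: each just reports its own name.
handlers = {
    1: lambda: "AP_Analysis",
    2: lambda: "AR_Analysis",
    3: lambda: "Payroll",
}

# The user picks AP, then Payroll, then EXIT; the trailing 2 is never reached.
result = run_menu([1, 3, 4, 2], handlers)
print(result)
```

As in the two-script design, the flag is tested after every pass through the menu, so the loop ends only when the exit choice resets it.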
Error Trapping
When prompting the user for entry, the script designer should ensure that a proper entry has been provided. In some cases it is possible to provide a default value or a check box, but in other cases, the script designer must test for user input. The process is similar to the repeating script, using a main script and a subscript. In the main script, a variable is set to FALSE and a subscript is called repeatedly until the variable becomes TRUE. For example, suppose the user is prompted to select a field. The subscript would contain a dialog prompting the user to make a selection and would test the variable that stores the field name, checking to see if the length is greater than zero (i.e., a field name has been selected).
Main script, with explanation:
v_user_input = F
    Sets v_user_input to FALSE.
DO Get_user_input WHILE v_user_input<>T
    Executes Get_user_input until v_user_input is TRUE.
Get_user_input, with explanation:
DIALOG (DIALOG TITLE "User Dialog" WIDTH 313 HEIGHT 355) (BUTTONSET TITLE "&OK;&Cancel" AT. . . etc
    Dialog box to prompt user to select a field from a drop-down list. Variable v_field is used to store the name of the field selected by the user.
IF LEN("%v_field%") > 0 v_user_input=T
    Tests the length of the variable containing the selected field name. If the length is greater than 0, sets the value of v_user_input to TRUE and returns to Main script.
IF LEN("%v_field%") = 0 Pause "You must select a field"
    Tests the length of the variable containing the selected field name. If the length equals 0 (no field selected), the subscript displays a prompt telling the user to select a field, and the Main script re-executes Get_user_input. The subscript continues to prompt the user for a valid field name until the length of v_field is greater than 0.
Error-trapping tests can be quite elaborate, such as testing for a valid date entry. However, even more complicated error-trapping tests would still follow the principles described above.
Exercise 8: Get User Input
Build a script that:
Prompts the user to select a table. Note: Provide a default table name.
Opens the selected table.
Prompts the user to select a field from a drop-down list of fields.
Uses error trapping to ensure the user has selected a field.
Save the scripts as Get User Input1 and Get User Input2. Test the scripts using various combinations of the default table name and selecting a table, and then not selecting or selecting a field name.
Suggested Answer: Script Exercises Answers.ACL, scripts: Get User Input1; and Get User Input2.
Consolidation Exercise
You now have all the tools to develop an audit menu that repeatedly offers the user a choice of tests to be run and checks for valid user input (error trapping). This exercise will use three of the scripts already developed: Inv Loc Sum, Inv Analysis, and Top 10 Pay Interactive. The exercise also requires that you build the main menu and modify the existing scripts to perform error-trapping tests.
Exercise 9: Audit Menu
Modify the Inv Analysis script to:
Prompt the user for the input table and use error trapping to ensure a table is selected.
Prompt user for the output file name (provide a default name) and remove any blanks in the user-provided name.
Save the modified script as Inv Analysis2 and the error-trapping script as Inv Analysis2a. Build an audit menu (Audit Menu2) that offers the user the choice of: determining total inventory by location (Inv Loc Sum); performing an inventory analysis (Inv Analysis2); determining the top 10 employees by amount paid (Top 10 Pay Interactive); performing a (generic) classify (Generic Classify); or Exit. The script Audit Menu1 should present the subscript Audit Menu2 to the user repeatedly, until they select EXIT.
Audit Menu1 should:
At the beginning, turn the safety off so the user is not prompted to overwrite files.
At the end, turn the safety on and delete all user-defined and ACL system variables. Run and test all user selections.
Note: In order for the scripts in this exercise to work properly you have a limited choice of input files. This would not be the case in a real life scenario. Even though you are building a script to prompt for user input, the following choices must be made:
In the Inv Analysis2 script, the user must select Inventory as the input file, otherwise the required fields (Value, and QTYOH) won’t exist and the script will fail. In the Top 10 Pay Interactive script, the user must select Pay Trans as the input file.
Suggested Answer: Script Exercises Answers.ACL—scripts: Audit menu1; Audit menu2; Inv Loc Sum; Inv Analysis2; Inv Analysis2a; Top 10 Pay Interactive; and Generic Classify.
Exercise 9 demonstrates a number of important script concepts. At the same time, it is not very user friendly in that it would fail if incorrect input file selections were made in either the Inv Analysis2 or Top 10 Pay Interactive scripts. When designing scripts for use by others, care must be taken to protect the user from possible incorrect entry. One way of protecting the user is to always prompt for field names and other required user input.
Advanced ACL Scripting Techniques There are many circumstances where the running of standard ACL commands will not be sufficient to accomplish the required tasks. Examples include the need to compare values from one record with another record, to process records with a variable number of repeating segments, or to produce a running subtotal of a given numeric field. There are also instances where efficiency can be greatly improved by running several commands against each record before having ACL read the next record. In situations such as these, the script developer must go beyond using simple ACL commands and write advanced ACL code. The following provides an introduction to GROUPs and LOOPs, which are ACL commands that provide script developers with even more powerful scripting techniques. Once again, the user is encouraged to perform the accompanying exercises and review the suggested solutions.
GROUP Command The GROUP command can be used to improve the efficiency and performance of ACL. Normally, each ACL command issued is processed against all the records. Thus if there are five commands, ACL will read the entire file five times, once per command. However, using the GROUP command, ACL can process multiple commands in a single pass through the file. The GROUP command also gives the user the ability to restrict which records will be processed by which commands.
Simple GROUP
A simple GROUP has two or more ACL commands between the GROUP and END statements. The following is an example of how a simple GROUP can be used to improve efficiency. Without the GROUP command,
OPEN Inventory
CLASSIFY on Location Subtotal Value
STATISTICS on Value
EXTRACT Record to Zero_Value if Value = 0
EXTRACT Record to Neg_Value if Value < 0
the above commands would read the Inventory file four times (once each for CLASSIFY, STATISTICS, and the two EXTRACT commands), whereas the simple GROUP would read the file once:
GROUP
CLASSIFY on Location Subtotal Value
STATISTICS on Value
EXTRACT Record to Zero_Value if Value = 0
EXTRACT Record to Neg_Value if Value < 0
END
Conditional GROUP
The conditional GROUP provides the user with a more powerful set of options. The conditional GROUP, which includes at least one ELSE-block of commands, evaluates a condition to determine if the block of ACL commands should be executed or not. If the condition is FALSE, then ACL moves to the next ELSE command. Or, if there are no conditional statements that are true, ACL returns to the start of the GROUP block and processes the next record.
In the simple GROUP example, all records were processed by the CLASSIFY and STATISTICS commands. However, suppose that the user only wanted to classify and run statistics on records with a Value greater than zero. This could be accomplished by a conditional GROUP command. Without using a GROUP command, the ACL code would look like:
OPEN Inventory
CLASSIFY on Location Subtotal Value if Value > 0
STATISTICS on Value if Value > 0
EXTRACT Record to Zero_Value if Value = 0
EXTRACT Record to Neg_Value if Value < 0
The GROUP command below performs CLASSIFY and STATISTICS only on records that have a Value greater than zero; and EXTRACTs records that have a Value less than or equal to zero to separate files.
GROUP IF Value > 0
CLASSIFY on Location Subtotal Value
STATISTICS on Value
ELSE if Value < 0
EXTRACT Record TO Neg_Value
ELSE
EXTRACT Record TO Zero_Value
END
Rather than reading the entire file four times—once for each of the CLASSIFY, STATISTICS, and the two EXTRACTS commands— ACL only reads the file once. ACL reads each record and determines where it will be processed by one of the following:
The first GROUP IF block (performing CLASSIFY and STATISTICS).
The first ELSE (Extracting the negative valued records).
The final ELSE (Extracting the zero valued records).
Records are tested to determine if the Value is greater than zero:
If YES, the CLASSIFY and STATISTICS commands are processed against these records; and ACL reads the next record without performing any of the commands in either of the ELSE blocks.
If NO, the record is tested to determine if the Value is less than zero:
If YES, the record is extracted to Neg_Value; and ACL reads the next record. If NO, then the record is extracted to Zero_Value.
The conditional GROUP structure will subject records to a maximum of two IF-tests (Value > 0 and Value < 0); and many records will only be tested once (all records with a value greater than zero). For example, a record with a value of $10.00 will only be tested once by the GROUP IF Value > 0. The CLASSIFY and STATISTICS commands will be run against the record (the results stored in temporary variables) and the next record will be read. Whereas without the conditional GROUP, every record would be subject to four IF-tests:
One to determine if the CLASSIFY should be performed.
One to determine if the STATISTICS should be performed.
One for each of the EXTRACT statements.
Even if the value is determined to be greater than zero when testing for the CLASSIFY command, without a conditional group, the record would still be tested to see if the Statistics and Extracts should be performed or not. Note: To improve efficiency of processing large files, the user should write the script such that the IF test which will be met (True) by the largest number of records is first. This way, the largest number of records will only be IF-tested once.
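The single-pass routing performed by a conditional GROUP can be modeled in a few lines of Python. This is a sketch of the logic only, not ACL code: the function name conditional_group, the dictionary-based records, and the sample values are assumptions made for the illustration, and the statistics step is reduced to collecting the positive values.

```python
from collections import defaultdict

def conditional_group(records):
    """One pass over the records, routing each to exactly one branch."""
    classify = defaultdict(float)    # subtotal of Value by Location
    positive_values = []             # input for the statistics step
    neg_value, zero_value = [], []   # the two extract outputs
    if_tests = 0
    for rec in records:
        if_tests += 1
        if rec["Value"] > 0:                      # GROUP IF Value > 0
            classify[rec["Location"]] += rec["Value"]
            positive_values.append(rec["Value"])
            continue                              # read the next record
        if_tests += 1
        if rec["Value"] < 0:                      # ELSE IF Value < 0
            neg_value.append(rec)
        else:                                     # ELSE
            zero_value.append(rec)
    return dict(classify), positive_values, neg_value, zero_value, if_tests

# Hypothetical inventory records.
records = [
    {"Location": "01", "Value": 10.0},
    {"Location": "01", "Value": 5.0},
    {"Location": "02", "Value": -3.0},
    {"Location": "02", "Value": 0.0},
    {"Location": "03", "Value": 7.5},
]
classify, positives, negatives, zeros, if_tests = conditional_group(records)
print(classify, len(negatives), len(zeros), if_tests)
```

With these five sample records the sketch performs 7 IF-tests (one for each positive record, two for each of the others), compared with 20 if every record were tested by all four commands.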
Exercise 10: Running Subtotal of Value by Location
Using Script Exercises.ACL and the Inventory file, create a running subtotal of the Value of inventory held at each location. Extract the fields Location, Product number, Quantity on Hand, Unit Cost, Value and Run Total (running total) to a file called Inv Loc Value. The exercise requires you to create a variable (Run Total) that will contain the running total of the Value for each Location. This means the file must first be sorted by Location. Then each time ACL encounters a new Location, the running total is reset to the Value of the current record. For subsequent records with the same Location, the running total is equal to the running total (from the previous record) + the Value of the current record. For example, the running total would be:
Loc   Value   Running Total   Explanation
A     10.00   10.00           reset to Value (10.00); first record for location A = 10.00
A     5.00    15.00           previous running total (10.00) + current value (5.00) = 15.00
A     6.00    21.00           previous running total (15.00) + current value (6.00) = 21.00
B     12.00   12.00           reset to Value (12.00); first record for location B = 12.00
C     8.00    8.00            reset to Value (8.00); first record for location C = 8.00
C     3.00    11.00           previous running total (8.00) + current value (3.00) = 11.00
. . .
Note: The exercise could be accomplished with RECOFFSET(), however, you are asked to use a conditional GROUP approach. Suggested Answer: Script Exercises Answers.ACL—script: Running Total.
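The reset-or-accumulate rule behind the running total can be checked with a short Python sketch. It is not the ACL solution to the exercise; it only models the rule on the sample values from the table above, and the function name running_totals and the tuple layout are assumptions made for the illustration.

```python
def running_totals(records):
    """Return (location, value, running_total) rows.

    The records must already be sorted by location.
    """
    rows = []
    previous_location = None
    run_total = 0.0
    for location, value in records:
        if location != previous_location:
            run_total = value        # new location: reset to the current Value
        else:
            run_total += value       # same location: add to the previous total
        rows.append((location, value, run_total))
        previous_location = location
    return rows

# Sample values from the table above, sorted by location.
sample = [("A", 10.0), ("A", 5.0), ("A", 6.0), ("B", 12.0), ("C", 8.0), ("C", 3.0)]
rows = running_totals(sample)
for row in rows:
    print(row)
```

The sketch depends on the input being sorted by location, which is the same reason the exercise requires the file to be sorted first.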
Nested GROUP
Even more efficient coding is possible when nested GROUPs are considered. The nested GROUP allows the user to run groups within other groups. A nested GROUP is useful when there are records for which a set of commands will be executed, and a conditional set of commands that will be executed for the remainder of the records. Nested GROUPs provide a powerful way for the user to control which commands are executed for which records. For example, suppose the user wants to perform the following analysis:
For all inventory records in Locations '01', '02' or '03':
Total the Value of records with a Value < 0.
Classify on Location and total the Value if the Value > 0.
Extract all records with Value equal to zero to Zero_Value.
For inventory records not in Locations '01', '02' and '03':
Extract all records to Invalid_Location.
With a conditional GROUP, the ACL code would look something like:
GROUP if Match(Location, '01', '02', '03') and Value<0
Total Value
ELSE if Match(Location, '01', '02', '03') and Value>0
CLASSIFY on Location Subtotal Value
ELSE if Match(Location, '01', '02', '03') and Value=0
EXTRACT Record to Zero_Value
ELSE
EXTRACT Record to Invalid_Location
END
This would be inefficient because records would be IF-tested up to three times. For example, a record with Location equal to '04' would be tested by the following conditions:
GROUP if Match(Location, '01', '02', '03') and Value<0 (False).
ELSE if Match(Location, '01', '02', '03') and Value>0 (False).
ELSE if Match(Location, '01', '02', '03') and Value=0 (False).
It would then be processed by the ELSE-EXTRACT Record to Invalid_Location command.
Even a record with Location equal to '01' could be IF-tested three times as shown in the following example for Location '01' and Value equal to 0.00:
GROUP if Match(Location, '01', '02', '03') and Value<0 (False).
ELSE if Match(Location, '01', '02', '03') and Value>0 (False).
ELSE if Match(Location, '01', '02', '03') and Value=0 (True).
The nested GROUP would look like:
GROUP if Match(Location, '01', '02', '03')
GROUP if Value < 0
Total Value
ELSE if Value > 0
CLASSIFY on Location Subtotal Value
ELSE
EXTRACT Record to Zero_Value
END
ELSE
EXTRACT Record to Invalid_Location
END
A record with Location equal to '04' would be IF-tested by the conditional GROUP if Match(Location, '01', '02', '03') (False) and thus be excluded from further testing by the inner GROUP. It would be processed by the outer ELSE (the EXTRACT Record to Invalid_Location command) and thus would only have been IF-tested once.
A record with Location '01' and Value equal to 0.00 would be IF-tested by the following conditions before being processed by the inner ELSE (EXTRACT Record to Zero_Value):
The outer GROUP - GROUP if Match(Location, '01', '02', '03') (True).
The inner GROUP - GROUP if Value < 0 (False).
The inner ELSE - ELSE if Value > 0 (False).
However, the two inner GROUP IF conditions are simple IFs with only one test (Value < 0 and Value > 0). Without the nested GROUP,
the record would be processed against a more complicated IF that would include Match(Location, '01', '02', '03') and a test of the value. This is less efficient than one IF Match(Location, '01', '02', '03') followed by a simple test on the Value field.
Note: As explained at the beginning of this chapter, variable naming is based on the GROUP-END line number of the command that generates the variable (e.g., COUNT2, TOTAL5). In the nested GROUP example, the TOTAL command is on the third line of the GROUP-END block, therefore the total of the value will be stored in a variable called TOTAL3.
Since the GROUP-END structure causes ACL to process records one at a time, a group can also be used to store information contained in a record and compare it with the value of the same field of the next record. Take the example where a file contains the employee number followed by the name and address for every employee. You need to extract the employee number and name to a new file. If each employee had three lines (Number, Name and Address), you could simply filter out the address lines (MOD(RECNO(),3)<>0), but in this case, the address can span multiple lines so a simple filter won't work.
Sample file:
Empno: 0002309125
Ellen Cartridge
123 Main St, Las Vegas, Nevada
Empno: 1003987128
Dave Coderre
1233 Grey Rocks St
Houston
Texas
Empno: 1019982137
Jim Fulton
10924 Beech Tree Way
Suite 107
Burbank
California
Empno: 1800066292
Mary Crestwell
524 Alesther St
Ottawa
Illinois
The table layout for this text file (CR or LF), with record length 82, only defines two fields:
Name     Type    Start   Length
Empno    ASCII   8       10
Allrec   ASCII   1       50
The basic approach would be to identify the record that contains the employee number; remember the employee number; read the next record, which contains the name; and extract the employee number and name. All other records would be ignored.
Using the GROUP structure:
Identify the records that contain the employee number: FIND('Empno:', allrec)
Store the employee number in a variable: v_empno = empno
Set a counter to 1: v_ctr = 1
Read the next record.
If it is not an employee record and the counter equals 1 (ELSE if v_ctr = 1), then ACL is reading the record containing the name; so the stored employee number and name are extracted to the results file and the counter is set to 2: (Extract . . . v_ctr = 2).
If it is not an employee record and the counter is greater than 1, then ACL is reading one of the address lines and the record is ignored.
The script code would look like this:

    Comment - initialize variables for employee number and counter
    v_empno = blanks(10)
    v_ctr = 0
    Comment - test to see if this is an employee record; if yes, store empno and set v_ctr to 1
    GROUP if FIND('Empno:', allrec)
      v_empno = empno
      v_ctr = 1
    Comment - test to see if this is the record following an employee record; if yes, extract fields and set v_ctr to 2
    ELSE if v_ctr = 1
      Extract v_empno as 'Empno' SUBSTR(allrec,1,40) as 'Name' to Empno_Name.FIL
      v_ctr = 2
    Comment - not an employee record or the record immediately following it
    ELSE
      v_ctr = v_ctr + 1
    END

In fact, since ACL does not need to process a record that contains address information, the last ELSE block is not required and the code could be simplified to:

    Comment - initialize variables for employee number and counter
    v_empno = blanks(10)
    v_ctr = 0
    Comment - test to see if this is an employee record; if yes, store empno and set v_ctr to 1
    GROUP if FIND('Empno:', allrec)
      v_empno = empno
      v_ctr = 1
    Comment - test to see if this is the record following an employee record; if yes, extract fields and set v_ctr to 2
    ELSE if v_ctr = 1
      Extract v_empno as 'Empno' SUBSTR(allrec,1,40) as 'Name' to Empno_Name.FIL
      v_ctr = 2
    END
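For readers who find it easier to follow the record-by-record logic in a general-purpose language, the same idea can be sketched in Python. This is only an illustration of the GROUP logic, not ACL code: the function name extract_empno_names and the in-memory list of lines are assumptions made for the sketch, and the slice positions follow the table layout (Empno starts in column 8 with length 10).

```python
def extract_empno_names(lines):
    """Return (empno, name) pairs from a multi-line employee listing."""
    results = []
    v_empno = " " * 10  # mirrors v_empno = blanks(10)
    v_ctr = 0
    for allrec in lines:
        if "Empno:" in allrec:
            # Employee record: the number sits in columns 8-17 (start 8, length 10).
            v_empno = allrec[7:17]
            v_ctr = 1
        elif v_ctr == 1:
            # The record right after the employee record holds the name.
            results.append((v_empno, allrec[:40].strip()))
            v_ctr = 2
        # Any other record is an address line and is ignored.
    return results


sample = [
    "Empno: 0002309125",
    "Ellen Cartridge",
    "123 Main St, Las Vegas, Nevada",
    "Empno: 1003987128",
    "Dave Coderre",
    "1233 Grey Rocks St",
    "Houston",
    "Texas",
]
pairs = extract_empno_names(sample)
```

As in the simplified script, no final ELSE branch is needed: once the counter has moved past 1, address lines simply fall through.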
Note: Within a group, it is not necessary to have the Append option on the EXTRACT command. ACL will automatically append records to the file created within the GROUP-END structure. If the file already existed before the first EXTRACT command is executed, ACL will prompt the user to choose whether to append to or overwrite the existing file. The prompt can be eliminated by starting the script with SET SAFETY OFF and ending it with SET SAFETY ON.

The GROUP command gives the script developer the ability to process each record individually, taking the desired actions, before ACL reads the next record. In this case, a record is read and the conditional group stores the employee number if the record contains the characters “Empno:”. If the record is the one immediately following the employee number, ACL extracts the stored employee number and the name (SUBSTR(allrec,1,40)). For all other records, ACL does nothing.
Exercise 11a: Invoices on Consecutive Days
Using Script Exercises.ACL and file Multi Invoices, identify all consecutive records for the same vendor where the invoice dates are one day or less apart. If consecutive records are for the same vendor and one day or less apart, extract both records to Inv One Day. For example:

    ABC    2009/09/24    123    $ 901.02
    ABC    2009/09/26    137    $ 235.10
    ABC    2009/09/27    139    $ 871.21
    ABC    2009/09/28    140    $ 912.77

You should extract records 2 and 3 because they are both for vendor ABC and the invoice dates are one day or less apart. You would also extract records 3 and 4 because they are also for the same vendor and one day or less apart.

Suggested Answer: Script Exercises Answers.ACL, script: Inv One Day.

Exercise 11b: Invoices on Consecutive Days2
Using the above example, instead of extracting record 3 twice (once because it is one day after record 2 and once because record 4 is one day after it), select record 3 only once. In other words, if a record has already been extracted, do not extract it a second time. The above file would produce results containing records 2, 3, and 4.

Suggested Answer: Script Exercises Answers.ACL, scripts: Inv One Day2 or Inv One Day2b.
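The comparison logic behind both exercises can be sketched in Python. This is not the suggested ACL answer; it only illustrates the "remember the previous record" technique. The function names and the tuple layout (vendor, date, invoice number, amount) are assumptions for the sketch, and the input is assumed to be sorted by vendor and invoice date.

```python
from datetime import date


def one_day_pairs(invoices):
    """11a style: return (previous, current) pairs, same vendor, <= 1 day apart."""
    pairs = []
    prev = None
    for rec in invoices:
        if prev is not None and rec[0] == prev[0] and (rec[1] - prev[1]).days <= 1:
            pairs.append((prev, rec))
        prev = rec
    return pairs


def one_day_records(invoices):
    """11b style: return each qualifying record once."""
    out = []
    prev = None
    prev_extracted = False
    for rec in invoices:
        match = (prev is not None and rec[0] == prev[0]
                 and (rec[1] - prev[1]).days <= 1)
        if match:
            if not prev_extracted:
                out.append(prev)  # previous record not yet written
            out.append(rec)
        prev, prev_extracted = rec, match
    return out


invoices = [
    ("ABC", date(2009, 9, 24), "123", "901.02"),
    ("ABC", date(2009, 9, 26), "137", "235.10"),
    ("ABC", date(2009, 9, 27), "139", "871.21"),
    ("ABC", date(2009, 9, 28), "140", "912.77"),
]
pairs = one_day_pairs(invoices)
once = one_day_records(invoices)
```

The flag prev_extracted plays the role of a stored variable that carries information from one record to the next, which is the same idea the GROUP examples rely on.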
LOOP and OFFSET()
The LOOP command and OFFSET() function are used together to process complex file types. The LOOP command causes ACL to process a record more than once before moving on to the next record. Loops are frequently used with variable length files that have blocks of repeating fields (sometimes called buckets or segments), where each record contains repeated information that must be processed. The LOOP command must be processed inside a GROUP-END structure. All the commands between LOOP and END are executed repeatedly on the current record until the result of the WHILE test is false and the next record is processed.
The basic format of the LOOP command is:

    GROUP
      LOOP WHILE condition is True
        ACL Commands
      END
    END

The user must be careful not to create an infinite loop by ensuring that the WHILE condition eventually returns false. The SET LOOP command can also be used to specify a maximum number of times ACL will loop, thus preventing an infinite loop from occurring.

The OFFSET() function is used to process variably occurring data, that is, data where there is a fixed portion to each record and a portion that occurs a variable number of times. You can also use OFFSET() with complex record structures in which a block of data (segment) has a variable starting position or an array has a variable number of values.

The basic format is: OFFSET(Field, number-of-bytes). For example, suppose you have a data file that contains the customer and the sales information (quantity and price) for all purchases. A typical flat file would have one record for each customer purchase:

    Customer    Quantity    Price
    ABC         123         $25.00
    ABC         61          $12.99
    ABC         390         $14.23
    DDF         2           $33.17
    XYZ         11          $41.97
    XYZ         15          $11.99

However, the file could contain only one record for each customer, each having a different number of sales information segments. The fixed portion would contain the customer name, and this would be followed by a number of sales segments, as in this example:

    Customer    Qty1    Price1    Qty2    Price2    Qty3    Price3 . . . Qtyn    Pricen
    ABC         123     25.00     61      12.99     390     14.23
    DDF         2       33.17
    XYZ         11      41.97     15      11.99

Often such files will contain a field that tells the user how many segments are on each record, as in this example:

    Seg    Cust    Qty1    Price1    Qty2    Price2    Qty3    Price3 . . . Qtyn    Pricen
    3      ABC     123     25.00     61      12.99     390     14.23
    1      DDF     2       33.17
    2      XYZ     11      41.97     15      11.99

Using LOOP and OFFSET(), the user instructs ACL to read each record and loop through the required number of sales segments (LOOP WHILE loop_ctr <= Seg) and offset the start position of the quantity and price fields on each loop. So for the first record, ACL would loop through the commands three times (Seg = 3). Typically, on each loop, the customer information and the current values of the quantity and price information would be extracted to a file. This would create a fixed length flat file with multiple records for each customer (as many records as there are sales information segments), each with the customer information and one set of sales information (quantity and price). The above file would be transformed to look like:

    Customer    Quantity    Price
    ABC         123         $25.00
    ABC         61          $12.99
    ABC         390         $14.23
    DDF         2           $33.17
    XYZ         11          $41.97
    XYZ         15          $11.99

Sometimes you simply want to work with the file directly, not creating a fixed length flat file, as in this file:

    Seg    Cust    Qty1    Price1    Qty2    Price2    Qty3    Price3 . . . Qtyn    Pricen
    3      ABC     123     25.00     61      12.99     390     14.23
    1      DDF     2       33.17
    2      XYZ     11      41.97     15      11.99

Note: If the number of segments is not included on each record, you can use functions such as OCCURS() to check for the number of decimal points, or LEN() or RECLEN() to determine the total record length. The number of repeating segments is the record length minus the fixed portion’s length, divided by the repeating portion’s length.

The user might want to total the sales information (price) for each customer. This could be done as follows:

    GROUP
      loop_ctr = 0
      v_total = 0.00
      LOOP WHILE loop_ctr < Seg
        v_total = v_total + OFFSET(Price, loop_ctr * 14)
        loop_ctr = loop_ctr + 1
      END
      EXTRACT Customer v_total TO Loop_Answer
    END

In the table layout, Price is defined as starting in column 20 with length 5. For each loop, the starting position of Price will be offset by (loop_ctr * 14). The amount of the offset is equal to the width (length) of the repeating segment.

Record 1: ACL enters the GROUP-END structure, sets loop_ctr to 0 and v_total to 0.00, and enters the first loop for customer ABC:

Customer ABC loop 1: The first time through the loop, loop_ctr is 0, so the starting point for Price would be position (20 + 0); v_total would be set to 25.00; and loop_ctr would be set to 1. Since loop_ctr is less than the number of segments (loop_ctr < Seg), ACL would process record 1 again.

Customer ABC loop 2: The second time through the loop, the starting position for Price would be (20 + 1 * 14) = 34 and Price would contain the value $12.99. v_total would be set to 25.00 + 12.99 = 37.99 and loop_ctr would be set to 2. loop_ctr is still less than the number of segments, so ACL would process record 1 again.

Customer ABC loop 3: The third time through the loop, Price would start in position (20 + 2 * 14) = 48 and have a value of $14.23; v_total = (37.99 + 14.23) = 52.22; and loop_ctr is set to 3. Since loop_ctr is not less than the number of segments, ACL extracts Customer and v_total to Loop_Answer and processes record 2 (customer DDF).

Record 2: loop_ctr is set to 0 and v_total is set to 0.00; then ACL enters the first loop for customer DDF:

Customer DDF loop 1: Price is 33.17; v_total is set to 33.17; loop_ctr is set to 1. Since loop_ctr is not less than the number of segments (Seg = 1), ACL extracts Customer and v_total to Loop_Answer and processes the next record (customer XYZ).

Record 3: loop_ctr is set to 0 and v_total is set to 0.00; then ACL enters the first loop for customer XYZ:

Customer XYZ loop 1: Price is 41.97; v_total is set to 41.97; loop_ctr is set to 1. Since loop_ctr is less than the number of segments (Seg = 2), ACL processes record 3 again.

Customer XYZ loop 2: Price is 11.99; v_total is set to 41.97 + 11.99 = 53.96; loop_ctr is set to 2. Since loop_ctr is not less than the number of segments, ACL extracts Customer and v_total to Loop_Answer.

The best way to understand LOOP and OFFSET() is to work with a data file that has repeating segments. Open ACL project Sample Exercises.ACL and table Loop Example. Edit the table layout and notice that only the first quantity and price values are defined. Run script Loop Exercise and examine the results (Loop Answer). The next best way to learn LOOP and OFFSET() is to take a file with repeating fields and create a fixed length flat file. This brings us to the last exercise.
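The offset arithmetic in the walk-through can be checked with a small Python sketch. It is not ACL code and does not reproduce the Loop Example table. The record layout is an assumption chosen so that the first Price lands in column 20 with length 5 and each segment is 14 bytes wide (a 10-byte fixed portion, then Qty in 9 bytes and Price in 5 bytes). The helper names make_record, offset_price, and segments_from_length are hypothetical.

```python
from decimal import Decimal

# Assumed layout for this sketch: a 10-byte fixed portion (Seg in columns 1-3,
# Cust in columns 4-10), then 14-byte segments of Qty (9 bytes) + Price (5 bytes).
FIXED_LEN = 10
SEG_WIDTH = 14
PRICE_START = 20  # 1-based column of the first Price field
PRICE_LEN = 5


def make_record(seg, cust, segments):
    """Build one fixed-width test record (hypothetical helper)."""
    fixed = f"{seg:<3}{cust:<7}"
    return fixed + "".join(f"{qty:<9}{price:>5}" for qty, price in segments)


def offset_price(record, loop_ctr):
    """Mimic OFFSET(Price, loop_ctr * 14): shift the field's start position."""
    start = PRICE_START - 1 + loop_ctr * SEG_WIDTH
    return Decimal(record[start:start + PRICE_LEN].strip())


def segments_from_length(record):
    """Segment count without a Seg field: (record length - fixed) / segment."""
    return (len(record) - FIXED_LEN) // SEG_WIDTH


def total_per_customer(records):
    """One output row per record, like EXTRACT Customer v_total TO Loop_Answer."""
    loop_answer = []
    for rec in records:
        seg = int(rec[0:3])
        customer = rec[3:10].strip()
        loop_ctr = 0
        v_total = Decimal("0.00")
        while loop_ctr < seg:  # LOOP WHILE loop_ctr < Seg
            v_total += offset_price(rec, loop_ctr)
            loop_ctr += 1
        loop_answer.append((customer, v_total))
    return loop_answer


records = [
    make_record(3, "ABC", [(123, "25.00"), (61, "12.99"), (390, "14.23")]),
    make_record(1, "DDF", [(2, "33.17")]),
    make_record(2, "XYZ", [(11, "41.97"), (15, "11.99")]),
]
loop_answer = total_per_customer(records)
```

The totals produced are 52.22 for ABC, 33.17 for DDF, and 53.96 for XYZ, matching the walk-through. Decimal is used instead of floating point so that the currency sums compare exactly.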
Exercise 12: Loop Extract
Using ACL Project Script Exercises and table Segment Ctr, create a fixed length flat file that has the Product Number, Contract Number, and Amount. For example, transform:

    Prod#    Cont#    Amt        Cont#    Amt        Cont#    Amount
    A1       123      $ 50.25    214      $100.31    91       $173.73
    A6       14       $ 18.17
    . . . .

To a flat file that looks like:

    Prod#    Cont#    Amount
    A1       123      $ 50.25
    A1       214      $100.31
    A1       91       $173.73
    A6       14       $ 18.17

Suggested Answer: Script Exercises Answers.ACL, script: Loop Extract.
Applications Menu
The Applications menu is designed to let the script developer create a set of application menus (scripts) that can be easily accessed by users. This is done by creating a special text file with an .mnu filename extension. This .mnu Applications menu file:

Must reside in the ACL program directory if it is to be available for all ACL projects; or
Must be placed in the same working directory as the ACL project, to be available to that specific ACL project only.

The applications menu can be accessed by selecting Applications from the main menu bar in ACL. A default applications menu exists, called Template.mnu. If the user selects Applications from the menu bar, the menu options from the default template will be displayed. Two .mnu files have been provided with the toolkit: Standard and Standard2. Click Applications on the menu bar while in the Script Exercises Answers project to see the application menus.

While the menus have been built, they are not supported by the necessary ACL scripts. In the case of the toolkit, Standard.mnu has been built to run with the scripts you created during the exercises. Using Windows Explorer, locate the Standard.mnu file in the Script Data folder. Open the file with Microsoft WordPad or Notepad. You should see the following:

    MAIN MENU                           5                                        .
    Inventory Analysis                  DO Inv_Loc_Sum                           .
    Payroll Analysis                    DO Top_10_Pay                            .
    Generic Classify                    DO Generic_Classify                      .
    Running Total                       DO Running_Total                         .
    Quit ACL                            QUIT                                     .

DO commands could contain the full path to the desired .bat files, such as DO C:\Wiley\Script data\Inv_Loc_Sum. By default, ACL will look for the .bat files in the folder that contains the currently opened ACL project. In order to use the Standard menu or to create your own applications menu, the first step is to export the desired ACL scripts using the Export a Server File option. This will create a .bat file for each script that is exported. To enable the Standard.mnu application, perform the following:

1. Open the Script Exercises Answers.ACL project.
2. Export the scripts Inv Loc Sum, Top 10 Pay, Generic Classify, and Running Total to .bat files with the same names.
3. Create a new project called Test.ACL and copy all of the table layouts (not the scripts) from Script Exercises Answers.ACL.
4. On the menu bar, select Applications and choose the Standard option.
5. Select Inventory Analysis. This will run the Inv Loc Sum script (DO Inv_Loc_Sum).

Note that the Inv Loc Sum script was added to the TEST.ACL project. However, it is possible to run the .bat scripts through the applications menu and not have the selected scripts written to the project file. In order to do this, each script must be edited and a line added to delete the script. While still in TEST.ACL, do the following:

1. Copy all of the scripts from Script Exercises Answers.ACL to TEST.ACL.
2. Edit each script in TEST.ACL, adding the line DELETE SCRIPT Script_Name OK. For example, in the Inv Loc Sum script, add the line DELETE SCRIPT Inv_Loc_Sum OK.
3. Export each script to a .bat file again (overwriting the existing .bat files).
4. Delete the scripts from the TEST.ACL project.
5. Run the Standard menu application again, selecting Inventory Analysis.

Notice that after running the Inventory Analysis this time, the script was not retained in the ACL project. The DELETE command was needed because when you run an external script from the Applications menu, ACL internalizes the script, creating a copy of it in the Overview tab of the Project Navigator. Once the script is complete, the line DELETE SCRIPT Script_Name OK will remove the script from the project.
Building an Application Menu
The first step is similar to the exercise just performed: the desired scripts must be exported using the Export a Server File option. The second step is to make a copy of the ACL-provided Template.mnu file or the Standard2.mnu file. Rename and edit the copy, ensuring that the .mnu file follows the strict structure required for the menu system to work properly.

Each menu file consists of blocks: one block for each menu or submenu. Each menu block starts with the “heading” line. The heading line will appear as an actual menu item in the resulting applications menu. The number of entries listed next to the heading must start in column 37 and does not include the heading line. The maximum number of entries is 14 per menu screen, and each entry must immediately follow the heading. Each entry has a description, followed by the command or link. The general structure for any block is as follows:

    Menu Heading                        Number of entries in the menu
    Menu entry                          Menu command or link
    . . .
    Last menu entry                     Menu command or link

Each line must comply with the structure:

The description can have a maximum of 35 characters (the width of the menu box).
The command must start in column 37 and may not extend beyond column 78.
Each line must be terminated with a period (.) in position 78.
Creating Submenus
It is also possible to create an entry that points to a submenu and not an ACL command. For example, Standard2.mnu contains the following lines:

    MAIN MENU                           4                                        .
    Inventory Menu                      8 menu_definition                        .
    Payroll Analysis                    DO Top_10_Pay                            .
    Generic Classify                    DO Generic_Classify                      .
    Quit ACL                            QUIT                                     .
                                                                                 .
                                                                                 .
                                                                                 .
    Inventory Analysis Menu             3                                        .
    Inventory Analysis by Location      DO Inv_Loc_Sum                           .
    Inventory Value GT 100              DO Inv_Analysis                          .
    Running Total                       DO Running_Total                         .

Inventory Menu is the name of the submenu, and to the right of this entry is a line reference number (8 menu_definition). This line reference number tells ACL where to find the submenu, that is, how many lines after the first heading line. Since the menu file starts counting with the first line (the heading line) as line number zero, the submenu heading in Standard2.mnu is on line 8. The same rules apply to the submenu as to the main menu. So line 8 is the heading line for the Inventory Analysis menu, which has 3 entries.
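Because the column rules are strict, it can help to generate menu lines rather than type them by hand. The following Python sketch is only an illustration under stated assumptions: the helper menu_line is hypothetical, the command is capped at column 77 so that the closing period can occupy position 78, and the three filler lines between the blocks are assumed to contain nothing but the terminating period.

```python
WIDTH = 78    # the closing period sits in position 78
CMD_COL = 37  # the command (or entry count) starts in column 37


def menu_line(description="", command=""):
    """Pad one menu line so the command starts in column 37 and '.' is in 78."""
    if len(description) > CMD_COL - 2:
        raise ValueError("description longer than 35 characters")
    if len(command) > WIDTH - CMD_COL:
        raise ValueError("command would run into the closing period")
    return f"{description:<{CMD_COL - 1}}{command:<{WIDTH - CMD_COL}}."


main_menu = [
    menu_line("MAIN MENU", "4"),
    menu_line("Inventory Menu", "8 menu_definition"),
    menu_line("Payroll Analysis", "DO Top_10_Pay"),
    menu_line("Generic Classify", "DO Generic_Classify"),
    menu_line("Quit ACL", "QUIT"),
]
filler = [menu_line() for _ in range(3)]  # assumed period-only spacer lines
submenu = [
    menu_line("Inventory Analysis Menu", "3"),
    menu_line("Inventory Analysis by Location", "DO Inv_Loc_Sum"),
    menu_line("Inventory Value GT 100", "DO Inv_Analysis"),
    menu_line("Running Total", "DO Running_Total"),
]
lines = main_menu + filler + submenu

# The line reference counts from zero, starting at the first heading line.
submenu_ref = lines.index(submenu[0])
```

With this layout submenu_ref works out to 8, which is the value that appears next to Inventory Menu. If entries are later added to the main menu, the reference number has to be recalculated in the same way.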
10 Utility Scripts
The following scripts have been provided in addition to the Fraud Toolkit scripts. These scripts perform unique tasks that are not easily accomplished by ACL. While every effort has been made to test these scripts, users should verify all results obtained. Error-trapping has not been built into these utility scripts. However, you should be able to accomplish this with the skills learned in previous chapters.

Auto Execute
The script (Auto Execute) runs an ACL command against a series of files. All files within a specified directory (folder) will be opened and processed by ACL. The files must have the same table layout and the same extension. For example, the script can be used to access 12 monthly pay files in a directory and append them together to produce a year-to-date file. The script prompts for:

Directory (if blank, uses the current directory).
File extension of the data files.
The table layout to use to open all selected data files.
The output file name.

Sample Exercise
Using the ACL Project Auto Exec Test.ACL in the directory Script Data\temp, run the Auto Execute script to combine all employee data files (with a .fil extension) into a single file.

The script prompts for:

Directory (if blank, uses the current directory): blank.
File extension of the data files: .fil.
The table layout to use to open all selected data files: employee.
The output file name: All Employees.

The resulting file will contain all employee records from Employee1.FIL, Employee2.FIL, and Employee3.FIL.
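The general idea of the script (find every file with a given extension in a folder and append them into one output file) can be sketched in Python. This is not the Auto Execute script itself; the function name combine_files and its return value are assumptions, and the sketch simply concatenates file contents rather than applying a table layout.

```python
from pathlib import Path


def combine_files(directory, extension, output_name):
    """Append every file with the given extension in a folder into one file."""
    folder = Path(directory or ".")  # a blank directory means the current one
    output = folder / output_name
    # Sort for a stable order and skip the output file if it shares the extension.
    sources = sorted(p for p in folder.glob(f"*{extension}") if p != output)
    with output.open("w") as out:
        for src in sources:
            out.write(src.read_text())
    return [p.name for p in sources]
```

A call such as combine_files("", ".fil", "all_employees.out") would mirror the sample exercise. Note that file name matching here is case-sensitive on some operating systems, so Employee1.FIL would need the pattern ".FIL" there.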
Extract Values
The script (Extract Values) reads a table (table #1) and creates a list of unique values for a user-selected character field. For each unique value of the user-selected character field in table #1, the script opens a second table, identifies records with the same value, and extracts these records to a table. For example, if table #1 has a list of part-time employees and table #2 has pay transactions for all employees, the script would open table #1 and produce a list of employee numbers for all part-time employees. It would then open the pay transaction file and extract all records for each part-time employee to separate files (one for each part-time employee).

Note: Table #1 and table #2 can be the same file. Then table #1 would be split into separate files. For example, given an Inventory file with a number of locations, the script would create separate files for each location.

The script prompts for:

Table #1.
Character field to create the unique values (values form part of the output file name).
Table #2.
Character field for comparison with unique values from table #1.

The output is sent to file FILE + the value of the character field containing unique values (selected from table #1).

Note: The current script extracts the table #2 records that match the unique values in table #1. However, any ACL command could be run by the Extract Values2 script. For example, for each Location in the Inventory file, the script could CLASSIFY on Product status to separate files (one per location).
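The two-table logic can be sketched in Python as follows. This is an illustration only, not the Extract Values script: tables are represented as lists of dictionaries, the function name extract_by_values and the ProdNo field are hypothetical, and the underscore in the output names is an assumption about how FILE and the value are joined.

```python
def extract_by_values(table1, field1, table2, field2):
    """Split table2 into one list per unique value of field1 in table1."""
    unique_values = sorted({row[field1] for row in table1})
    outputs = {}
    for value in unique_values:
        # Output name is "FILE" plus the value; the underscore is an assumption.
        outputs[f"FILE_{value}"] = [row for row in table2 if row[field2] == value]
    return outputs


inventory = [
    {"Location": "05", "ProdNo": "A1"},
    {"Location": "06", "ProdNo": "B2"},
    {"Location": "22", "ProdNo": "C3"},
    {"Location": "05", "ProdNo": "D4"},
]
select_locations = [{"Loc": "05"}, {"Loc": "06"}]

# Same table used twice: the table is split by its own field.
by_location = extract_by_values(inventory, "Location", inventory, "Location")
# Different tables: only the values listed in table #1 produce output.
selected = extract_by_values(select_locations, "Loc", inventory, "Location")
```

Replacing the list comprehension with another operation on the matching rows corresponds to swapping the EXTRACT for a different command in the subscript.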
Sample Exercise
Using the Inventory file, create separate files for each location. In this case, the user-required responses should be:

Table #1: Inventory.
Character field for unique values: Location.
Table #2: Inventory.
Character field for comparison with table #1 unique values: Location.

This will create files for locations 01–06 and 22: File 01, File 02 . . . File 06, and File 22.

Sample Exercise
Using the Inventory and Select Locations files, create separate files for each location in the Select Locations file. In this case, the user-provided responses should be:

Table #1: Select Locations.
Character field for unique values: Loc.
Table #2: Inventory.
Character field for comparison with table #1 unique values: Location.
This will create files for selected locations 05 and 06: File 05 and File 06.

Sample Exercise
Using the Payroll file, stratify on gross pay for each work department in the Payroll file. In this case, the user-provided responses should be:

Table #1: Payroll.
Character field for unique values: Workdept.
Table #2: Payroll.
Character field for comparison with table #1’s unique values: Workdept.

Note: Make a copy of the scripts Extract Values and Extract Values2, and rename them Pay Strat and Pay Strat2 respectively. Ensure that Pay Strat calls subscript Pay Strat2. In Pay Strat2, instead of extracting the record, run the Stratify command for each work department, sending the results to the log file.

Ending Balance Verification
The script (Balance Start) uses the beginning balance and transactions to calculate the ending balance. This is compared with the ending balance file.

The data must have a file with the beginning balance and a file with the ending balance. The transactions may be in either one transactions file or two files (additions and subtractions).

The transaction file must contain the additions (positive numbers) and the subtractions (negative numbers). If there are four files, the additions file must contain positive amounts and the subtractions file must contain negative amounts. If this is not the case, create an expression to be used by the script. The script will prompt for:

The beginning file name.
The ending balance file name.
The transaction file name or the files containing the additions and the subtractions.

The user selects three or four files, and ACL prompts for the names of the required files. In this case, the user selected three files.

For each of the files (beginning, ending, and transactions, or additions and subtractions), the script will prompt for:

The account, that is, the character field for which the balances will be calculated.
The amount, that is, the numeric field to be totaled and compared.

Note: The fields may have different names in the files, but the length of the account field must be the same in every file. An expression can be built to ensure all lengths are the same.

Sample Exercise: Three Files
Using the Inv Beg Balance, Inv End Balance, and Inv Transactions files, compare the (Beginning + Transactions) to the Ending balances for all accounts. In this case, the user-required responses should be:

The beginning file name: Inv Beg Balance.
The ending balance file name: Inv End Balance.
The transaction file name: Inv Transactions.

For each file, the user-required response should be:

The account field: account.
The amount field: amount.

The results will show the beginning balance, total transactions, and the ending balance for each account and will recalculate the ending balance and any difference between ending and recalculated ending balances.
Sample Exercise: Four Files
Using the Inv Beg Balance, Inv End Balance, Inv Additions, and Inv Subtractions files, compare the (Beginning + Additions + Subtractions) to the ending balances for all accounts. In this case, the user-required responses should be:
The beginning file name: Inv Beg Balance.
The ending balance file name: Inv End Balance.
The addition file name: Inv Additions.
The subtraction file name: Inv Subtractions.
For each file, the user-required response should be:
The account field: account.
The amount field: amount.
The results will show the beginning balance, total transactions, and the ending balance for each account and will recalculate the ending balance and any difference between ending and recalculated ending balances.
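The recalculation itself is simple arithmetic: recalculated ending = beginning + transactions, and difference = ending - recalculated ending. A minimal Python sketch follows. It is not the Balance Start script; the function names, the (account, amount) tuple layout, and the sample figures are all made up for illustration.

```python
from collections import defaultdict
from decimal import Decimal


def sum_by_account(rows):
    """Total the amount for each account."""
    totals = defaultdict(Decimal)
    for account, amount in rows:
        totals[account] += Decimal(amount)
    return totals


def verify_ending_balances(beginning, ending, *transaction_files):
    """Compare (beginning + transactions) with the ending balance per account.

    Pass one transactions file, or an additions file (positive amounts) and a
    subtractions file (negative amounts).
    """
    begin = sum_by_account(beginning)
    end = sum_by_account(ending)
    trans = sum_by_account(row for f in transaction_files for row in f)
    report = {}
    for account in sorted(set(begin) | set(end) | set(trans)):
        recalculated = begin[account] + trans[account]
        report[account] = {
            "beginning": begin[account],
            "transactions": trans[account],
            "ending": end[account],
            "recalculated": recalculated,
            "difference": end[account] - recalculated,
        }
    return report


# Hypothetical figures, not taken from the toolkit data files.
beginning = [("1000", "500.00"), ("2000", "250.00")]
ending = [("1000", "550.00"), ("2000", "300.00")]
transactions = [("1000", "100.00"), ("1000", "-50.00"), ("2000", "25.00")]
additions = [("1000", "100.00"), ("2000", "25.00")]
subtractions = [("1000", "-50.00")]

three_files = verify_ending_balances(beginning, ending, transactions)
four_files = verify_ending_balances(beginning, ending, additions, subtractions)
```

In this made-up data, account 1000 reconciles and account 2000 shows a difference of 25.00, which is the kind of exception the verification is meant to surface.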
Running Total
The script (Running Total) produces a running total of a specified numeric field. The running total can be for the entire file, or it can be reset every time a character field changes.
The script prompts for:
Table to open.
Output file name.
Character field to use for running total – resets to 0.00 when value changes.
Numeric field to use for running total.
Sample Exercise
Using the Inventory file, produce a running total for the Quantity on Hand (Qtyoh) for the entire file. In this case, the user-required responses should be:

Table to open: Inventory.
Output file name: Inv Running Total.
Character field to use for running total: blank.
Numeric field to use for running total: Qtyoh.

Sample Exercise
Using the Inventory file, produce a running total for the Quantity on Hand (Qtyoh) for each location. In this case, the user-required responses should be:
Table to open: Inventory.
Output file name: Inv Running Total by Loc.
Character field to use for running total: Location.
Numeric field to use for running total: Qtyoh.
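The reset-on-change behavior can be sketched in Python. This is an illustration of the technique, not the Running Total script: the function name, the running_total output field, and the sample quantities are assumptions, and the input is assumed to be sorted on the character field when one is supplied.

```python
def running_total(rows, numeric_field, char_field=None):
    """Add a running_total value to each row, resetting when char_field changes."""
    out = []
    total = 0
    prev_key = object()  # sentinel that never equals a real field value
    for row in rows:
        if char_field is not None and row[char_field] != prev_key:
            total = 0  # the key changed, so the total starts again
            prev_key = row[char_field]
        total += row[numeric_field]
        out.append({**row, "running_total": total})
    return out


# Hypothetical quantities, not taken from the Inventory file.
inventory = [
    {"Location": "01", "Qtyoh": 10},
    {"Location": "01", "Qtyoh": 5},
    {"Location": "02", "Qtyoh": 7},
]
whole_file = [r["running_total"] for r in running_total(inventory, "Qtyoh")]
by_location = [r["running_total"] for r in running_total(inventory, "Qtyoh", "Location")]
```

Leaving char_field as None corresponds to answering the character field prompt with a blank.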
Maximum and Minimum Values
The script (Max and Min Values) determines the maximum and minimum of a user-selected numeric field for a user-selected character field, such as the highest and lowest invoice amounts for a given vendor, or the highest and lowest pay amounts for each employee.
The script prompts for:
Table to open.
Character field: maximum and minimum determined for each unique value of this field.
Numeric field: maximum and minimum numeric values determined for this field.
The results are sent to a file name consisting of the character field selected + max min.
Sample Exercise
Using the Inventory file, find the maximum and minimum value of products held at each location. In this case, the user-required responses should be:
Table to open: Inventory.
Character field to use for maximum/minimum: Location.
Numeric field to use for maximum/minimum value: Value.
The results will be found in Location max min.
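The grouping logic can be sketched in Python in a few lines. This is only an illustration, not the Max and Min Values script: the function name max_min_by and the sample values are assumptions.

```python
def max_min_by(rows, char_field, numeric_field):
    """Return {key: (maximum, minimum)} of numeric_field for each char_field value."""
    result = {}
    for row in rows:
        key, value = row[char_field], row[numeric_field]
        if key not in result:
            result[key] = (value, value)  # first value seen is both max and min
        else:
            highest, lowest = result[key]
            result[key] = (max(highest, value), min(lowest, value))
    return result


# Hypothetical values, not taken from the Inventory file.
inventory = [
    {"Location": "01", "Value": 120},
    {"Location": "02", "Value": 40},
    {"Location": "01", "Value": 75},
    {"Location": "02", "Value": 310},
]
location_max_min = max_min_by(inventory, "Location", "Value")
```

Unlike the running total, this approach does not need the input to be sorted, because each key keeps its own pair of values.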
Appendix: ACL Installation Process
The following outlines the steps involved in the installation process for the education version of the ACL software.

Note: ACL does not provide technical support for the software that comes with this book.

Step 1: Insert the ACL CD-ROM supplied with this book and, at the main screen, click ACL 9 Education Edition:
Step 2: Click NEXT:
Step 3: Accept Terms and click NEXT:
Step 4: Enter your user name and company and click NEXT:
Step 5: For the type of installation, click Complete:
Step 6: Accept the default directories:
Step 7: Click Install:
Step 8: Click FINISH:
Double-click the ACL shortcut on your desktop to run ACL, or select it from the Microsoft Windows programs menu, which can be accessed by clicking the Start button. For further information and instructions on running ACL, refer to the documents ACL in Practice.PDF and ACLStart.PDF in the folder C:\ACL Data\Sample Data Files\.
Glossary

ABSn
A special variable created automatically by specific ACL commands, such as Statistics, that contains the absolute value of the field on which that command was last issued. (Note: The “n” shown here and elsewhere in the Glossary stands for an integer, typically 1, which varies according to how the command was issued. Commands issued within a Group command are assigned values from 2 upward.)

ACL Project
A file that contains all ACL table layouts (file definitions, views, reports, workspaces, and scripts). The file containing the data to be analyzed is not part of the Project, but is linked to the ACL Project by way of the table layout. The Project file name uses the extension .acl. See also Table Layout.

AVERAGEn
A special ACL variable created automatically by specific ACL commands, such as the Statistics command, that contains the average value of the field on which that command was last issued.

Benford’s Law
A law that states that, in large sets of data, the initial digits of a numeric field will tend to follow a very predictable pattern, with the initial digit 1 being the most common and digit 9 appearing the least.

CAATTs
Computer assisted audit tools and techniques.

Command Log
An electronic log that records each ACL command you issue, and its results, from the time you open an ACL Project until you close the Project. The information in the log is cumulative. You can view, add comments to, print, or clear the log.

Computed Field
A named algebraic expression that uses calculation results or an ACL command to create an additional field. A computed field exists only in the input file definition and is not actually a part of the data file. It is a virtual field that lets you perform calculations based on information in the data file without affecting or changing the original data. You can treat a computed field like an actual physical field. See Physical Field.

COUNTn
A special ACL variable created automatically by the Statistics and other ACL commands that contains the number of records counted.

Data File
A file in which computer-based data is stored; also called a source file. The data file is not actually stored as part of an ACL Project.

DO
Executes an ACL script or a report file.
ELSE
IF Command
An ACL command that is used in a group of commands to provide an alternative processing method if all previous tests are false. Else can only be used in a group, never on its own. See also Group.
A command that allows you to specify a condition that must be met in order for the command following it to be executed. The IF command applies to all records and initially must be true for the command to continue processing. For example, in the following command, IF verify 1 VERIFY ALL ERRORLIMIT 1000 ACL will execute the Verify command only if the variable “verify 1” is true at the time the IF command is encountered.
Expression A set of operators and values used to perform calculations, specify conditions for a test, or to create values that do not exist directly in the data. An ACL expression can be a combination of data fields or computed fields, operators, constants, functions, and variables.
Field The individual pieces of information that make up a record in a file. Computer files can have many fields for the same record. See also Variables.
Group A series of commands that is processed as a unit in a single read of a data file. You can create a group in a script, and then run the script to execute the commands in the group. Using groups increases processing speed because ACL performs all group commands against each record before reading the next record, Thus, all group commands can be executed in one pass through the data file, as opposed to reading the entire data file for each command.
IF Condition A scope parameter that selects only certain records from the whole file or limits the execution of ACL commands. For example, COUNT IF AMOUNT > 0. The condition does not depend on the initial value, it is evaluated for each record in the data file.
Improper Data Data that is technically valid, but not correct. For example, negative amounts in a numeric field that should contain only positive amounts.
Incomplete Data Data that contains gaps or blanks.
HIGHn
Index
A special ACL variable created automatically by the Statistics command that contains the nth (by default, 5th) highest value of the field on which that command was last issued.
A method of sequencing data using the Index command. The command creates an index file that contains pointers which allow ACL to read data in the specified order, even though the records in
150
Glossary
the original input file are not sorted. Creating an index file is called “indexing.”
Interactive Script A script that prompts the user for information while the script is running. See also Script.
Macro Substitution Macro substitution is the process of substituting the contents of a variable into a command in place of field names or constants. This is particularly useful when variables cannot be directly used, such as parameter keywords, fields, and file names. Variables created by interactive scripts with the ACCEPT, ASSIGN, or DIALOG commands can be used for macro substitution. The use of macro substitution is identified by a variable enclosed in percent signs (%).
Invalid Data Data in a field that does not match the field type assigned to that field.
Join A command that combines two files based on a common key field. See also Primary File and Secondary File.
Key Field A field used to identify records. Examples include the field used in a Join, the field used in a Summarize, and the field on which a file is sorted. With some commands and functions, more than one field can form a composite key.
MAXn A special ACL variable created automatically by the Statistics command that contains the highest value of the field on which that command was last issued.
MINn A special ACL variable created automatically by the Statistics command that contains the lowest value of the field on which that command was last issued.
Nested Group A group of commands embedded within another group of commands.
LOWn A special ACL variable created automatically by the Statistics command that contains the nth (by default, 5th) lowest value of the field on which that command was last issued.
Operators Mathematical and logical symbols used in building expressions. Examples are: NOT, AND, OR, +, (), and <.
Output File A file created by ACL commands. When ACL produces a data file as output it automatically creates the associated table layout. You can use an output file as an input file for further processing and analysis in ACL.
Scope Parameter A statement that can be included in an ACL command to limit the number of records to be processed, or to limit the execution of a command or script. Examples are IF, WHILE, NEXT, and FIRST. Scope parameters can be used separately or in combination with each other.
Physical Field A field whose data exists directly in the data file, as opposed to a virtual or computed field. See also Computed Field.
Primary File The file you are currently working with, which is displayed in the view. You can have only one primary file open at a time. A primary and a secondary file are used in Join and Merge operations. See also Secondary File.
Script A series of ACL commands that are named and stored in a Project for future use. You can design a script to execute repeatedly and automatically, or to prompt the user for information.
Secondary File The second file used by ACL when two files are required. You can have one secondary file. A primary and a secondary file are used in Join and Merge operations. See also Primary File.
RANGEn A special ACL variable created automatically by specific ACL commands, such as the Statistics command, that contains the difference between the MAXn and MINn variables. This gives the range between the highest and lowest value.
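The relationship between the Statistics variables defined in this glossary (MAXn, MINn, HIGHn, LOWn, and RANGEn) can be shown with a small Python analogy. This is not how ACL computes them internally; the function name field_statistics and the sample amounts are hypothetical, and the sketch assumes that duplicate values count as separate positions when finding the nth highest or lowest value.

```python
# Python analogy of the variables created by the Statistics command.
# Function name, sample data, and tie handling are assumptions.


def field_statistics(values, n=5):
    """Return max, min, range, and the nth highest and nth lowest values."""
    ordered = sorted(values)
    if len(ordered) < n:
        raise ValueError("need at least n values to find the nth highest/lowest")
    return {
        "MAX": ordered[-1],
        "MIN": ordered[0],
        "RANGE": ordered[-1] - ordered[0],  # difference between MAX and MIN
        "HIGH": ordered[-n],                # nth highest value
        "LOW": ordered[n - 1],              # nth lowest value
    }


amounts = [40, 10, 70, 20, 90, 30, 60]
stats = field_statistics(amounts)
```

With the default n of 5, HIGH and LOW point at the fifth value counted from the top and from the bottom of the sorted field, mirroring the default stated in the HIGHn and LOWn entries.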
Table (Input File) An inclusive term which refers to the data file and the Table Layout taken together.
Table Layout Describes the structure, contents, and layout of a data file and links the data file to the Project. It includes information such as field names, field definitions, and field types. ACL automatically provides some of this information for you, such as the file type, character type, media type, the number of records, and the record length. A table layout is only defined or created once for a given data file.
Record A unit of related items of information that make up a file. Each record contains individual pieces of information called fields. See also Field.
Variables Variables are used to retain and carry forward information from one command to another. A variable retains its value until it is changed, deleted, or until you quit ACL. The only exception is a variable whose name starts with an underscore (_). In this case, the variable is stored as part of the ACL Project and is not automatically deleted when you quit. Some commands, such as Statistics, automatically create variables when they are executed.
Workspace A file that stores field definitions which are portable across a number of input file definitions. Usually they are computed fields that refer to the current environment, but they can also be unique field names and descriptions.
WRITEn A special ACL variable created automatically by specific commands, such as Sequence, Gaps, or Duplicates, that contains the number of records written, up to the maximum number set in your preferences.
While A scope parameter that terminates the processing of a file as soon as the associated test fails. This is useful for limiting the scope of commands that otherwise process the whole file.
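The difference between the IF and WHILE scope parameters defined in this glossary can be sketched in Python (an analogy, not ACL syntax; the sample amounts and the function names select_if and select_while are hypothetical). IF tests every record, whereas WHILE stops reading at the first record that fails the test.

```python
from itertools import takewhile

# Python analogy of IF versus WHILE scope (not ACL syntax).
# Sample data and function names are hypothetical.

amounts = [500, 320, 150, 90, 400, 20]  # values in file order


def over_100(amount):
    """The test applied to each record."""
    return amount > 100


def select_if(rows, test):
    """IF-style scope: evaluate the test for every record in the file."""
    return [row for row in rows if test(row)]


def select_while(rows, test):
    """WHILE-style scope: stop processing as soon as the test fails."""
    return list(takewhile(test, rows))


if_result = select_if(amounts, over_100)
while_result = select_while(amounts, over_100)
```

The value 400 passes the test but appears after the first failing record, so it is picked up by the IF-style selection and never reached by the WHILE-style one. This is why WHILE is most useful when the file is ordered on the tested field.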
Index
A Accept command, 117–118, 121 ACL Projects. See Projects, ACL Applications menu, 135–137 Apply filter option. See Filters Association of Certified Fraud Examiners (ACFE), study by, 2 Audit software benefits of, 6 techniques, 2, 4, 6
B Benford Analysis tests Benford Analysis, running, 89–90 Benford Custom Analysis, running, 90–91 results, reviewing and analyzing, 96–98 scripts described, 93–94 Benford command, 88, 91, 94 Benford script, 91 Benford Analysis script, 92–93 Benford Custom Analysis script, 95–96 Benford Custom Dialog script, 97 Benford Dialog script, 93 Benford’s Law explained, 88 formula, 88 graph, 89 spikes, significance of, 89 understanding, 88 Z-Statistic, 89–91, 93–95, 97 Blanks and type mismatches, reasons for, 22
C CAATTS. See Computer-assisted audit tools and techniques Carriage returns, 24 Case studies dormant accounts, 83–85 duplicate payments, 42–43 general ledger accounts, 29, 82 healthcare audit, 85 help desk calls, 37 long-distance calls, 37, 55 signing authority, contracts, 98 Complete script, 14, 24–29 Complete Dialog script, 20–21, 29 Computed fields, used in cross-tabulation, 31, 33–36 Computer-assisted audit tools and techniques (CAATTs), 3, 6 Continuous auditing, 5 Conventions, script code, xi, 91 Copyright information, displayed, 9, 18 Count command, used within group, 25, 105 Cross Tabs script, 14, 21, 32–33 Cross Tabs Dialog script, 34 Cross-tabulation benefits of, 32–33 challenges of creating, 34 data file requirements, 30 results, reviewing and analyzing, 36–37 running, 30–31 x-axis fields, 31, 34 y-axis fields, 31 Customizing scripts example, 12–13 options, 11–14
D Data analysis, generic approach Data analysis software benefits of, 6 techniques, 2, 56 Data Profile script, 14, 21, 56–73 Data Profile tests exact multiples of, 57, 60, 65, 71 frequently used values, 57, 60, 66, 73 items with most exact multiples, 57, 67–68 least/most used items, 57, 68–69, 72 results, reviewing and analyzing, 62–73 round amounts, 58, 60, 65, 72 running, 56–57 statistics, 59, 62 stratify, 60, 63–65 types, 59–60 Define Field command, 46, 106–107 Delete command, 103 Dialog boxes, 8, 11, 118–121 Digit frequencies, identifying unusual, 88 Digital analysis, 88–89 Display command, 102 Drilling down, with filters, 71–72 Dup Dialog script, 21, 44–45 Dup Multiple Keys script, 21, 40–43 Dup Multiple Keys1 script, 21, 46–47
Duplicates command, used in Duplicates scripts, 38, 40, 49, 51 Duplicates tests duplicate transactions, legitimate, 38 results, reviewing and analyzing, 47–51 running, 38–40
E Error Trapping, 122–124 Exact Character Comparisons, setting, 17, 20 Exact multiples script, 60, 65–68 Exactsum script, 14, 21, 70–71
F Fields, selecting, 10 Files, selecting, 9 Filters about, 10–11 for Benford Analysis, 96–97 for Data Profile tests, 62, 71–73 ratio analysis tests, 78, 80–82 SET FILTER command, 103 specifying in tests, 11 Filter Selection script code, 11 Fraud costs of, 2 reasons for, 2–3 symptoms, 5, 52 Fraud application, creating your own, 13–14 Fraud detection and data analysis software, 2–3, 6
Fraud Menu selecting tests from, 6, 8–9, 16–18 working without, 20–21 Fraud Toolkit application advantages of launching through Start script, 17–18 described, ix launching, 8–10, 16–18 Fraud Toolkit for ACL components, ix how to use, ix Fraud Menu script, 18–19 Freqchar script, 21, 60, 68–69 Freqnum script, 21, 60, 66–69 Frequently used values, 60, 66–67, 73
G Gaps command, 21, 52–55 Gaps Dialog script, 21, 55 Gaps script, 54–55 Gaps test results, reviewing and analyzing, 54–55 running, 52–53 General ledger accounts, case study, 29 Group command, 25, 125 conditional, 126–127 nested, 128 simple, 126
H HEX( ) function, 45–46
I If statements IF command, 43–44, 104, 120, 122 IF parameter, 43–44 Indexes and Benford’s Analysis tests, 90, 93–95 and Data Profile tests, 60 and exact multiples, 67–68 and freqchar, 69 and frequently used values, 66 and Ratio Analysis tests, 78, 80–82 Integrity test, 14, 22–28 Interactive scripts, 100, 117–118, 120–121, 125 Interval sampling, 3 Items with most exact multiples, 60, 67–68
J Join command and cross-tabulation, 37 and custom Benford analysis, 95 and Duplicates test, 45
K KeyChange script, 45–46 KPMG surveys, 2
L Least/most used items, 60, 68–69 LENGTH( ) function, 46 Line feed character, 24 Log files, alternate, creating and viewing, 17–19 Loop, 132–135
M Macro substitution, 44–45, 120
O Occupational fraud and abuse, statistics on, 2 Offset() function, 132–135 Open command, 103
P Projects, ACL, 6–8 Proportions, actual and expected, compared, 65, 72–73, 89
R Ratio Analysis benefits of, 74 and cross-tabulated files, 37 results, reviewing and analyzing, 81–86 running, 74–77 scripts described, 78–80 types, main, 79–80 Ratio Analysis script, 76 Ratio Analysis1 script, 76–80 Ratio Analysis2 script, 82–84 Ratios, commonly used, 79–80 Ratio1 Dialog1 script, 80 Ratio1 Dialog2 script, 81 Ratio2 Dialog1 script, 84–85 Ratio2 Dialog2 script, 85 REPLACE( ) function, 19 Round amounts, Data Profile test, 58, 60, 65, 67 Round script, 21, 63–66
S Same same different, 38 Script code conventions, xi Scripts benefits of, 6–7 commenting, 113 copying into an ACL Project, 7 creating, 109–113 customizing, 11–12 defined, 6, 109 editing, 113 exiting, 20 interactive, 117, 120–121 interrupting, consequences of, 8, 20 running, 114 sharing, 115–116 Set commands, 102–103 Spikes, significance in digital analysis, 89 Start script activating, 16–18 advantages of running, 8–9, 20
Stat script, 21, 60–61 Statistics, 59, 62 Strat1 script, 21, 62 Strat2 script, 21, 62–63 Stratify, 57–58, 60, 63–64 STRING( ) function, 54 Subscripts advantages of using, 41 called by other scripts, 104, 122 role of, 40–41 special uses for, 122–124 System requirements for Fraud Toolkit application, ix
T Table Layouts copying into an ACL Project, 7–8 and workspaces, 36 Temporary variables, deleting, 8, 17, 20, 26, 36, 46, 54, 61, 80, 96 Transactions duplicate, 38 missing items, testing for, 52–54 review of, manual versus electronic, 3–4
V Variables ACL (system) defined, 104–106 deleting temporary, 8, 17, 20, 103 user defined, 104–106 viewing, 102
Verify command, 22, 24–25, 28 VERIFY( ) function, 28
W Workspaces activating, 108–109 creating, 107–108 sharing, 109
X X-axis fields, 31, 34
Y Y-axis fields, 31
Z Z-statistic, 89–91, 93–95, 97
CUSTOMER NOTE: IF THIS BOOK IS ACCOMPANIED BY SOFTWARE, PLEASE READ THE FOLLOWING BEFORE OPENING THE PACKAGE. This software contains files to help you utilize the models described in the accompanying book. By opening the package, you are agreeing to be bound by the following agreement: This software product is protected by copyright and all rights are reserved by the author, John Wiley & Sons, Inc., or their licensors. You are licensed to use this software on a single computer. Copying the software to another medium or format for use on a single computer does not violate the U.S. Copyright Law. Copying the software for any other purpose is a violation of the U.S. Copyright Law. This software product is sold as is without warranty of any kind, either express or implied, including but not limited to the implied warranty of merchantability and fitness for a particular purpose. Neither Wiley nor its dealers or distributors assumes any liability for any alleged or actual damages arising from the use of or the inability to use this software. (Some states do not allow the exclusion of implied warranties, so the exclusion may not apply to you.)
PRAISE FOR FRAUD ANALYSIS TECHNIQUES USING ACL
“When people ask me what they can do to better utilize ACL, I tell them, ‘Take an instructor lead course, participate in the ACL Forum, and study (not read, study) David Coderre’s Fraud Analysis Techniques Using ACL.’ I studied this book, and would not be where I am today without it. Even without the anti-fraud material, the book is worth the investment as a tool to learning ACL!”
—Porter Broyles, President and founder of the Texas ACL User Group, Keynote Speaker at ACL’s 2009 San Francisco Conference, Official ACL Super User
“For individuals interested in learning about fraud analysis techniques or the art of ACL scripting, this book is a must-read. For those individuals interested in learning both, this book is a treasure.” —Jim Hess, Principal, Hess Group, LLC
Your very own ACL Fraud Toolkit—at your fingertips
Fraud Analysis Techniques Using ACL offers auditors and investigators:
• Authoritative guidance from David Coderre, renowned expert on the use of computer-assisted audit tools and techniques in fraud detection
• A CD-ROM containing an educational version of ACL from the world leader in fraud detection software
• An accompanying CD-ROM containing a thorough Fraud Toolkit with two sets of customizable scripts to serve your specific audit needs
• Case studies and sample data files that you can use to try out the tests
• Step-by-step instructions on how to run the tests
• A self-study course on ACL script development with exercises, data files, and suggested answers
Filled with screen shots, flow charts, example data files, and descriptive commentary highlighting and explaining each step, as well as case studies offering real-world examples of how the scripts can be used to search for fraud, Fraud Analysis Techniques Using ACL is the only toolkit you will need to harness the power of ACL to spot fraud.
DAVID CODERRE has over twenty years of experience in internal audit, management consulting, management information systems, system development, and application implementation areas. He is currently President of CAATS (Computer-Assisted Analysis Techniques and Solutions). He is the author of three highly regarded books on using data analysis for audit and fraud detection and the Institute of Internal Auditors’ publication on Continuous Auditing (GTAG #3).