Computer Telephony Integration Second Edition
OTHER AUERBACH PUBLICATIONS
The ABCs of IP Addressing Gilbert Held ISBN: 0-8493-1144-6
The ABCs of TCP/IP Gilbert Held ISBN: 0-8493-1463-1
Information Security Management Handbook, 4th Edition, Volume 4 Harold F. Tipton and Micki Krause, Editors ISBN: 0-8493-1518-2
Building an Information Security Awareness Program Mark B. Desman ISBN: 0-8493-0116-5
Information Security Policies, Procedures, and Standards: Guidelines for Effective Information Security Management Thomas R. Peltier ISBN: 0-8493-1137-3
Building a Wireless Office Gilbert Held ISBN: 0-8493-1271-X
Information Security Risk Analysis Thomas R. Peltier ISBN: 0-8493-0880-1
The Complete Book of Middleware Judith Myerson ISBN: 0-8493-1272-8
A Practical Guide to Security Engineering and Information Assurance Debra Herrmann ISBN: 0-8493-1163-2
Computer Telephony Integration, 2nd Edition William A. Yarberry, Jr. ISBN: 0-8493-1438-0
Cyber Crime Investigator’s Field Guide Bruce Middleton ISBN: 0-8493-1192-6
Cyber Forensics: A Field Manual for Collecting, Examining, and Preserving Evidence of Computer Crimes Albert J. Marcella and Robert S. Greenfield, Editors ISBN: 0-8493-0955-7
Global Information Warfare: How Businesses, Governments, and Others Achieve Objectives and Attain Competitive Advantages Andy Jones, Gerald L. Kovacich, and Perry G. Luzwick ISBN: 0-8493-1114-4
Information Security Architecture Jan Killmeyer Tudor ISBN: 0-8493-9988-2
Information Security Management Handbook, 4th Edition, Volume 1 Harold F. Tipton and Micki Krause, Editors ISBN: 0-8493-9829-0
The Privacy Papers: Managing Technology and Consumers, Employee, and Legislative Action Rebecca Herold ISBN: 0-8493-1248-5
Secure Internet Practices: Best Practices for Securing Systems in the Internet and e-Business Age Patrick McBride, Jody Patilla, Craig Robinson, Peter Thermos, and Edward P. Moser ISBN: 0-8493-1239-6
Securing and Controlling Cisco Routers Peter T. Davis ISBN: 0-8493-1290-6
Securing E-Business Applications and Communications Jonathan S. Held and John R. Bowers ISBN: 0-8493-0963-8
Securing Windows NT/2000: From Policies to Firewalls Michael A. Simonyi ISBN: 0-8493-1261-2
Six Sigma Software Development Christine B. Tayntor ISBN: 0-8493-1193-4
Information Security Management Handbook, 4th Edition, Volume 2 Harold F. Tipton and Micki Krause, Editors ISBN: 0-8493-0800-3
A Technical Guide to IPSec Virtual Private Networks James S. Tiller ISBN: 0-8493-0876-3
Information Security Management Handbook, 4th Edition, Volume 3 Harold F. Tipton and Micki Krause, Editors ISBN: 0-8493-1127-6
Telecommunications Cost Management Brian DiMarsico, Thomas Phelps IV, and William A. Yarberry, Jr. ISBN: 0-8493-1101-2
AUERBACH PUBLICATIONS
www.auerbach-publications.com
To Order Call: 1-800-272-7737 • Fax: 1-800-374-3401
E-mail: [email protected]
Computer Telephony Integration Second Edition William A. Yarberry, Jr.
AUERBACH PUBLICATIONS A CRC Press Company Boca Raton London New York Washington, D.C.
AU1438_frame_fm Page iv Tuesday, November 5, 2002 12:07 PM
Library of Congress Cataloging-in-Publication Data
Yarberry, William.
Computer telephony integration / William A. Yarberry, Jr.—2nd ed.
p. cm.
Includes index.
ISBN 0-8493-1438-0 (alk. paper)
1. Internet telephony. 2. Digital telephone systems. I. Title.
TK5105.8865.Y37 2002
621.382′12—dc21 2002034282 CIP
This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the authors and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying.

Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.
Visit the Auerbach Publications Web site at www.auerbach-publications.com
© 2003 by CRC Press LLC
Auerbach is an imprint of CRC Press LLC
No claim to original U.S. Government works
International Standard Book Number 0-8493-1438-0
Library of Congress Card Number 2002034282
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper
Dedication To Carol, Will, Libby, and my parents. Thank you for your love and support.
Contents

Chapter 1 Telephony Basics
  Introduction
  History of Telecommunications
  PSTN
  POTS
  Local Loop
  Types of Carriers in the United States
  Carrier Structure and Numbering Scheme
  International Structure
  Digital Communications Concepts
  Common Circuit Connections
  ISDN
  QSIG
  Carrier Systems
  The Telephony Process
  Equipment
  PBX
  Feature Sets
  Key Systems
  Centrex
  Summary

Chapter 2 IP Telephony
  Introduction
  Fundamental Change in Communications
  What Is IP Telephony?
  IP Telephony Momentum
  Trends
  IP Telephony Architecture
  High-Level Configuration and Variance from TDM Architecture
  Fundamental IP Telephony Process
  End-User Devices
  Introduction
  Screen Phones
  IP Softphones
  Softphones for PDAs
  Feature Sets
  Communications Servers (Nobody Makes PBXs Anymore)
  Gateways and Gatekeepers
  Gateways
  Introduction
  Economics
  IP Links Already in Place
  International Communications
  Compression and CODECs
  Transmission Technologies
  Voice-over-IP (VoIP)
  Voice-over-ATM (VoATM)
  Voice-over-Frame Relay
  Cost Factors
  Quality of Service and Delay Sensitivity
  Equipment and Configuration Considerations
  Voice-over-MPLS
  Roadblocks to IP Telephony
  Gatekeepers
  Signaling Protocols
  Example Applications of IP Telephony
  Summary

Chapter 3 CTI Concepts and Applications
  Introduction
  General Functions of CTI
  Basic Architecture
  APIs and CT Standards
  TAPI
  TSAPI
  JTAPI
  CSTA
  Linux Telephony API
  ActiveX
  Using Component Software
  Distributed versus Desktop CT
  SMS Controls
  Interoperability Standards
  MVIP
  H.100 and H.110
  S.100, S.200, and S.300
  H.323
  SIP
  LDAP
  Develop versus Buy
  Application Generators and CT Architecture
  Middleware Example
  An Example Middleware Product: Concentric Solutions
  Other Examples of CTI Applications
  Travel Plan Example
  Banking Example
  How to Use the Demo
  Another Help Desk Package
  Summary

Chapter 4 Interactive Voice Response
  Why IVR?
  IVR Feature Sets
  Representative Systems
  Hardware
  Development Software Example: Script Builder, Voice@Work, and @Work Studio
  Applications of IVR
  General Benefits
  Specific Applications
  Outbound Messaging and Audiotext
  Call Center Applications
  Fax-on-Demand
  Applications Development
  Omnivox Omniview
  Pronexus VBVoice
  Speech Recognition
  Beyond Touch-Tone
  Security
  Voice Dialer — Parlance
  Name Dialer — Avaya
  Text-to-Speech (TTS)
  VoiceXML
  Some Alternative Approaches
  IVR Performance and Tuning
  Sizing and Capacity Planning
  Sizing the IVR System
  IVR Design
  Good Practices
  Reporting and Remote Monitoring
  Summary

Chapter 5 Unified Messaging
  Introduction
  Benefits of UM
  Basic Architecture
  Varieties of UM and Design Considerations
  VPIM
  A UM Package Checklist
  Internet Call Waiting
  Summary

Chapter 6 Wireless Technologies
  Introduction
  Wireless Applications
  Extensions Off the PBX
  Cordless PBXs
  Wireless Handset Features
  Wireless Headsets
  Bluetooth Technology
  Some Technologies Relevant to Bluetooth
  Summary

Chapter 7 Contact Center Technology and Management
  Introduction
  Contact Center Management and Standards for Agent Performance
  Workforce Management and Forecasting
  CRM Analysis and Data Mining
  Good IVR Design
  Agent Recording and Monitoring
  Multi-Site Design and Technical Architecture
  Integrated Features
  Web Integration and the Multimedia Call Center
  Example Internet Contact Center
  Contact Center Physical Design
  Predictive Dialing Systems
  Contact Center Trends
  Summary

Chapter 8 Telemanagement and Outsourcing
  Introduction
  The Ideal Case
  Caveat Emptor: The Downside
  Negotiating the Agreement
  SLAs
  Telemanagement Example: QuantumShift
  Call Center Outsourcing
  Summary

Chapter 9 Telecom Cost Management and Call Accounting
  Introduction
  How to Reduce Telecom Expenses (The “Cliff Notes” Version)
  Getting Started
  Develop a Cost Management Vision
  Get a First Cut on the Expenses
  Auditing Telecom Expenses
  Technical and Structural Changes
  On-Net/Off-Net
  IP Telephony and Voice-over-IP
  Wide Area Networking
  Negotiate Favorable Terms
  Consider Outsourcing
  Negotiating Carrier Rates and Services
  Getting Started: Collecting Data on the Current Environment
  Getting the Best Deal: A Negotiating Checklist
  A Comparison Spreadsheet
  Outsourced Services
  Monitoring Carrier Service Levels
  Example of Carrier Service Level Specifications
  Maintaining Optimum Discounts in a Decentralized Organization
  Service Levels and Organizational Requirements
  Call Accounting and Telephony Management Systems
  Call Accounting
  Call Detail Recording Systems
  Call Buffer Box
  Call Accounting System
  Implement a Directory
  Cost Allocation
  Basic Calling Accounting Reporting
  Right to Privacy and Telephony Policy
  Cabling and Wiring Management System
  Elements of the Cable Infrastructure
  Types of Cabling
  Getting the Cabling Information into the Database
  Asset Inventory and Management
  Summary

Chapter 10 Preparing the Request for Proposal
  Introduction
  Request for Proposal versus Request for Quotation
  RFP Preparation
  Educating the Selection Committee
  Organizational Participation
  Consulting Expertise
  Format of the RFP
  The “Rehab” Option
  Evaluation of Responses
  Quantitative Factors
  Qualitative Factors
  Financial Analysis
  Summary

Chapter 11 Telephony Security
  Introduction
  Toll Fraud
  Maintenance Port Protection
  Common-Sense Calling Restrictions
  Toll-Fraud Insurance
  Tight Controls over Tandem Trunk Calling
  Forwarding of Extensions to Dial Tone
  Operators, Employees, and Social Engineering Techniques
  Third-Party Charges
  Call Accounting Reports
  Calling Card Theft
  Fax-on-Demand Server
  Internal Billing System
  Hardware/Software Monitoring and Toll-Restricting Tools
  Business Loss Due to Disclosure of Confidential Information
  Malicious Pranks
  Wireless Security
  Background
  The ISO Stack
  Wireless Risks
  Defenses
  Awareness and Simple Procedures
  Technical Solutions
  A Hole in the Fabric of Wireless Security
  Traditional Security Methods Still Work
  Auditing Wireless Security
  Using Security Tools to Offer More Services
  Summary

Chapter 12 Implementing Telephony Systems
  Introduction
  The Project Team
  The User Advisory Group/Implementation Committee
  Survey of the Current Environment
  Nonstop Applications
  Station Reviews
  End-User Training
  Build the Dial Plan, Class-of-Service, and Routing Tables
  Equipment Readiness and Rollout
  Software Installation for the Switch
  Adjunct Processing
  Set Up Help Desk
  Perform a Preparedness Review
  Detailed Cutover Plan
  Backout Plan
  Summary

Chapter 13 Trends and Directions

Appendixes
  Appendix A Web Sites of Interest
  Appendix B Recommended Reading
  Appendix C CTI Success Stories
  Appendix D Telecom Glossary
  Appendix E Sample Service Level Agreement
  Appendix F Sample Request for Proposal

Index
Preface

The author is gratified by the response to the first edition of Computer Telephony Integration. Moore’s law, in contrast to the U.S. stock market, has held up admirably since the first edition was published in late 1999. Faster chips and (at last) the serious proliferation of IP telephony have combined to offer a cornucopia of features and applications. Organizations from SOHOs up to the Fortune 100 now have the flexibility to deploy telephony solutions wherever they make sense: in the PC, the server, and the IP telephone itself. Wireless applications, combining telephony and data, have proliferated wildly and are poised for even more growth when wireless broadband (third generation) achieves critical mass. Call centers, the major users of CTI, are increasingly designing their own call flows and applications as the tools become easier to use. Finally, TAPI and TAPI tool kits, backed by Microsoft, continue to increase in mind share/market share; it is a lot easier in 2002 to telephony-enable an application than it was a few years earlier.

Based on responses from readers and changes in the telephony industry, the second edition has been significantly revised. Chapters on IP telephony, call centers, cost management, and security have been expanded. For the benefit of those of us entangled and bespattered by the proliferation of TLAs (three-letter acronyms), a glossary has been added.

Reader corrections, comments, praise, or chastisements are always welcome. Please send e-mails to the author at [email protected].
Acknowledgments

Many individuals have contributed both directly and indirectly to this book. My editor, Christian Kirkpatrick, has once again provided encouragement, advice, and gentle prodding to keep the chapters flowing. Special thanks to Ike Zach of Idrontas Corporation, Frank Marino of Southwest Telecom Consulting, Kevin Avery of Spanlink Communications, Tim Lootens of Genesys Labs, and Lee Miller of the Southwest Technology Group. Illustrative materials were contributed by Susan Cohen of Nice Systems, Joane Lowry from Quintum Technologies, and Heather Cole from Witness Systems. Juanita Mendez from HP Compaq provided valuable insights into the CTI applications used in large international call centers. And finally, the support of my wife and children during my “voluntary incarceration” while writing the book proved invaluable.

WILLIAM A. YARBERRY, JR.
Houston, Texas
About the Author

William A. Yarberry, Jr., CPA, CISA, is vice president, cost management solutions, with Idrontas Corporation based in Houston, Texas. He was previously a senior manager with PricewaterhouseCoopers, responsible for telecom and network services in the Southwest region. William has over 24 years of experience in IT application development, internal audit management, outsourcing negotiations and administration, and telecommunications management. His professional experience is in enterprise communications, with an emphasis on cost containment and server-based voice technologies, such as CTI, IVR, and Voice-over-IP. He is the co-author of Telecommunications Cost Management, as well as the author of the first edition of Computer Telephony Integration.

Prior to joining PricewaterhouseCoopers, William was director of telephony services for Enron Corporation. He was responsible for the operations, planning, and architectural design for PBXs and related voice communications systems supporting more than 7000 employees.

Now living in Houston, Texas, William is a University of Tennessee Phi Beta Kappa graduate in chemistry; he earned an MBA at the University of Memphis. He enjoys reading, tennis, jogging, occasional scuba diving, and spending time with his family.
Chapter 1
Telephony Basics

Telephony, and Telephony’s laws lay hid in night
God said, Let Alexander be! And all was light.
An adaptation from Alexander Pope
INTRODUCTION

CTI (computer telephony integration) applications must mesh with the existing, legacy telephony environment. Organizations typically amortize their PBX (private branch exchange) investment over five to ten years, so new applications will need to fit the older telephony standards for years to come. The sections below outline the fundamental telephony components upon which CTI protocols and applications reside.

Although IP (Internet Protocol) telephony is replacing the existing circuit-switched architecture, it is important to understand the history and fundamental concepts of “traditional” telephony (sometimes called TDM, time division multiplexing). In the next two chapters, we will look at IP telephony in depth. Here we look at basic concepts that must be supported: the same business functions must be accomplished whether done by IP telephony or legacy TDM technology. Also keep in mind that the enormous worldwide investment in circuit-switched/TDM technology means that we will be living with mixed architectures for years to come.

HISTORY OF TELECOMMUNICATIONS

Alexander Graham Bell, a Scottish-born American inventor, filed his patent application for the telephone on Valentine’s Day, 1876, just two hours before a similar filing by Elisha Gray of Chicago. By 1878, the first commercial telephone exchange was brought into service in New Haven, Connecticut. The earliest telephones were sold in pairs; the purchaser was supposed to run wires directly between the two locations that needed to communicate. It did not take long for the nascent telephone company to realize that private wires strung over trees and buildings were going to be impractical if deployed on a large scale. In fact, to have all points on a network talk to all other points, N * (N – 1)/2 connections are required, where N = the number of points. Hence the development of the “switch,” which allows circuits to be opened and closed from a central point rather than having “nailed up” point-to-point connections everywhere.

The first CO (central office) switches were operated by young men (who soon proved too rude and were replaced by young ladies) connecting circuits physically with wires. Later, the automated switch allowed dialing without operator intervention. Eventually, the technology found its way into private businesses with large offices and the PBX was established.

The PBX, at its most basic, offers line consolidation and reduction in resources required from the CO. Generally, one can assume that about one employee in ten is using the telephone (off hook) at any given time. Thus, if an office building has 2000 employees, only 200 circuits from the CO are needed at one time. There are circumstances that require “nonblocking” circuits (e.g., emergency lines, the CEO’s telephone, key employees’ telephones). Of course, this model breaks down under unusual circumstances, such as a disaster (all employees trying to call relatives and friends at the same time).

PSTN

The public switched telephone network (PSTN) is, in some ways, the technical equivalent of the Great Wall of China. In less than one century, the entire Earth has been linked by gossamer strands of copper, fiber, satellite links, and cellular transmissions. While half of the world has never used a telephone, the other half has, a remarkable achievement in a world still plagued by poverty and lack of infrastructure. Hence, our study of telephony starts with the public network; interconnection between premises equipment and the PSTN is essential.

POTS

POTS (plain old telephone service) is the traditional telephony we have all known since childhood.
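Worked example. The full-mesh formula and the one-in-ten trunk rule from the history section above can be checked with a few lines of Python. This is only a sketch: the function names full_mesh_links and trunks_needed are invented for illustration, and the one-in-ten figure is the rule of thumb quoted in the text, not an engineered traffic model.

```python
def full_mesh_links(points: int) -> int:
    """Links needed so that every point connects directly to every other point."""
    if points < 0:
        raise ValueError("points must be non-negative")
    return points * (points - 1) // 2  # N * (N - 1) / 2


def trunks_needed(employees: int, one_in: int = 10) -> int:
    """Rule-of-thumb trunk count: one circuit per `one_in` employees, rounded up."""
    if employees < 0 or one_in <= 0:
        raise ValueError("employees must be >= 0 and one_in must be > 0")
    return -(-employees // one_in)  # ceiling division without floats


if __name__ == "__main__":
    for n in (2, 10, 100, 2000):
        print(f"{n:>5} points -> {full_mesh_links(n):>9,} direct links")
    print("Trunks for 2000 employees:", trunks_needed(2000))
```

Two telephones need a single wire pair, but 2000 points wired directly to one another would need almost two million links, which is why a central switch with a few hundred shared trunks is the practical design.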
From the old black rotary phones that, remarkably, still work to the coolest home sets, the functionality is driven by the large switches that run the network. Standard devices that attach to the PSTN include the rotary telephone (500 type), the Touch-Tone phone (2500 type), and the modem. Some typical features supported by POTS include:

• 911 emergency services. For landlines, the caller’s location can be determined; hence, those calling in and unable to speak (due to an injury, etc.) can still be located. Location information for wireless phones is shown only in those areas that have implemented E911 (enhanced 911).
• Caller ID. Shows the number/name of the calling party. Note that caller ID is an “in-band” signal (meaning that it is sent via the same channel as the voice traffic). A variant, ANI (automatic number identification), is sent on a separate channel in larger pipes called PRIs (discussed later in this chapter in the ISDN section).
• Call waiting. Notifies a subscriber who is on a call that another party is trying to reach him or her. The subscriber can toggle between both calls.
• Speed dialing. Frequently dialed numbers can be programmed so that only one or two digits are required for dialing.
• Call forwarding. Calls dialed as one number are rerouted to another number. Typically, this can be changed dynamically so that a person who is traveling can be easily reached.
• Special ringing. Some telephone companies offer a second number that will terminate at the same place as the original number but will have a distinctive ring. For example, Sprint local service has a “SignalRing” service that enables a distinctive two-ring sound. If your telephone number is 281-358-1234, then another number could be assigned, 281-358-5678, that, when dialed, would ring in a distinctive way. You could give your boss or most important customer the SignalRing number, and all others your primary number. You could then tell if someone important is calling by the pattern of the ringing.
• Automatic recall. Any missed call can be returned with a special code. This is somewhat less necessary now, because many phones equipped with a number display allow the calling number to be redialed. One problem with this approach is that most mid- to large-sized organizations will send out a caller ID that equals a main number or even a “bill to” number. It may not be a dialable number. Organizations may elect to send the calling number instead of a main number; to do so, they must merely change a PBX parameter.
• Three-way dialing. A third party can be added to a call. This function can also be achieved using a second line on a phone with two lines and a conference button. However, the service from the telephone company provides a more balanced volume, so that one party is not significantly louder than another.
• Call screening. Using a predefined list, calls can be accepted, rejected, or forwarded.
• Customer-originated trace. After a subscriber receives an annoying or harassing call, a code can be keyed on the telephone to notify local police.
• Voice mail. An advantage of carrier-based voice mail versus the use of a residential answering machine is that callers will never get a busy signal; calls are either answered or they go to voice mail.
There are other features that may or may not be offered by the local carrier. These custom features are enabled by the SS7 (Signaling System 7) network, which will be discussed later in the book. In an increasingly complex information economy, POTS continues to have appeal. It has an extremely high uptime; derives power from the local carrier; is simple to use; and, as one hotel company advertised a few years back, it has the advantage of “no surprises.”

Local Loop

Supporting POTS and other telephony/data services, the local loop serves as the capillaries of the communications bloodstream. A star topology is used (see Exhibit 1) to create a hierarchical system. Many schemes have been devised to replace the copper-based (twisted pair) local loop. Fixed wireless, pure fiber, voice over cable, and even voice over power lines have been proposed. However, the daunting financial investment required and technological uncertainty have combined to slow this needed infrastructure upgrade. Futurists have noted that even cable modem (generally faster than DSL, digital subscriber line) does not have the bandwidth ultimately required to pipe voice, data, and high-definition TV shows into the home. Thus, the halting pace of local loop replacement may be a blessing in disguise; ultimately, either full fiber to the home/office or perhaps short-range, extremely high bandwidth wireless will likely be the answer.

Types of Carriers in the United States

In the past, it was easier to classify carriers. They were either LECs (local exchange carriers) or IXCs (interexchange carriers). LECs carried local, intraLATA (local access and transport area), and intrastate traffic; IXCs carried interstate and international traffic. When AT&T divested its 22 Bell System operating companies, they were eventually regrouped into seven Regional Bell Operating Companies (RBOCs). The RBOCs offer intraLATA services, that is, long distance service within a limited geographic area. Unfortunately, the terms have become somewhat vague as some CLECs (competitive local exchange carriers) carry long distance traffic, and some IXCs carry local traffic (e.g., AT&T and MCI are in both businesses). The RBOCs are sometimes called ILECs, incumbent local exchange carriers. Another term is reseller, for those who buy line capacity and facilities from an existing carrier and then sell it to end consumers at a markup.

Since the 1996 Telecommunications Act, the term CLEC has entered the mainstream. CLECs either build their own infrastructure for local service or lease local loops from the existing LEC. Various cable TV, electrical power, ISP, and cellular telephone companies have established CLEC subsidiaries.
Exhibit 1. Local Loop Topology
One of the conditions mandated by the 1996 act was that any ILEC that wanted to get into the long distance business had to open its transmission facilities to competitors. For example, Capitol Communications in Houston, Texas, obtains facilities (lines, switches, etc.) from Southwestern Bell (SBC) in those locations where Capitol lacks fiber and copper connections. Capitol is another example of a CLEC that offers long distance services. The 1996 act also requires the Bell telephone companies to unbundle components for resale. Thus, competitors could elect to buy only voice mail or caller ID. One of the drivers in all this is the extremely high cost of local service buildout. As mentioned earlier, AT&T spent more than a century installing copper to hundreds of millions of homes, businesses, plants, and government organizations. Duplicating that infrastructure quickly is analogous to “reforesting” the Sahara Desert in a year; hence, the necessity of local service resale. Presumably, the long distance business partially compensates the ILECs for their loss of local business. However, given the rapidly declining per-minute revenues from voice long distance services, it is not clear whether the offset is there.
Most carriers now use digital switches in their central offices. Some of the most common switches include:

• Lucent 5ESS
• Northern Telecom DMS-100 (the SL100 is the commercial premises version of the DMS-100)
• Siemens EWSD
• NEC NEAX 61E

CARRIER STRUCTURE AND NUMBERING SCHEME

Dialing plans are critical, both for the public network and individual organizations. With the proliferation of cellular telephones, second lines, pagers, and normal business growth, the North American numbering plan (NANP) has received considerable attention in recent years. Each time a new area code is added, routing tables of the PBX must be updated.

NANP is structured as follows. A ten-digit dial plan is divided into two parts. The first three digits are the numbering plan area (NPA), more commonly known as the area code. The remaining seven digits are also divided into two sections. The first three numbers denote the central office code (the exchange), and the remaining four digits represent a station number (extension).

NPA (area codes):

• The first digit (N) is a value of 2 through 9.
• The second digit is a value of 0 through 8.
• The third digit is a value of 0 through 9.

A “1” in both the second and third digits has special significance:

Number   Meaning
211      Nonemergency public service
311      Reserved for future use
411      Directory help
511      Reserved for future use
611      Repairs
711      Reserved for future use
811      Business office of CO
911      Emergency

There are also service access codes in the 700, 800, and 900 series (for toll-free and surcharge services).
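The structure above (three-digit area code, three-digit exchange, four-digit station) can be sketched in Python as follows. The function names split_nanp and is_valid_npa are hypothetical, and treating every N11 code as a service code rather than an area code is an assumption drawn from the table above; real routing tables carry many more rules than this.

```python
import re

# N11 codes and meanings exactly as listed in the table above.
N11_SERVICE_CODES = {
    "211": "Nonemergency public service",
    "311": "Reserved for future use",
    "411": "Directory help",
    "511": "Reserved for future use",
    "611": "Repairs",
    "711": "Reserved for future use",
    "811": "Business office of CO",
    "911": "Emergency",
}


def split_nanp(number: str) -> tuple[str, str, str]:
    """Split a ten-digit NANP number into (area code, exchange, station)."""
    digits = re.sub(r"\D", "", number)  # drop dashes, spaces, parentheses
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return digits[:3], digits[3:6], digits[6:]


def is_valid_npa(npa: str) -> bool:
    """Apply the digit rules quoted above; N11 codes count as service codes."""
    if len(npa) != 3 or not npa.isdigit():
        return False
    if npa in N11_SERVICE_CODES:
        return False
    # First digit 2-9, second digit 0-8, third digit 0-9 (any digit).
    return npa[0] in "23456789" and npa[1] in "012345678"


if __name__ == "__main__":
    npa, exchange, station = split_nanp("281-358-1234")
    print(npa, exchange, station, is_valid_npa(npa))
```

A PBX routing table performs a similar split on every dialed number, which is one reason each new area code forces a table update.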
NANP also provides additional central office codes:

Number   Meaning
555      Toll directory help
844      Time
936      Weather
950      Access to interexchange carriers under Feature Group “B” access
958      Plant testing
959      Plant testing
976      Information delivery services

There are some special two-digit prefix codes used (other than the usual “1” for long distance services):

Number   Meaning
00       IXC operator help
01       International direct distance dialing
10       Used to dial “equal access” IXC; in most areas of the United States,
         callers can pick their IXC of choice by dialing an access code in
         the form 10xxx (see below)
11       Custom calling service

To allow callers to choose their long distance company (e.g., use MCI from a telephone that would normally use AT&T if the caller dials “1”), the following equal access codes have been defined (new ones are being added routinely):

Number   Meaning
031      ALC/Allnet
222      MCI
223      Cable and Wireless
234      ACC Long Distance
288      AT&T
333      Sprint
432      Litel
464      Wiltel
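As an illustration of how the 10xxx equal access prefix might be recognized in a dialed string, the sketch below strips the prefix and looks the carrier up in the table above. The function name parse_equal_access and its return shape are hypothetical, and the carrier list is limited to the eight codes shown here; it is a reading aid, not a description of how any particular switch parses digits.

```python
# Equal access codes copied from the table above (the list is not exhaustive).
EQUAL_ACCESS_CARRIERS = {
    "031": "ALC/Allnet",
    "222": "MCI",
    "223": "Cable and Wireless",
    "234": "ACC Long Distance",
    "288": "AT&T",
    "333": "Sprint",
    "432": "Litel",
    "464": "Wiltel",
}


def parse_equal_access(dialed: str):
    """Return (code, carrier, remaining digits) for a 10xxx call, else None.

    The carrier is None when the code is not in the table above.
    """
    digits = "".join(ch for ch in dialed if ch.isdigit())
    if not digits.startswith("10") or len(digits) < 5:
        return None  # no equal access prefix was dialed
    code = digits[2:5]
    return code, EQUAL_ACCESS_CARRIERS.get(code), digits[5:]


if __name__ == "__main__":
    print(parse_equal_access("10288 1 281 358 1234"))
    print(parse_equal_access("1 281 358 1234"))
```

Because an area code never begins with 0 or 1, a dialed string that starts with “10” cannot be mistaken for an ordinary “1” plus area code call, which is what makes this simple prefix test workable.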
International Structure

The ITU-T (International Telecommunication Union, Telecommunication Standardization Sector), based in Geneva, Switzerland, established a dialing plan in the 1960s:

Zone (First Digit)   Meaning
1                    North America
2                    Africa
3                    Europe
4                    Europe
5                    Central and South America
6                    South Pacific
7                    Russia
8                    Far East
9                    Middle East and Southeast Asia
After the first digit noted above, each country will have a one- to three-digit country code assigned to it. Each country has an access code that callers must use to dial international calls (some countries have the same access code; the United States and Canada, for example, both use 011 for international calling).

All the above dialing plans address the public network. Many larger organizations with dedicated links (or merely IP connection points over the Internet) will have a private dialing plan that minimizes costs (least cost routing) and simplifies dialing for the end user.

DIGITAL COMMUNICATIONS CONCEPTS

As communications technologies have matured, layer upon layer of abstraction has been added to the sum total of communications infrastructure. For example, a perusal of Claude Shannon’s 1948 paper on information theory is a daunting task for all but the most accomplished mathematician. Fortunately, as technologies have evolved, so have standards and intelligent software. For the communications manager and project leader, it is often enough to understand the concept and the available products. Following are some of the communications techniques used in telephony systems and networks:

• CODEC (coder-decoder). Converts a signal from analog form to digital signals that can be used by modern PBXs and transmission devices (e.g., Cisco 3810 ATM boxes). The same equipment and algorithms convert the signal back in the speaker or earpiece so that humans can understand the message. In some cases, the CODEC is located in the PBX; and in other systems, the handset contains the hardware necessary to do the conversion. The term CODEC has most recently been associated with videoconferencing, where it refers to compression and conversion to digital form for transmission over long distance lines. Exhibit 2 illustrates basic CODEC functions.

Exhibit 2. Basic CODEC Functions [diagram labels: analog signal, CODEC, digital signal, CODEC, analog signal; sampling of analog signal over time (quantization into discrete values)]

• Pulse code modulation (PCM). PCM is the most common means to encode a voice signal into a digital bit stream. For toll-quality speech, an analog signal is sampled 8000 times per second using eight bits to record the results, which yields a 64-kbps bit stream. Only 4 kHz of bandwidth is needed to produce good-quality speech. Note, however, that because there are only 256 possible combinations of an eight-bit binary number, speech that is digitized can never be “perfect” using this scheme. Of course, it is certainly adequate for most purposes. Speech is understandable (but not pleasant) at sampling rates/compressions much less than 8000 per second. There are two implementations of PCM: µ-law PCM and A-law PCM. µ-law is used in North America and Japan and uses a process termed companding to enhance the signal-to-noise ratio (companding compresses the amplitude range for economical transmission and then expands it back at the receiving end). In Europe, a slightly different companding technique is used, resulting in a different standard called A-law. Both provide for excellent voice quality and modem transmissions.
• Voice coding/compression techniques. From the viewpoint of the organization trying to minimize communication costs, reducing the bandwidth from 64 kbps to a lower rate can be attractive. Exhibit 3 lists some common techniques for compression and bandwidth conservation. This is an area of intense research, driven by the need to reduce speech storage and bandwidth requirements.
• Circuit concepts. A channel represents a path of a single logical communication. Sometimes, a channel is a circuit, but a circuit is more often considered a physical configuration of equipment. A T1, for example, has 24 channels on one physical medium. Circuits can be simplex, carrying a signal in one direction only; half duplex, carrying a signal in two directions, but not at the same time; and full duplex, carrying signals in both directions simultaneously. Full-duplex
speakers in conference rooms are significantly better, for example, than half-duplex speakers because people tend to interrupt each other.

Exhibit 3. Techniques for Voice Coding and Storage

Technique                                             Method                                                  Quality
PCM (pulse code modulation)                           8000 samples per second, 64 kbps                        High
ADPCM (adaptive differential pulse code modulation)   Compresses to approximately 32 kbps; calculates the     High
                                                      difference between two consecutive speech samples
LPC (linear predictive coding)                        Digitizing technique that drops voice to 2.4–4.8 kbps   Low
VSELP (vector sum excited linear prediction)          Digitizing technique used in digital cellular           Low
                                                      telephones; compresses to 4.8 kbps
CELP (code-excited linear prediction)                 Government standard for compression; compresses to
                                                      4.8 kbps

• Digital versus analog signals. Circuits are designed to support either digital or analog signals. For an analog circuit (e.g., one used at a residence with analog telephone) to transmit data, start and stop bits are required to keep the signal synchronized (see Exhibit 4). Synchronous transmissions, in which the sending and receiving terminals receive a continuous stream of bits, are considerably more efficient (used, for example, in ISDN). Digital circuits also have a lower error rate than analog circuits. Typically, a bit error rate of 1/1000 is reasonable for analog voice circuits; such an error rate is completely unacceptable for data circuits.

Exhibit 4. Asynchronous Transmission

• Multiplexing. Digital circuits have the capability of using a single broadband signal to carry several channels over a single circuit. A multiplexor combines the channels on the sending end; and at the receiving end, the signal is demultiplexed to restore the original channels. The composite signal contains data from all the end users. It provides for an efficient use of transmission capacity. Although ATM is increasingly used for wide area connections because of its ability to combine voice, data, and video, multiplexing is still widely deployed and is appropriate in many situations (ATM, for example, has at least a 10 percent overhead). The most common multiplexing scheme is TDM (time division multiplexing). TDM allows for a variation in the number of signals being sent along the line and constantly adjusts the time intervals to make optimum use of available bandwidth. It is also protocol insensitive, and can combine a number of different protocols within the same high-speed transmission link. Another scheme, FDM (frequency division multiplexing), is used for cable TV where different stations are assigned frequency bands on a single cable medium.

Common Circuit Connections

There are an increasing number of ways that the CO can be connected to the organization’s premises equipment (via a demarc). Some of the more traditional methods include:

• Ground start, two-wire CO trunks. They can be incoming, outgoing, or both.
• DID (direct inward dialing) CO trunks. DID allows callers to directly reach extensions within the organization’s workplace without the intervention of a human or automated attendant. DIDs can be ground start, but more often are digital DS1 circuits for larger organizations. Another alternative is E&M signaling with four-wire circuits (now less common).
• WATS (wide area telephone service). This term is not used as much now as in the past. AT&T, to its later chagrin, forgot to trademark this name, so it is now in the public domain, meaning discounted toll service provided by long distance and local exchange companies. Incoming and outgoing services were separate in the past but now can be commingled in the same trunk group (e.g., Southwestern Bell offers “smart trunks” that allow incoming/outgoing on the same trunks, increasing throughput by roughly 10 percent). In the past, WATS lines were billed at a flat rate and many employees thought the calls were “free.” That is no longer the case; all calls are billed on a per-minute basis. WATS trunks do not have addressable PSTN numbers and hence cannot be dialed directly from the outside. Hence, they are often implemented as
hunt groups within the PBX. This sometimes causes a problem with caller-ID-equipped users who receive calls at their home from small businesses or friends. When they receive a call from a business that has WATS lines and see a trunk group ID, they believe they can just return the call by dialing the number; of course, they get an error tone because the trunk groups are not DIDs and cannot be dialed directly.
• IXC trunks. Many businesses will have dedicated trunks to their long distance carrier. These trunks may physically come from the LEC, but logically they are a pass-through to the IXC’s point of presence. With these arrangements, long distance calls out and in bypass the LEC (from a billing perspective) and thus significantly lower long distance costs. When an organization with a dedicated IXC trunk talks to another similarly equipped organization, the call is “dedicated to dedicated” and results in a low per-minute charge.
• FX (foreign exchange) circuits. FX circuits typically use a two-wire loop start configuration that originates from an LEC outside the organization’s subscriber exchange area. Sometimes, this is done to avoid long distance charges and sometimes as a disaster recovery option. Assume, for example, that a Houston-based hospital is serviced by a Southwestern Bell central office located on Jefferson Street. The hospital has a vital need to stay connected and cannot afford to be incommunicado if the Jefferson Street CO has a fire, terrorist damage, etc. One method (and there are several) to ameliorate the risk would be to run FX lines from another Southwestern Bell central office in Houston (e.g., the Galleria CO). FX can refer to a few lines or a series of trunks. If trunks are involved, then the connectivity is not simply from a foreign CO to a telephone set, but provides connectivity to a switch.
• Tie lines. Still heavily used, tie lines are direct, dedicated circuits from one location to another. They can be ISDN digital circuits, analog lines, microwave, or other types of links. Alternatives to tie lines include meshed networks (such as Frame Relay) and the Internet (IP network).
• OPX lines. Off-premise extensions (OPX) are telephones that are not physically close to the PBX that provides their service. Generally, the PBX requires an OPX card and there are distance limitations (in terms of electrical resistance).
• Automatic ring-down circuits. Ring-down circuits are used in situations where a user needs to immediately call a specific location without the need to dial a number. For example, when someone is stuck in a defective elevator, he or she picks up the elevator telephone and immediately the telephone on the service desk rings. Traders connected to the New York Stock Exchange may have similar arrangements. In a ring-down circuit, an AC current is sent down the line (local or long distance). The current may light a lamp or ring a buzzer. Ring-downs are expensive but appropriate for many business/operational environments.
• Wireless options. In some situations (e.g., rural environments), landlines may not be the easiest way for a PBX to be linked to the LEC. There are a number of wireless options, including satellite communications (VSAT), microwave, or line-of-sight infrared. Except for VSAT, the user should not perceive any delay in ordinary conversation.

ISDN

ISDN (Integrated Services Digital Network) is worthy of a book in itself. The ITU (previously the CCITT) defines ISDN as a network service that provides end-to-end digital services (both voice and data) to end users. The original objective of the designers of the service was to rid the world of the inefficient and costly analog infrastructure that has been built over the years. ISDN does not require the massive infrastructure investment that fiber to the residence would. It is deployed in Europe and in the more urban areas of the United States. Some of the benefits include (1) the number and possibly the name of the calling party is transmitted before the call is answered; (2) by using two B channels, both voice and data applications can be run via the same circuit; (3) digital services are on a line-by-line basis; (4) dial-up calls are much faster than with analog (POTS) lines; and (5) additional information can be transmitted along the signaling D channel.

There are two implementations of ISDN. The lower bandwidth form, typically offered to residential and small business customers of LECs, is called BRI (basic rate interface). It has two B or bearer channels of 64 kbps capacity each, plus a D signaling channel of 16 kbps. Exhibit 5 shows a simplified BRI configuration. PRI (primary rate interface) carries 23 B channels plus a D channel (the international standard for PRI is 30 B channels and one D channel).
Exhibit 5. ISDN BRI Connection [diagram labels: B channel 64 kbps (two), D channel 16 kbps, network termination device (NT1), PSTN (ISDN)]
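The channel arithmetic behind BRI and PRI, and the 64-kbps PCM rate that defines a B channel, can be verified with a short Python sketch. The names below are illustrative only. The 16-kbps BRI D channel comes from the text; the 64-kbps D channel used for PRI is not stated in this chapter and is supplied here as the standard value. Framing overhead on the physical T1 or E1 line is deliberately left out.

```python
SAMPLES_PER_SECOND = 8000   # PCM sampling rate for toll-quality speech
BITS_PER_SAMPLE = 8         # eight bits recorded per sample

QUANTIZATION_LEVELS = 2 ** BITS_PER_SAMPLE              # 256 discrete values
B_CHANNEL_BPS = SAMPLES_PER_SECOND * BITS_PER_SAMPLE    # one bearer channel


def isdn_rate_bps(b_channels: int, d_channel_bps: int) -> int:
    """Combined rate of the B channels plus the D signaling channel."""
    return b_channels * B_CHANNEL_BPS + d_channel_bps


BRI_BPS = isdn_rate_bps(2, 16_000)        # 2B + D (D = 16 kbps, from the text)
PRI_NA_BPS = isdn_rate_bps(23, 64_000)    # 23B + D (D = 64 kbps, assumed here)
PRI_INTL_BPS = isdn_rate_bps(30, 64_000)  # 30B + D (D = 64 kbps, assumed here)

if __name__ == "__main__":
    print("B channel :", B_CHANNEL_BPS, "bps")
    print("BRI       :", BRI_BPS, "bps")
    print("PRI (NA)  :", PRI_NA_BPS, "bps")
    print("PRI (intl):", PRI_INTL_BPS, "bps")
```

The North American PRI total of 1,536,000 bps fits inside a T1, and the 30B+D total of 1,984,000 bps fits inside an E1; the remaining line capacity in each case is used for framing.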
COMPUTER TELEPHONY INTEGRATION: SECOND EDITION Although BRI will continue to be used for years, it is a narrow-band technology and hence limited. Technologies such as DSL, VPN (virtual private network) over the Internet, Frame Relay, and other higher bandwidth transmission architectures are supplanting it. However, BRI is still used for low-speed digital communications (much better than analog dial-up) and for backup circuits. PRI, on the other hand, is the prevalent standard used by most firms to connect to the local telco. Some organizations bring in a PRI circuit and then have it broken down into BRIs within the building. For companies that have a number of BRIs already in place, splitting a PRI is often less expensive than having many BRIs supplied from the central office. ISDN provides some of the supplementary services that we expect from modern telephony systems: • • • • • •
Caller ID Closed user groups Advice of charge Call hold User-to-user signaling Call waiting
Call control in ISDN is specified by the Q.931 standard. Its “under the hood” elements include setup, setup acknowledgment, progress, and disconnect messages, among others. The key point is that ISDN is an international standard and relies on well-established standards (there are several others besides Q.931). Hence, ISDN is reliable; nobody ever got fired for using ISDN.
QSIG
QSIG is a signaling system used in corporate/private voice networking. Based on peer-to-peer signaling, it is known internationally as Private Signaling System No. 1. Its main usefulness is to transport features across a network without regard to specific vendor equipment. For example, an Avaya PBX and a Siemens PBX could provide the following interconnect (supplementary) services:
• Name identification
• Call intrusion
• Operator services (use one operator for multiple PBXs)
• Do not disturb
• Call completion on no reply
QSIG also allows a flexible numbering plan to be put in place for an organization with many sites/types of hardware.
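To make the Q.931 call control messages mentioned earlier more concrete, the sketch below checks a message sequence against a small transition table. The message names are those used by Q.931; the state names, the table itself, and the function `run_call` are simplified assumptions for illustration. The real standard defines more states, timers, and messages (CONNECT ACKNOWLEDGE, for example) than are shown here.

```python
# Simplified, illustrative subset of Q.931 call control.
# State names are hypothetical labels, not the standard's state numbers.
TRANSITIONS = {
    ("idle", "SETUP"): "call_initiated",
    ("call_initiated", "SETUP ACKNOWLEDGE"): "overlap_sending",
    ("call_initiated", "CALL PROCEEDING"): "proceeding",
    ("overlap_sending", "CALL PROCEEDING"): "proceeding",
    ("proceeding", "PROGRESS"): "proceeding",
    ("proceeding", "ALERTING"): "alerting",
    ("proceeding", "CONNECT"): "active",
    ("alerting", "CONNECT"): "active",
    ("active", "DISCONNECT"): "disconnecting",
    ("disconnecting", "RELEASE"): "releasing",
    ("releasing", "RELEASE COMPLETE"): "idle",
}


def run_call(messages, state="idle"):
    """Walk the messages through the table; raise on an unexpected message."""
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"{message!r} is not expected in state {state!r}")
        state = TRANSITIONS[key]
    return state


normal_call = [
    "SETUP", "CALL PROCEEDING", "ALERTING", "CONNECT",
    "DISCONNECT", "RELEASE", "RELEASE COMPLETE",
]
print(run_call(normal_call))  # a complete call returns to "idle"
```

A table-driven check of this kind is one simple way for a CTI application to notice a call that has stalled partway through setup or teardown.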
CARRIER SYSTEMS
PBXs are linked to the LEC or IXC via CSUs (channel service units) and possibly DSUs (data service units). More recently, access concentrators are being used to link various voice over data network circuits (ATM and Frame Relay). During installation of new premises equipment, one of the most critical tasks is ensuring that all the protocols and parameters are set correctly between the central office and the premises equipment. Carriers (notably AT&T) have developed various T1 framing formats. Superframes have been defined to include 12 frames at a time; extended superframe formats vary in length from 12 to 24 frames. Various schemes have been implemented to ensure the integrity of transmissions over long distances or with lengthy strings of zeros (found in data rather than voice transmissions). Some of these schemes include B8ZS (bipolar eight zero substitution) and RBS (robbed bit signaling). Premises equipment, using CRC (cyclic redundancy check) algorithms, can detect degradation in lines prior to complete failure.
THE TELEPHONY PROCESS
The fundamental processes of calling were established in the late 19th century; in some ways changes have been slow in the ensuing 100 years. Following is a synopsis of the basic call flow for telephony:
• Answering a call. When a call comes in on an analog telephone, the switch at the central office applies a ring voltage of approximately 70 to 90 volts to the telephone circuit. The telephone rings. When the recipient of the call takes the handset off hook, the CO determines that the circuit is complete and the phone receives an analog signal. For digital telephones/transmissions, the computer telephony application recognizes the incoming call, takes the telephone off hook, and receives digital data.
• Identifying the caller.
Although many older telephone networks do not support caller ID or DNIS (dialed number identification service), this capability is rapidly becoming more common in the telecommunications world. If caller ID is supported, a CTI application (e.g., Phoneline from CCOM) can pick up the name and number of the caller during ringing. This number can be used for accessing databases and doing screen pops so that the called party knows in advance who is calling and perhaps some historical information. There are four potential ways to identify a caller: (1) caller ID (analog); (2) automatic number identification (digital); (3) direct inward dial — an organization may have a specific DID assigned to a particular set of callers (customers, suppliers, etc.); and (4) DNIS. Note that DNIS allows a single trunk group to be shared by multiple 800 numbers and still allows the recipient of the call to identify
the number originally dialed. An order-fulfillment organization, for example, may have one toll-free number, 888-123-1234, for gardening supplies and another number, 800-345-3456, for painting services. Both numbers could be terminated at the same agent’s telephone and DNIS would let the agent know which product is of interest to the caller.
• Making a call (analog). The process is to initiate the call and then dial. Call progression tones reflect dial tone, ringing, busy, fast busy, etc. Analog systems use two common methods to initiate a call: (1) loop start and (2) ground start. A phone line is seized (started) by giving it a supervisory signal. Loop start inserts a resistor between the two wires (“tip” and “ring”) of the telephone when the receiver is lifted. When the central office detects loop current, a dial tone is generated. Loop start is typically used in homes. The ring lead is connected to –48 volts and the tip lead is connected to ground. When the central office detects the loop and the fact that it is drawing DC current, the ringing at the other end stops. A ground start architecture functions by grounding one of the two telephone wires when the receiver is lifted. When the central office detects a grounded wire, dial tone is generated, enabling outward dialing. Another type of signaling is E&M (historically, ear = receive and mouth = transmit), used often for two-way switch-to-switch or switch-to-network connections.
• Making a call (digital). Wink start is the most common method for digital telephone systems to start a call. Although the end user hears dial tone when the handset is picked up, the PBX operating system actually sets a bit to signal to the central office that the line is now active. The central office responds with a short-duration off-hook pulse or wink (usually 140 milliseconds) that tells the phone or telephony board inside a PC that it is ready to receive a dialed number for outcalling.
• Dialing. The two common methods for dialing are touch tone (also called DTMF, dual tone multifrequency) and rotary, still used by 60 percent of users worldwide. Each key on the telephone instrument is assigned two standard frequencies. For example, when “1” is pressed, frequencies 1209 Hz and 697 Hz are sounded simultaneously. DTMF is clearly more efficient than rotary and can be transmitted over virtually any medium (microwave, copper twisted pair, and carrier facilities). In addition, it is resistant to interference from line noise. Voice mail applications, such as Siemens’ Phonemail, use touch-tone digits to communicate with the handset (for saving, erasing, and giving instructions). As the number is dialed in a rotary instrument, the circular dial sends out a series of momentary breaks in the DC circuit equal to the digit (three opens and closes for the number 3). The CO detects the evenly spaced opens and closes and registers the appropriate digit. Rotary dialing is considerably slower than DTMF dialing, and some COs do not recognize the protocol. A robust CTI application will have the capacity to recognize rotary digits and voice-initiated digits in order to get around these limitations.
• Multifrequency dialing (MF). MF is another dialing protocol, used for internal signaling at telephone companies. Also called multifrequency pulsing, MF uses the standard ten-decimal touch-tone digits plus five auxiliary signals to generate a specific sound. MF signals received considerable publicity during the 1960s as a result of the “Cap’n Crunch” toll bypass scheme. The following synopsis is from Harry Newton’s Telecom Dictionary, 14th edition:
At one point in the 1960s, a breakfast cereal had a promotion. It was a toy bosun’s whistle. When you blew the whistle, it let out a nearly precise 2600 Hz tone. If you blew that whistle into the mouthpiece of a telephone after dialing any long distance number, it terminated the call as far as the AT&T long distance phone system knew, while still allowing the long distance connection to the distant city to remain open. If you dialed an 800 number, blew the whistle, and then pressed in a series of tones (called multifrequency or MF tones) on your “Blue Box,” you could make long distance and international calls for free, because the only thing the local billing machine knew about was the original toll-free call to the 800 number. … Cap’n Crunch’s legacy (he got put in jail four times during the 1970s) is System Signaling 7, a system of immense benefit to us all.
• Call progression. The public network (PSTN) and most PBXs use audible tones to indicate the movement of a call toward completion. For example, dial tone, busy, ring-back, error tone, and ringing are common tones used to tell the user (and internal systems) what is occurring. Computer telephony boards use call progress tones to determine the state of a particular call. A predictive dialing application, for example, will need to determine the state of an outbound call (if a busy signal is received, go on to the next number and do not send to the agent). A call progression tone varies in frequency and cadence. It usually has two frequencies (ranging from 315 to 650 Hz) and the cadence is an alternating pattern of on and off. Desirable features of call-monitoring equipment include the ability to detect stutter dial tone and elimination of noise (via filters or software algorithms). Note that tone types will vary according to country and PBX. Any computer telephony board used must be tailored for the specific PBX to which it is attached.
• Termination of a call. On an analog telephone, when the receiver is put down, the loop current is terminated and the CO responds by terminating the line. A digital system recognizes an on- or off-hook bit and stops sending current to the line when the receiver is placed back on the base station.
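The DTMF scheme described in the dialing step above lends itself to a short sketch. The text gives only the pair for the “1” key (697 Hz and 1209 Hz); the remaining row and column frequencies below are the standard values for a 12-key pad and are supplied here as an assumption. The function names, the 8000-samples-per-second rate, and the 100-millisecond tone length are illustrative choices, not values taken from any particular telephony board.

```python
import math

# Standard 12-key DTMF grid: one low (row) and one high (column) frequency per key.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")


def dtmf_frequencies(key: str) -> tuple:
    """Return the (low, high) frequency pair in Hz for a single keypad key."""
    if len(key) != 1:
        raise ValueError("expected exactly one key")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"{key!r} is not on the keypad")


def dtmf_samples(key: str, duration_s: float = 0.1, sample_rate: int = 8000) -> list:
    """Synthesize the two tones sounded simultaneously, as floats in [-1, 1]."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * n / sample_rate)
        for n in range(count)
    ]


print(dtmf_frequencies("1"))
print(len(dtmf_samples("5")))
```

Because every key is identified by one row tone plus one column tone, a detector only has to decide which of seven frequencies are present, which is part of why DTMF tolerates line noise better than rotary pulses.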
Signaling between switching offices can be provided on a per-trunk or common channel basis. In common channel signaling (CCS), all the signaling information for one or more trunk groups is carried over a separate (not traffic bearing) channel. In a T1, for example, one of the channels would be for signaling and the remaining 23 would be available to carry traffic. Nonassociated CCS uses completely separate trunks or channels to provide signaling information. It is more economical and more flexible. Assume, for example, that a business has a promotion that causes lines to be swamped. With traditional CCS, available circuits may be swamped with call attempts, thus reducing the total capacity of the trunking available to an organization. Nonassociated CCS, on the other hand, ensures that load-bearing trunks do not get swamped by call attempts.
EQUIPMENT
PBX
A PBX, at its most basic, is a large switch that constantly connects incoming/outgoing calls with specific extensions within the premises; also, it connects one extension with another within the premises (hence those internal calls never go out to the public network). In the early days, a human took a physical plug and punched each end into a receptacle so that the parties could talk. The digital PBX uses a central switching program, always active in memory, which connects to line (the customer premises side) and trunk (circuits that come from the telephone company central office) interface cards. Cards that slide into the PBX’s backplane provide the physical connections; lines, trunks, and control circuits are linked to these cards, which in turn are tied to buses within the PBX. Exhibit 6 shows the basic architecture of the installed base of PBXs today. There are six primary components to the PBX:
1. Stations/telephones
2. Trunk/CO connection
3. Station/telephone connection
4. Power supply
5. Administrative access (e.g., through direct linked RS232 terminals or an IP network)
6. Cabinet/backplane
PBXs can connect to the LEC, the IXC, and private lines, such as a tie line between two cities. With the blurring of lines between local and interexchange carriers, a PBX may only have links to one carrier, which will carry both local and long distance traffic. Even if an organization has separate trunks for the LEC and IXC, the LEC trunks will sometimes carry long distance traffic (overflow if the IXC trunks are 100 percent busy), albeit at a higher per-minute rate.
Exhibit 6. Basic PBX Architecture (diagram labels: Digital or Analog Telephones; Workstation; Modem; Fax Machine; Attendant Console; Line Cards; PBX; Trunk Cards; Private Network; PSTN; Telephony Server Linked via Ports, e.g., voicemail server)
A large, digital PBX will have a multitude of components. Following are some of the most important.
• Digital stations. Virtually all major vendors offer digital stations for the end user. By converting the analog signal to binary format (via signal sampling at approximately 8000 times per second), each handset can perform a myriad of complex telephony functions. Unfortunately, many of these functions are unknown to the average user; CTI provides a means to make these features visible and far easier to use. Digital stations offer more complex features than the older analog stations (which do not have the advantage of a separate signaling channel and two digital “bearer” channels).
• Main cabinet. The cabinet houses the CPU(s) for system operation supervision, disk drives for the configuration and saving system transactions, and memory and bus slots for all the cards. Many systems are modular so that additional cabinets can be added as the need arises. One of the selection criteria for a PBX is whether additional stations (or analog lines) can be added without a “forklift” upgrade.
• T1 cards. These are interface cards (also called DS1 cards). T1 lines travel from the demarc (of the local exchange carrier) to these interface cards. They are usually connected to the switch by an amphenol connector. Twenty-four channels are available for voice, data, or video communications. These channels can be configured by the technician for any combination of traffic (of course, the CO must make the appropriate changes in its switch to match the requirements of the organization). For example, twelve channels could be allocated to AT&T for long distance, eight for Southwestern Bell local service, and the remaining four could serve as a tie line to another PBX in the organization.
• Input/output cards. These cards allow easy access to the switch for call detail (CDR port) and general I/O. Many switches now have an Ethernet connection (obviously important for any CTI application). Older switches, such as the Siemens Rolm 9751, have an SMIO (system management I/O) port that transmits screen dumps in response to line commands. Getting these screen dumps in a database format is not straightforward.
• Analog cards. If the organization uses analog trunks, these cards are needed. They serve the same function for analog traffic as the T1 cards do for digital traffic.
• Station line cards. These are the cards that connect to the wiring that goes to the frame. Typically, eight to twenty-four stations (telephones) can be served by one station line card. Some switches have stations with an analog adaptor on the back, allowing both voice and data to be transmitted on a single telephone pair, back to a single port on the station line card. Thus, a user could have a PC modem connection and a digital set serviced by the same port. This is possible because the protocol used by most vendors is a variant of ISDN, which has two 64-kbps B channels; one channel can be used for voice and the other for data.
• System administration terminal. System commands, via command line or GUI (graphical user interface), are entered into the system administration terminals (usually PCs with modems). Formatted reports of hourly traffic data on engineered resources such as trunk groups are generated via this link.
• Uninterruptible power source (UPS). Switches are sensitive to power spikes, lightning, etc. If power is fed from a UPS rather than straight from the local power grid, electrical failure is less likely.
• Rectifier. Most switches operate on DC power. Accordingly, incoming AC power must be converted to DC via a rectifier.
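Two numbers in the component list above can be tied together with a little arithmetic: the sampling rate of a digital station and the 24 channels of a T1 card. The sketch below is a rough illustration. The 8000 samples per second and the 12/8/4 channel split come from the text; the 8 bits per sample and the single framing bit per T1 frame are standard figures added here as assumptions, and `allocate_channels` is a hypothetical helper rather than any switch vendor's command.

```python
SAMPLES_PER_SECOND = 8000   # from the text
BITS_PER_SAMPLE = 8         # assumption: standard 8-bit PCM encoding
T1_CHANNELS = 24            # from the text
FRAMING_BPS = 8000          # assumption: one framing bit per frame, 8000 frames/s

ds0_bps = SAMPLES_PER_SECOND * BITS_PER_SAMPLE      # one voice channel
t1_payload_bps = T1_CHANNELS * ds0_bps              # 24 channels
t1_line_bps = t1_payload_bps + FRAMING_BPS          # payload plus framing


def allocate_channels(plan: dict) -> dict:
    """Assign consecutive channel numbers (1..24) to each named use."""
    if sum(plan.values()) > T1_CHANNELS:
        raise ValueError("plan needs more channels than one T1 provides")
    allocation = {}
    next_channel = 1
    for use, count in plan.items():
        allocation[use] = list(range(next_channel, next_channel + count))
        next_channel += count
    return allocation


# The split used as the example in the text.
plan = {"AT&T long distance": 12, "Southwestern Bell local": 8, "Tie line": 4}
allocation = allocate_channels(plan)

print(ds0_bps, t1_payload_bps, t1_line_bps)
for use, channels in allocation.items():
    print(use, channels)
```

In practice the channel assignment must match what the central office has provisioned, so a table like this is a planning aid rather than a configuration in itself.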
Switching establishes a temporary or permanent connection between lines and trunks. A switch matrix generally uses a multiplexing technique known as TDM (time division multiplexing) to carry signals and messages from one point to another. In TDM, the information is carried in time slots arranged like soldiers marching single file across a bridge. The matrix ensures that each input signal will match the correct output signal. All switches are driven by software. Traditional PBXs use a variety of (highly proprietary) UNIX operating systems, whereas some of the newer (and smaller) switches run on NT. As switches assume more complex operations, the likelihood of failure from software errors will undoubtedly increase.
Line side is the term used to indicate a connection between the PBX and the customer side. Typically, a line side port will link to either an analog (POTS) station or a vendor-specific digital station. The opposite of line side is CO side and refers to links (trunks) that connect to a central office. Digital trunks are considerably more efficient than analog and offer superior error correction, signaling, and monitoring. Other basic functions of a circuit-switched PBX include:
• Station-to-station dialing (typically using a four- or five-digit internal dialing plan)
• Direct inward dialing (DID), which allows stations to ring without operator intervention
• Direct outward dialing (DOD)
• Call detail, an audit trail of calls made to and from a station
• Automatic call distribution (ACD), where calls to a single number are distributed to a number of specified extensions
• Voice mail
• Directory services
• Internal reporting and administration
Feature Sets
In the past, feature sets have been the battleground between vendors competing for customers requiring high-end systems. Over time, the features have become somewhat less important because the ones most often used are provided by all the major PBX vendors. New features added now are often specialized.
The exception to this trend is IP telephony, which allows telephones, in some ways, to act like small computers, delivering slick services such as department directories and reference information on a relatively large screen. Exhibit 7 lists some of the traditional features found on major PBXs.
Key Systems
Key systems go back to the 1920s. In those days, a wealthy CEO might have three or four telephones on his desk to ensure no important calls were missed. An engineer within AT&T decided that it would be more efficient to have all the lines that would otherwise be going to the four phones connect to a single phone. The single phone could have four buttons or keys. The desk was less cluttered and the “key” system was born (sometimes called KTS, key telephone system). To select a specific line, a button is pressed; hence the phrase heard in so many small businesses, “Fred, pick up on line 1.”
Exhibit 7. Common Feature Sets of PBXs
Feature | Comments
Automatic call distribution (ACD) | ACD groups are the sine qua non of help desks and call centers. The PBX (in some cases, an adjunct server) receives the call and makes an intelligent decision where to place that call. ACD functionality allows the PBX to track calls by agent, gives the caller the option to leave messages, and distributes calls based on criteria such as area code. For example, calls coming in with area codes east of the Mississippi River could be sent to the “East Desk” sales representatives, and area codes west of the Mississippi could be sent to the “West Desk.”
Automatic number identification (ANI) | Also referred to as “caller ID” for analog circuits. With this feature, the PBX can pick up the incoming digits and perform some function with them. For example, the incoming number can be displayed or can be used to do a database lookup on a separate database server. ANI can be provided to the PBX by any CO switch that is SS7-compliant and does not require ISDN.
Automatic wake-up | Hotel industry feature offered by most major PBX vendors.
Call blocking | Allows PBX to block specific country codes or area codes (e.g., 900 calls).
Call coverage | The specification in PBX software of the actions to be taken when the called party does not answer his or her phone. Does the call go to voice mail? Is it sent to a location on the other side of the world (for around-the-clock coverage)? Does it ring at co-workers’ desks?
Call forwarding | Also referred to as “send all calls.” The user can enter codes to send any incoming call immediately to a designated number, voice mail, or some other receiving function, such as an interactive voice response (IVR) system. This saves callers time. If users know they will be out of the office, they can set the appropriate codes, and callers will not be forced to sit through four rings to get to voice mail.
Call pickup | Allows any user to answer someone else’s ringing telephone, as long as both parties are in the same “pick group.” Typically used by an administrative assistant covering for a number of other users.
Camp on | Ability to dial an extension and, if that extension is busy (in use), have the PBX automatically continue to try to reach the extension, although the originating caller has hung up. Once the dialed extension is available, the PBX rings the originating party and the called party. This saves the user the effort of continually trying to reach a busy extension.
Class of service | A code assigned to each extension that specifies the PBX functions allowed. For example, lobby telephones may be assigned class of service 5, which allows local calls only. Each organization can define class of service to meet its needs. It is a good security practice to assign only as many features to a class of service as are needed for the job.
Conference calling | Allows the user to initiate a conference call for a maximum number of internal and external callers, depending on vendor. The sound quality will typically not equal that of a dedicated call bridge such as AT&T’s “meet me” service. In addition, typically less than a dozen participants can be accommodated from a user digital set. Conferencing on an extension does not have the volume and balancing controls found on a true “call bridge” device.
Digit translation | The PBX should be able to translate an arbitrary number of digits (e.g., “12345”) into a different number (e.g., “011-044-123-1234”). This capability includes masking, the ability to change all outgoing numbers starting with “6” to start with “2.” Some systems lack the CPU and processing power to do a large number of digit translations without affecting systemwide performance.
Direct inward dialing (DID) | Parties outside the building can call the extension directly rather than having to go through the operator or an auto-attendant. The local exchange company usually assigns a range of DIDs that are available to the organization. Obtaining a contiguous range of numbers greatly simplifies day-to-day assignment of extension numbers to new employees.
Direct outward dialing (DOD) | Allows an extension or modem to directly connect with outside trunks. Modems are often put on DOD-only lines; this increases security and avoids use of numbers in the DID range (which are sometimes scarce, particularly if an organization has grown rapidly at one site).
Directory | Allows user to scroll through names on a handset display to find the correct extension to dial.
Hunt group | An association of extensions such that when one of the dialed extensions is busy, it goes to the next, and the next, and so on. Various PBX implementations determine the order of the search. For help desks, hunt-group configuration should be carefully designed. For example, assume that agents #1, #2, and #3 receive all calls in a hunt group. If the PBX always goes to agent #1 first, then the call workload will be skewed: #1 will never get a break and #3 may be underutilized. Random starts would be preferable in that work environment. Hunt groups and ACDs share many similar attributes. However, hunt groups are more rigid in their function (e.g., always random or always go to agent #1 first). ACDs, on the other hand, can use more sophisticated CTI functions and capabilities within the switch itself to route a call.
Last number dialed | This should be available either as a standard button or as a programmable button on the handset.
Least cost routing | The PBX automatically selects the least cost route to send a phone call. The least cost could be tie lines, PSTN, Internet lines, a microwave circuit, or some other route. This feature is transparent to the user.
Look-ahead routing | This PBX function checks for network busy conditions before attempting to complete the call.
Message waiting light | This feature is obvious. It is the aesthetics that should be reviewed. Some users prefer large, highly visible lights; others prefer a tiny, discreet blinking light that can be more easily ignored.
Off-premises station | For at-home or remote office workers, off-premise digital telephones are essential. The PBX has a card that is dedicated to off-premise extension telephones.
Paging | Implemented via telephone speakers using the paging channel. Usually, a specific number is dialed and all parties connected can hear the message over a loudspeaker. There is also a common feature that sends a numeric page from voice mail based on preset criteria (e.g., page only when a high-priority message is received).
Personalized ringing | Ability to modify the ring of the handset. Particularly useful in high-density cubicle work environments where several telephones may be ringing at once.
Recorded announcements | Announcements associated with voice mail should be flexible, e.g., customized greetings for internal and external callers, and the ability to use ANI to recognize the caller and play a special greeting. If an important client calls, the greeting could be customized to say: “Ms. Jones, I’ll be at 713-123-1234 if you need me right away”; all others would get the message: “I will be out of the office until Monday, please leave a message.”
Skills-based routing | Agents have different skills (e.g., speak different languages; have in-depth, Tier-2 experience; know a particular set of products). If those skills can be assigned to a database that can be queried, calls can be sent to the right agent. An IVR could say: “Press 1 for an English-speaking agent, press 2 for a Spanish-speaking agent.” The IVR/ACD software would use a database of skills to route the call to the appropriate agent.
Speed dialing (“rep” dials) | Programmable rep dials are a convenience for the user. When reviewing the capabilities of a PBX, it is useful to ask whether user-programmed rep dials can be retained if the telephone is moved (either literally or virtually) to a new location.
Time-of-day routing | For organizations whose contract with the long distance carrier has an off-peak discount, the system routes the call (to the extent possible) based on the time-of-day discount structure.
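The hunt group entry in Exhibit 7 observes that always starting the search at agent #1 skews the workload, and that random starts are preferable for a help desk. The sketch below illustrates that point under a deliberately simple assumption: every agent is idle when each call arrives, so the first agent tried always answers. The function names, the agent labels, and the call count are hypothetical and are not taken from any PBX.

```python
import random
from collections import Counter


def hunt(agents, busy, start=0):
    """Return the first idle agent, searching from index `start` and wrapping."""
    for offset in range(len(agents)):
        agent = agents[(start + offset) % len(agents)]
        if agent not in busy:
            return agent
    return None  # every agent in the group is busy


def simulate(agents, calls, random_start, seed=1):
    """Count calls per agent, assuming all agents are idle at each arrival."""
    rng = random.Random(seed)
    counts = Counter({agent: 0 for agent in agents})
    for _ in range(calls):
        start = rng.randrange(len(agents)) if random_start else 0
        counts[hunt(agents, busy=set(), start=start)] += 1
    return counts


agents = ["agent #1", "agent #2", "agent #3"]
print("fixed start: ", dict(simulate(agents, 300, random_start=False)))
print("random start:", dict(simulate(agents, 300, random_start=True)))
```

With a fixed start, agent #1 takes every call in this simplified model; with a random start the calls spread across the group. A real ACD would go further and weigh factors such as idle time or skills, as the exhibit notes.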
The earliest key systems actually had large numbers of lines physically going to the station; installation was time consuming and error prone. Later, in the 1970s, each station had only one line going to it and a central controller made an instantaneous connection to the button pressed. Key systems have become more complex and have taken on more functions traditionally associated with PBXs. They are used in small offices (fewer than 50 people) and are convenient because anyone can pick up a specific line at any phone.
In a PBX, a call can come in or go out any trunk, and the option for DID is almost always implemented for larger organizations. This means that a call coming in any trunk will be directed to a specific station, based on the number originally dialed. Otherwise, the telephone operator will receive all the calls and then will need to transfer to the specific station required. Because of the law of large numbers, a PBX can scale easily. A thousand users do not need a thousand trunks; typically only 10 to 15 percent will be on the phone at one time. In a key system, the relationship of people to lines is not so disproportionate. Another factor that limits the size of a key system is the size of the station on the desk. Most users find more than 24 keys intimidating; worse yet, with all the modern information appliances on desks now, the sheer real estate consumed is a concern.
Hybrids are, as one would expect, devices that sit between key systems and PBXs in functionality. The hybrids allow calling from extension to extension, transfers of calls to specific extensions, and other functions similar to the PBX. They also can pool trunks. Except for the keys and selection of lines, the sharp difference between key systems and PBXs no longer exists. In practice, key systems are small and PBXs are bigger. There is no precise cutoff.
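The observation that a thousand users do not need a thousand trunks can be made quantitative with the Erlang B formula, the classic traffic-engineering estimate of call blocking. The text does not name this formula; it is brought in here only as an illustration. The sketch assumes that the busy-hour load equals the fraction of users on the phone (10 percent of 1000 users is treated as 100 erlangs), that every one of those calls needs a trunk, and that a 1 percent chance of blocking is acceptable. All three are assumptions, and the function names are hypothetical.

```python
def erlang_b(offered_erlangs: float, trunks: int) -> float:
    """Blocking probability for the given load and trunk count (Erlang B)."""
    blocking = 1.0  # with zero trunks every call is blocked
    for n in range(1, trunks + 1):
        # Standard recurrence: B(n) = A*B(n-1) / (n + A*B(n-1))
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def trunks_needed(offered_erlangs: float, max_blocking: float = 0.01) -> int:
    """Smallest trunk count whose blocking probability is within the target."""
    trunks = 0
    blocking = 1.0
    while blocking > max_blocking:
        trunks += 1
        blocking = offered_erlangs * blocking / (trunks + offered_erlangs * blocking)
    return trunks


users = 1000
for fraction_busy in (0.10, 0.15):
    load = users * fraction_busy  # assumption: busy users map directly to erlangs
    print(f"{fraction_busy:.0%} busy: about {trunks_needed(load)} trunks for {users} users")
```

Because many calls in a real PBX are station to station and never touch a trunk, an estimate made this way is best read as an upper bound.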
Centrex
Centrex, originally a specific regional Bell product, has become a generic term for business lines provided directly from the central office; these lines do not go through a premises PBX. Centrex is delivered by the public switching infrastructure (i.e., uses the big switch at the CO) and does not require a major investment by the customer in premises equipment, as is the case with a large PBX installation. Incidentally, when performing a telephony security assessment, it is important to note that “locking down” the PBX has no effect on Centrex lines. Because hackers war-dial modems via any publicly available line, this could be an exposure point. Some of the functions provided by Centrex include:
• Custom dialing plans
• Call waiting, call forwarding, call park, and voice mail
• Speed dialing, ring again, and caller ID
• Management and security functions such as forced authorization codes, line restrictions (e.g., no international calling), etc.
Some of the reasons for using Centrex include:
• Avoiding capital investment (although, in the long run, a PBX infrastructure is cheaper for the majority of users)
• Providing redundant services should the PBX fail or the main lines to the premises be cut (this assumes that the copper pair Centrex lines are run along a separate conduit)
• Providing dial-up access to servers and maintenance ports by using lines that have numbers outside the range of the normal business numbers (makes it more difficult for war dialers)
• Enabling the boss to have a bright red, different-looking phone in the office, with no need to conform to corporate policy!
• Reducing the risk of interception, because Centrex lines cannot be as easily intercepted by individuals with access to the PBX; sometimes, Centrex phones in executive offices are called “sweetheart” lines (the reason is beyond the scope of this book)
SUMMARY
In this chapter, we have touched on some of the basic telephony concepts and required functions. The more-than-100-year-old telephony system is, to some extent, chained to its own history. No one would design a system from scratch today based on circuit-switched lines that use up the entire bandwidth even if no traffic is flowing. As we will see in later chapters, IP telephony addresses this problem and will gradually replace the existing infrastructure. The process will take decades, due to the massive investment in what exists today.
Chapter 2
IP Telephony
If it weren’t for electricity, we’d all be watching television by candlelight.
George Gobel
INTRODUCTION
Fundamental Change in Communications
Most futurists believe that information networking, as a fundamental change to society, is comparable to the introduction of fixed location agriculture or the introduction of coal- and gas-powered machines in the 19th century. A subset of this revolution is the transition from analog/waveform-based communications to digital/packetized transmissions. When communication is digital, networking resources can be shared, optimized, and monitored far more efficiently than in the analog world of a “channel per conversation.” George B. Dyson, in his book, Darwin Among the Machines, elegantly states the impact of this transition:
A circuit-switched communications network, in which real wires are switched to connect a flow of information between A and B, would be swamped by the intractable combinatorics of millions of computers demanding random access to their collective address space at once. All the switches in the world could never keep up. But with packet-switched data communications, collective computation scales gracefully as the number of processors (both electronic and biological) grows. … Consensual protocols, running on all the processors in the net, maintain the appearance of robust connections between all the elements at once. The resulting free market for information and computational resources determines which connection pathways will be strengthened and which languish or die out. By the introduction of packet switching on an epidemic scale, the computational landscape is infiltrated by virtual circuitry, cultivating a haphazard, dendritic architecture reminiscent more of nature’s design than of our own.
The first decade of the 21st century will see the rapid transition of all networks (voice and data) to a digital/packet-based architecture. The sections below outline the structure and implementation options currently available for IP telephony.
What Is IP Telephony?
There are many definitions of IP telephony, but they have the following in common: (1) a set of standards for packet transmission; (2) the ability to commingle various media, such as voice, data, and video, on LANs, WANs, and the Internet; and (3) flexibility with regard to physical media: IP telephony works over twisted pair, fiber, xDSL, ISDN, leased lines, coaxial cable, etc. With IP telephony, the same infrastructure that carries e-mail, data, and Web traffic will also carry voice (which has been formatted and converted into packets). True IP PBXs are conglomerations of servers, routers, gatekeepers, and gateways. Some of the larger vendors, burdened with the headache of maintaining one foot in the legacy past and the other foot in the future, use an interim “IP-enabled” approach to retrofit their TDM (time division multiplexing) dreadnoughts.

IP Telephony Momentum
The first edition of this book, written in late 1998–1999, briefly discussed IP telephony. Since that time, IP telephony has become the standard, or at least the direction, for all the PBX vendors. Those who operate PBXs, particularly large PBXs, are conservative, and rightly so, because users expect 99.999 percent uptime. Nevertheless, the move toward IP technology is inexorable. There will be human colonies on Mars before the last circuit-switched PBX dies, but there is no longer any question about future architecture.
Rather than subjecting the reader to a single, overly large chapter on IP telephony, the subject has been broken into (1) premises IP telephony, covering equipment, standards, and local architectures; and (2) Voice-over-IP, a wide area networking technology.
The transition to IP telephony has opened up many new possibilities for computer telephony integration. Devices that formerly had to be linked to the PBX via physical ports can now use standard Ethernet cards. In many cases, there is less analog/digital conversion, resulting in increased efficiency.

Trends
The traditional, “big iron,” mainframe-like PBX is rapidly being disaggregated. Increasingly, multiple servers perform specialized functions such as sophisticated desktop directory lookups. Similar to the 1980s transition from mainframes to PCs, the focus is shifting from efficiency of equipment and architecture (note 1) to efficiency of people. A disaggregated architecture is necessary to provide all these value-added services.
In addition to a more people-centric environment, telephony is seeing an explosion of standards, significant improvements in signal processing (sometimes as important to throughput as bandwidth), more wireless/PDA functions, and a strong shift to the Web paradigm. Some specific drivers include the proliferation of XML and SCEs (service creation environments). SCEs allow developers to more easily create applications that are provided on a network basis.
IP telephony is as driven by the need to fit in internationally as other products. Its flexibility, relative to TDM architectures, allows easier customization to national cultures and needs.
Proprietary versions of UNIX have long been the backbone operating system for larger PBXs. Microsoft® operating systems (NT and XP) are often used for smaller systems, but Linux seems to be gaining popularity, especially for specialty servers such as Avaya’s applications server. Linux has several advantages: low cost, constant debugging by open source enthusiasts, and efficient internal processes that reduce overhead.

IP TELEPHONY ARCHITECTURE
High-Level Configuration and Variance from TDM Architecture
Exhibit 1 shows a typical IP telephony configuration. Looking at the design, some differences between the traditional PBX architecture and IP telephony are apparent:
• Bandwidth is used more efficiently. Traditional telephone conversations use 64 kbps in each direction, for a total of 128 kbps. A five-minute conversation, with both ends of the circuit open, is a transmission volume of 4.7 megabytes (MB). Because, most of the time, parties take turns talking, that transmission could be reduced to 2.35 MB by eliminating the half of the conversation that is silent at any one time. Dead space (no one talks) reduces the total even further. Regardless of what the calling parties do, circuit-switched networks hold open the entire bandwidth for the duration. In contrast, packet-based voice transmission technology is able to use the dormant bandwidth. As a result, three to four times the number of conversations can be fit into the same dedicated space. Of course, standard compression techniques (discussed later in this chapter) reduce the bandwidth requirements even further.
• Components are more evenly distributed. The various servers need not be together, as long as they are linked on the local area network.
• Growth is incremental. When the traditional PBX runs out of shelf space, even if only one more user needs to be added, the next increment in hardware is usually substantial. With IP telephony, another switch or IP telephone is the only addition required.
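The figures in the bandwidth bullet above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the 50 percent talk fraction are assumptions introduced here, not part of the original text. With a megabyte taken as one million bytes, it yields 4.8 MB and 2.4 MB, close to the 4.7 MB and 2.35 MB quoted above; the small gap presumably comes from rounding or a different megabyte convention.

```python
# Back-of-the-envelope check of the circuit-switched volume quoted in the text.
CHANNEL_BPS = 64_000  # one direction of a traditional voice channel, in bits per second


def circuit_bytes(seconds: float, directions: int = 2) -> float:
    """Bytes reserved by a circuit for the whole call, whether anyone talks or not."""
    return CHANNEL_BPS * directions * seconds / 8


def talk_only_bytes(seconds: float, talk_fraction: float = 0.5) -> float:
    """Bytes needed if only the active talker is carried.

    talk_fraction is an assumed share of the two-way capacity that carries speech.
    """
    return circuit_bytes(seconds) * talk_fraction


if __name__ == "__main__":
    five_minutes = 5 * 60
    print(circuit_bytes(five_minutes) / 1e6)    # megabytes, decimal convention
    print(talk_only_bytes(five_minutes) / 1e6)
```

Lowering talk_fraction further models the additional savings from dead space, when neither party is speaking.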
Exhibit 1. Example IP Telephony Architecture (elements shown: data network, e.g., Frame Relay or circuit-switched lines; public switched telephone network; router; telephony-enabled router; building backbone switch; IP telephones; workstations; CTI server; apps server; call manager)
• Port limitations are reduced. By using standard, data-type connections (typically, Ethernet with RJ45 connectors), any new telephony device (server or appliance) can be added without port limitation concerns. Of course, there are capacity issues with IP telephony as with the TDM architecture; they just do not arrive as sharp, major requirements for upgrades.
• The architecture is less proprietary. In the data world, vendors have, from the beginning, been subjected to relentless pressure to open their systems and make them standards based. In contrast, traditional PBX vendors have highly proprietary operating systems; complex telephony applications, unless programmed by the vendor, had to be performed outside the PBX and then integrated.
• Fewer single points of failure. Servers and switches can be distributed throughout the organization. It is practical to use backup servers in this architecture; with a large, enterprise-class, TDM-based PBX, backup is considerably more expensive and requires advanced cross wiring. One does not keep a spare Avaya Definity G3 around just in case, due to the expense. Some organizations will have dual PBXs with circuits between them so that telephony service is minimally interrupted if a PBX fails. It is not practical to haul in an unwired PBX and install it quickly. On the other hand, traditional PBXs from the major vendors are extremely reliable.
• Potential for lower costs. Openness and cost are inversely related. When telephony servers can run Linux as an operating system and use off-the-shelf Hewlett-Packard or Dell boxes, the price wars begin. The barriers to entry are now greatly reduced. “Frank and Bill” can build an un-PBX in a garage and sell the “Frank and Bill telephone system” over the Internet the next day!
• Single wiring. Data and voice travel along the same wiring infrastructure. Of course, this has some risk, because a catastrophic network failure could shut down both voice and data. Clearly, the sloppy practices that were sometimes used in the past (note 2) are unacceptable when “one-wire” architectures prevail. Monitoring, quality installs, neat and harnessed wiring bundles, cooling, and adequate physical security are de rigueur.

In this architecture, the router/gateways must be voice enabled. Hence, the operating software and related parameters on the router must be capable of addressing the requirements of real-time traffic. A voice packet that gets behind a large e-mail attachment flowing through the network (either local or across the WAN) is lost to the receiver, even if it eventually gets through. Delayed voice is garbled, unintelligible voice.

Fundamental IP Telephony Process
IP telephony, at the detail level, seems a bit intimidating to those who have spent careers in the circuit-switched world. It looks especially complex because there are far more choices, the technical standards are more visible, and the solution is not dictated by what is available from a closed system; also, circuit-switched PBXs have complex operating systems, but most of that is hidden from telephony staff performing day-to-day support tasks. The following call flow is a simplified version of the basic IP telephony process:
• The user lifts the receiver; a digital signal (usually) is transmitted to the PBX.
• A dial tone is presented by the PBX.
• The number is dialed, then retained temporarily in memory.
• The PBX (more often called communications server) syncs up the number dialed to the IP address of another PBX (the IP host). The link could be to a gateway if the called party is on the PSTN.
• The calling PBX and the called IP host set up a session to handle packets from each other. Two channels are required for duplex (both directions) communications.
• Packets between both parties are exchanged and converted into sounds via a CODEC.
• The call is terminated by either party.
• Each party’s line is now open for new calls because the circuit between the phone and the PBX is closed.
• The PBX transmits a termination signal to the IP host, which shuts down its own session.
• The dialed number to IP host mapping, previously stored in memory, is now erased.

If a transmission technology looks like data, talks like data, and hangs around with other data, then it probably is data. IP telephony really is just another data technology, with some demanding requirements for real-time communications links (low latency, etc.).

End-User Devices
Introduction. Skeptics can certainly say that IP telephony, at its very best, sounds only as good as the crystal-clear, circuit-switched telephone voice channel. What they cannot say, however, is that the traditional proprietary digital phone has anywhere near the flexibility of the IP phone.
As Cisco touts so vigorously, the IP phone is really a computer. It is not a dumb terminal (i.e., a regular phone). This has far-reaching implications — way beyond the efficiency or cost benefits normally promoted for IP telephony. Because the phone is a computer, it can be programmed (in C++ and other development environments, depending on the vendor) and can obtain data from the rest of the network. Now, IP telephones, with their expanded display capabilities, can support applications such as airline flight availability. Another example is a station-based directory search that includes a photograph of an individual. Following are some examples of IP telephones (whether hardware-based or virtual). Screen Phones. Exhibit 2 shows Cisco’s full-featured 7960G IP phone. It has the usual features that one would expect from a high-end, traditional digital phone (such as date and time, calling party name, calling party number, etc.) plus features that are more IP-telephony-related. Following is a list of some of these additional features:
• Messages. The Cisco IP Phone 7960G identifies incoming messages and categorizes them for users on the screen. This allows users to quickly and effectively return calls using direct dial-back capability.
• Directories. The corporate directory integrates with the Lightweight Directory Access Protocol (LDAP) standard directory.
• Settings. The settings feature key allows the user to adjust display contrast and select a ringer tone and volume settings for all audio such as ringer, handset, headset, and speaker. Network configuration preferences can also be set up, usually by the system administrator. Configuration can either be automatic or manually set up for Dynamic Host Configuration Protocol (DHCP), Trivial File Transfer Protocol (TFTP), Call Manager, and backup Call Managers.
• Services. The Cisco 7960G allows users to quickly access diverse information such as weather, stocks, quote of the day, or any Web-based information using XML (Extensible Markup Language) to provide a portal to an ever-growing world of features and information.
• Help. The online help system gives users information about the phone’s keys, buttons, and features.

Exhibit 2. Cisco 7960G IP Phone (Courtesy of Cisco Systems, Inc.)

The pixel display allows for greater flexibility of features and significantly expands the information viewed when using features such as Services, Information, Messages, and Directory. For example, the Directory button can show local and server-based directory information.
VXML (Voice Extensible Markup Language) is discussed in more depth in Chapter 4. For now, keep in mind that IP phones should be programmable and that they use VXML to obtain information within the IP network.
High-end IP phones from most vendors will have a minihub or a two-port switch built into the unit, allowing an RJ45 Ethernet connection from the wall. The second port links the phone to a nearby workstation. If there is no minihub or switch built in, additional wiring is required because most employees will use both a telephone and a workstation.
Increasingly, IP phones use inline power, eliminating the requirement for a space-wasting power unit at the end-user location. This eliminates one of the early objections to IP telephony: plug space is at a premium for many offices. Exhibit 3 shows a typical inline power configuration (also called single wire).
Exhibit 3. Single-Wire (Inline) Power for IP Phones (a switch feeds each IP phone with 48-volt DC power carried over the four-wire 10/100 Ethernet connection)
Some other nice-to-have features for IP phones include:
• Multiple options for audio compression (e.g., G.711 and G.729a)
• NetMeeting and H.323 compatibility (H.323 will be discussed later in this chapter)
• Ability to statically (fixed) assign an IP address or use DHCP
• RS232 port for add-ons (viewing the IP phone as a computer, physical interface points become more important for the future)

IP Softphones. An IP softphone uses the workstation to perform telephony functions. A picture of a phone on the screen provides a functional guide to use the softphone for any service that could be provided by a stand-alone desktop model. In some cases, softphones are easier to use; conferencing, for example, is usually done by drag-and-drop with an address book. As with many desktop programs, keyboard shortcuts speed the process for those who tire of pointing and clicking. Some softphones allow for the integration of the user’s private directory with the enterprise directory (LDAP).
In most cases, users or call center agents will have both a softphone and a physical IP phone. This allows for considerable flexibility: some functions are easier with the softphone, others are easier with a physical phone.

Softphones for PDAs. “Swiss knife” appliances are following the expected technical marketing curve: getting smaller; consuming less power; and offering telephony, PDA, and computing functions all in one. Bluetooth technology may ease the entry of PDAs into premises-based IP softphones. Vendors, including Ericsson, Handspring (Treo), Kyocera (Smartphone), Nokia, Qualcomm, and Motorola, are rushing telephony-enabled PDAs to market. There is clearly a strong incentive to develop these systems as IP telephony rather than bandwidth-hogging circuit-switched technology.
Feature Sets. Despite the effort required to add features to the traditional, TDM-based PBXs, the major vendors have developed a plethora of functions that IP phones are just now matching. Of course, in many cases traditional TDM vendors now offer either pure IP telephony systems or hybrids. Following is a partial checklist of features/capabilities that should be reviewed before making any architectural decisions:
• Large LCD display with soft key prompting
• 20 to 30 programmable keys
• Integrated hub or switch; two-port is the obvious minimum
• Full-duplex speaker phone
• Availability of an adaptor (sometimes called an extender) for off-premises work; for example, a telecommuter could use the adaptor to enable telephony over a DSL or cable modem Internet connection
• Color display
• Java support
• Web browser (e.g., NEC’s Dterm Inaset model)

Communications Servers (Nobody Makes PBXs Anymore)
Although some of the major vendors call their PBXs “communications servers,” it will be a while before these ersatz mainframes shrink down to the size most people associate with servers. The term PBX is beginning to show the patina of legacy technology. Once known as the “un-PBX,” today’s communications servers often use off-the-shelf operating systems (for example, Avaya has a server that runs on Linux). Steve Jobs’ quip, “don’t trust a computer you can’t lift,” seems to have sunk in. Telecom managers like the open architecture and flexibility of these boxes. Some of the advantages include:
• Less reliance on the vendor. Implementing complex business rules, particularly in a contact center, traditionally requires vendor support, on the vendor’s schedule. With more straightforward communication servers, the changes can often be programmed in-house.
• Distributed risk. Running telephony on off-the-shelf servers has sometimes been considered unsafe, given the 99.999 percent uptime required by many organizations. However, the considerably lower cost of these servers compared to a centralized PBX enables multiple servers to be distributed in any large site. Depending on response time from the PBX vendor, use of communications servers might even be a less risky option.
• Low-cost add-ons. Once again we see open architectures drive down costs. For example, Calltrol’s Object Telephony Server supports large conferences; traditional systems usually require a separate audioconferencing box.
• Standard hardware. Many communications servers use Intel chips (or Intel’s competitors’ chips). Dialogic and Intel telephony boards are the standard workhorses of these boxes.

It is important to note that a telephony system that runs on standard hardware and a standard OS may still be a TDM-based machine, using physical ports and all the other traditional circuit-switched architectures. That said, the smaller communications server vendors as well as the market leaders have seen the light on IP telephony. Increasingly, even small servers will use IP telephony as their base architecture. The largest factor that could slow down this trend for the small systems is the price of IP telephone sets; they are often several hundred dollars each (like the TDM-based digital phones). In contrast, standard analog TDM phones are usually less than $50.

Gateways and Gatekeepers. Gateways connect IP telephony to the PSTN, TDM equipment, and an organization’s WAN. They are an easy way to connect legacy TDM PBXs to the IP world.

GATEWAYS
Introduction
A gateway sits between the circuit-switched network and the IP telephony network. It encodes and translates a call into an IP packet stream or reverses the process. Some of the PSTN services that are connected to IP telephony networks include:
• POTS (plain old telephone service) lines
• ISDN
• T1 channelized voice circuits

Gateways can be embedded in routers or can function in a stand-alone configuration. Increasingly, routers handle much of the previously stand-alone gateway traffic. For example, Cisco reports that its SRS (survivable remote site) telephony has capability embedded in Cisco IOS that runs on the local branch office IP Telephony router. In the event of a WAN link failure, SRS Telephony allows a Cisco router to perform backup call processing for a small office with 1 to 48 Cisco IP Phones. SRS Telephony automatically detects a failure in the network, and using Cisco’s SNAP (Simple Network Automated Provisioning) capability, initiates a process to intelligently autoconfigure the router to provide call processing backup redundancy for the IP phones in that office. The router provides call processing for the duration of the failure, ensuring that the phone capabilities stay up and operational. (Source: www.cisco.com.)
A low-end, stand-alone VoIP gateway is shown in Exhibit 4. Typically, this device provides reasonable (sometimes equal to toll quality) voice service over IP networks (e.g., between corporate offices). Some important/desirable features include:
• Advanced compression
• Echo cancellation
• Easy installation and configuration (particularly for locations that have no technical staff)
• FXS (foreign exchange station) interface so that access to a POTS phone or trunk ports (PBX/key system) is available
• Packet recovery algorithms
• Ability to create preferred dialing plans
• Call detail logging for cost accounting and system administration
• Integrated system to eliminate the need for a separate database server
• Bandwidth and jitter buffer adjustment capability to optimize voice quality

Operation of the VoIP gateway is relatively straightforward. Exhibit 5 shows a typical configuration. Either the corporate IP WAN or the Internet can be used to transport packetized voice traffic. In practice, the Internet provides better quality than one would think; however, it would not be prudent to rely solely on the public Internet. A failover to the PSTN should always be available for those organizations using the Internet as a primary link (for that matter, even organizations using private lines should have a PSTN failover for redundancy).
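The failover advice above can be pictured as a simple routing decision made before each call. The following is a minimal sketch, not a description of how any particular gateway works: the LinkStats fields, the choose_route name, and both thresholds are hypothetical values chosen for illustration, and real limits would depend on the CODEC in use and on what users will tolerate.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    """Recent measurements for the IP path between two gateways (hypothetical fields)."""
    one_way_delay_ms: float
    packet_loss_pct: float
    reachable: bool = True


# Illustrative thresholds only; they are assumptions, not values from the text.
MAX_DELAY_MS = 150.0
MAX_LOSS_PCT = 1.0


def choose_route(ip_link: LinkStats) -> str:
    """Return "IP" when the packet path looks healthy, otherwise fall back to "PSTN"."""
    healthy = (
        ip_link.reachable
        and ip_link.one_way_delay_ms <= MAX_DELAY_MS
        and ip_link.packet_loss_pct <= MAX_LOSS_PCT
    )
    return "IP" if healthy else "PSTN"


if __name__ == "__main__":
    print(choose_route(LinkStats(one_way_delay_ms=80, packet_loss_pct=0.2)))
    print(choose_route(LinkStats(one_way_delay_ms=400, packet_loss_pct=0.2)))
```

The point of the sketch is only that the PSTN path is always present as the default branch, so a degraded or unreachable IP link never leaves a call without a route.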
Exhibit 4. VoIP Gateway for a Small to Medium Office (labeled parts: Ethernet port (RJ45), LED indicators, COM port (RS-232), and four analog ports for PSTN links)
Exhibit 5. Stand-Alone Gateway Allows Circuit-Switched PBXs to Use IP Network for WAN Link (elements shown: corporate headquarters, PBX, VoIP gateway, PSTN, and router, linked across the corporate IP WAN or the Internet to a router, VoIP gateway, and PSTN connection at the field office)
Economics
IP Links Already in Place. For organizations with circuits already in place (dedicated lines, Frame Relay, etc.), voice can sometimes be added with little bandwidth consumption. With plummeting per-minute charges for domestic voice traffic, the use of VoIP (Voice-over-IP) as toll bypass becomes more questionable, given the planning and monitoring required for the circuits. Some of the largest users of voice minutes are now contemplating agreements stating minutes-per-cent rather than cents-per-minute. Hence, at least for domestic traffic, VoIP from a WAN perspective is often seen as a marginal investment, particularly when the telecom manager has so many other priorities. Toll bypass does not equal IP telephony; and with only a small decrease in total costs, many feel the effort exceeds the benefits.
International Communications. International rates, particularly for less-developed countries, are still reasonably high (although falling). Organizations may have an existing Frame Relay circuit that could accommodate packetized voice traffic. Although tuning to improve QoS can improve the quality of the voice link, the increased number of hops may degrade the MOS (mean opinion score for quality). Organizations will typically pilot international VoIP to get a more direct feel for its usability.
Of course, for carriers the economics are completely different. For a carrier moving many gigabits per second of combined customer traffic, VoIP provides significant benefits from an internal perspective. In fact, many traditional, circuit-switched services are only circuit switched at the edge; the carrier’s internal network is increasingly all packetized.

Compression and CODECs
The term CODEC means coder/decoder. CODECs are mathematical algorithms embedded in hardware that convert analog signals into digital, and vice versa. Another meaning, developed just recently, is compression/decompression. The quality of the CODEC strongly affects the perceived clarity of the voice signal. For example, in the cellular phone industry, the quality of voice improved significantly with faster chips and better CODECs in the telephone itself. More towers and stronger signals make a difference in reception, but an improved CODEC delivers much improvement without any increase in infrastructure cost. IP telephony has benefited from the same mathematical advances (sometimes little squiggly marks on a mathematician’s blackboard result in real-world improvements!). Exhibit 6 lists some of the currently defined CODECs, as defined by the ITU (International Telecommunication Union). Undoubtedly, new ones will be developed to squeeze even more wasted bandwidth out of voice traffic.

Transmission Technologies
One could spend a lifetime studying protocols. There are many ways to ship voice traffic. In some cases, voice rides on higher-level protocols, which in turn ride on lower-level protocols; in other cases, voice rides “coach class” on lower-level protocols, and pays for less bandwidth-consuming overhead. Following is a discussion of various approaches.

Voice-over-IP (VoIP). IP is a layer 3 protocol, meaning it is a higher-level protocol riding on lower, layer 2 protocols such as Frame Relay or ATM, or on standard circuit-switched, dedicated lines (such as T1s or T3s). It can also travel over wireless links such as 802.11b, which is a wireless standard for packet transmission, including data and voice.
Exhibit 6. Representative CODECs
CODEC: Description
G.711: Standard voice CODEC for high-quality (toll) sound. Used for years in the PSTN. Full duplex. Applications include ISDN, digital telephone sets on a digital PBX, digital satellite systems, cordless phones, and videophones. Converts to and from PCM voice at 64 kbps. This algorithm is required for ITU-T-compliant videoconferencing (H.320/H.323).
G.722: Speech CODEC operating at 48, 56, and 64 kbps. Samples at 16 kHz unlike most other CODECs, which sample at 8 kHz. Full-duplex algorithm. Used for digital leased lines, cordless phones, voice store-and-forward systems, videophones, and various multimedia products.
G.726: Defines an ADPCM (adaptive differential pulse code modulation) voice coder. Operates at 16, 24, 32, and 40 kbps.
G.729: Speech CODEC. Uses CS-ACELP (conjugate structure algebraic code-excited linear predictive) algorithm. Squeezes speech down to 8 kbps. Note: G.729a is a simplified form for DSVD (digital simultaneous voice and data; used, for example, in voice/data modems). Typical applications include VoIP, digital satellite systems, PSTN, ISDN, leased lines, voice store-and-forward, video phones, and multimedia products.
G.729ab: Speech CODEC that decreases bandwidth requirements by exploiting the silent moments in speech. Uses CS-ACELP and achieves a compression ratio of 16:1; it has a delay of 15 milliseconds. Used for the same applications as G.729.
G.729e: Sometimes called “Annex E,” this CODEC is designed to exceed PSTN quality. It has a bit rate of 11.8 kbps and can easily carry general audio traffic such as music. It is also considered a good technology for remote surveillance, conferencing, and announcements (e.g., for call waiting).
G.990: New standard for high-capacity traffic. Specifies symmetrical bidirectional rates of 2 Mbps and asymmetrical bidirectional rates of 640 kbps. G.990 puts the patina of “official standards” on DSL-type architectures, although DSL as implemented now will not necessarily conform to G.990. The market has demanded a broadband standard for some time and this is the ITU-T’s response.
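To make the bit rates in Exhibit 6 easier to compare, the short sketch below computes each CODEC’s compression ratio relative to G.711 and a rough count of one-way voice streams that fit on a link. It is illustrative only: the function names are hypothetical, the lowest listed rate is used where a CODEC supports several, the 1,536 kbps link (24 channels of 64 kbps, as on a channelized T1) is an assumed example, and packet headers and signaling are ignored, so real capacity is lower.

```python
# Nominal bit rates in kbps from Exhibit 6 (lowest rate where several are listed).
CODEC_KBPS = {
    "G.711": 64.0,
    "G.722": 48.0,
    "G.726": 16.0,
    "G.729": 8.0,
    "G.729e": 11.8,
}


def compression_ratio(codec: str, baseline: str = "G.711") -> float:
    """How many times smaller the CODEC's stream is than the baseline stream."""
    return CODEC_KBPS[baseline] / CODEC_KBPS[codec]


def streams_per_link(codec: str, link_kbps: float) -> int:
    """Whole one-way voice streams that fit, ignoring packet headers and signaling."""
    return int(link_kbps // CODEC_KBPS[codec])


if __name__ == "__main__":
    t1_payload_kbps = 24 * 64  # assumed example link
    for name in CODEC_KBPS:
        print(name, round(compression_ratio(name), 2), streams_per_link(name, t1_payload_kbps))
```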
IP is a routing layer protocol. Typically, TCP is used to ensure delivery. TCP/IP can be considered one of the wonders of the late-20th century world. Not only has it been adaptable to a plethora of media and technologies, but (even more remarkably) it has been adopted by virtually every hardware/software vendor in the world. Other than children using two tin cans and a string, everyone supports TCP/IP.
While RSVP (Resource Reservation Protocol) was once considered the key VoIP protocol, now the type of service (ToS) octet field of the IP header is the new contender for effective QoS (quality of service). ToS is supposed to classify traffic at the borders between the customer and service provider (ISP). This technology is not currently in place throughout the world.
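As a concrete illustration of marking traffic through the ToS octet, the sketch below sets that octet on a UDP socket using Python’s standard socket module. Several things here are assumptions rather than statements from the text: the helper names are hypothetical, the code treats the upper six bits of the octet as a DiffServ code point (a later reinterpretation of the ToS field), and the value 46 is simply a marking commonly associated with voice. The IP_TOS option is available on Linux and most Unix-like systems; other platforms may ignore it or not expose it, and networks along the path are free to disregard or rewrite the marking.

```python
import socket

EF_DSCP = 46  # assumed marking for voice; not a value given in the text


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit code point in the upper six bits of the 8-bit ToS octet."""
    if not 0 <= dscp <= 63:
        raise ValueError("code point must fit in 6 bits")
    return dscp << 2


def open_voice_socket(dscp: int = EF_DSCP) -> socket.socket:
    """Create a UDP socket whose outgoing packets request the given marking."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print(hex(dscp_to_tos(EF_DSCP)))
```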
IP fragmentation is necessary to ensure that voice traffic is not delayed excessively. Unfortunately, it adds to overhead due to the large size of the IP header. By some estimates, the headers in IP voice traffic make VoIP use as much as 50 percent more bandwidth than voice traffic over Frame Relay (VoFR).

Voice-over-ATM (VoATM). Many years ago, ATM (Asynchronous Transfer Mode) burst on the telecom world as a universal salve for communications technologies that would not talk to each other. There was to be a single standard, from WAN to desktop, and all the interfaces could have the same protocol (at least the same layer 2 protocol); the technical architect’s life was supposed to be greatly simplified.
As with most utopias, a few details prevented full implementation. ATM to the desktop was and still is prohibitively expensive for all but the few users who need extreme bandwidth; and with Gigabit Ethernet cards decreasing in cost, even those users no longer need ATM. However, the carriers did adopt ATM; it forms the backbone of the public network today (on the inside of the network, not the edge where users touch the telcos). Individual organizations also use ATM for applications where voice, data, and video need to be mixed. ATM has built-in QoS and is extremely reliable. ATM implementations using CBR (constant bit rate) provide a continuous stream of information that functions as an ersatz circuit-switched, dedicated link. Unfortunately, ATM has a few disadvantages:
• The circuit emulation service (CES) of ATM monopolizes bandwidth that could be used for other applications.
• ATM avoids delay by sending fixed-size ATM cells half empty, in order to eliminate the 6 milliseconds for 47 bytes of voice to fill the cell. The result is a 20 percent loss of bandwidth.

To resolve these inefficiencies, the ITU developed a variable bit rate function (VBR-RT) for VoATM. This standard provides for minicells that can be stuffed into regular size cells. These minicells vary from 1 to 64 bytes. As a result, a more efficient, variable-sized payload is sent, significantly increasing efficiency. VBR-RT is part of the AAL2 (ATM adaptation layer 2) architecture and is now considered the standard for VoATM. AAL2 provides for voice compression, silence suppression, and multiple voice channels of different bandwidth within one ATM link. The variable-sized payload concept is similar to Frame Relay fragmentation, discussed in the next section.

Voice-over-Frame Relay (VoFR). Voice-over-Frame Relay is a remarkable feat, akin to Dr.
Samuel Johnson’s comment about a dog walking on its hind legs — “It is not done well; but you are surprised to find it done at all.” Some organizations use Voice-over-Frame Relay successfully, others find the 41
voice quality too rough for business use. However, no one disputes that the cost is significantly lower than that of traditional dedicated circuits (costs in the one-cent-per-minute-or-lower range).

Frame Relay is the offspring of the X.25 protocol. X.25 was designed for poor-quality microwave and copper transmission facilities. To make a point, one Australian engineer successfully passed data traffic over cotton string soaked in salt water using the X.25 protocol. X.25 achieves its powerful accuracy over “dirty” lines by checking each and every packet, using a CRC (cyclic redundancy check) on the contents. X.25’s strength is also its weakness. Checking every packet and sending negative acknowledgments back to the sender makes the delay significant (around 500 milliseconds). This much delay makes X.25 unsuitable for audio, streaming video, or applications that require rapid back-and-forth response.

Frame Relay, designed to operate over modern digital circuits and fiber backbones, is a layer 2 protocol that checks the validity of a frame of data, but does not request retransmission if an error is found. Instead, the higher-level application is held accountable for correcting any errors. X.25, on the other hand, is a layer 3 protocol and includes flow control, error detection and correction, supervisory information, etc. Because Frame Relay does not do such extensive error checking, the frames are routed far more quickly from one node to another in a Frame Relay network.

Frame Relay supports both PVCs (permanent virtual circuits) and SVCs (switched virtual circuits). The result is that communications circuits can be shared by many users at the same time (which is why carriers charge less for Frame Relay circuits than for traditional dedicated T1s or fractional T1s). Most carriers provide Frame Relay services with a CIR (committed information rate) that specifies a guaranteed transmission rate of packets.
If bandwidth is available, the rate of transmission can go significantly above that rate. Because voice cannot be delayed without severely degrading the quality of the sound, the CIR of the Frame Relay service should be adequate to carry most of the voice traffic (e.g., a 64-kbps CIR). The DE (discard eligible) bit is set to zero for frames that should not be discarded. Theoretically, these are the frames that do not exceed the CIR. Frames above the CIR are marked with the DE bit equal to one and can be dropped. Organizations have used the capacity above the CIR in the network as an added benefit, because most of the time, frames with DE equal to one get transmitted. However, as carriers fill available fiber resources, “free” bandwidth above the CIR is increasingly rare.

Cost Factors. Estimating in advance the cost of a Frame Relay network is time consuming; network managers may become frustrated when trying to compare offerings from vendors. Major cost elements include the following:
• Access line to local telco. Unless a bypass vendor is used, the Frame Relay access line that connects the organization to the public Frame Relay network goes through the local telephone company.
• Port cost. An organization’s access line will terminate into a serial port on a Frame Relay switch, located at the frame operator’s premises. Various sizes of ports are available, from 64 kbps to above T1 speed. Frame Relay traffic cannot exceed the port speed.
• Permanent virtual circuit charge. This charge to set up the circuit on the provider’s network is driven by the specified CIR. The charge is not distance sensitive (sometimes called a “postalized” rate).
• Frame Relay premises equipment (FRAD, CSU/DSU, etc.).

Quality of Service and Delay Sensitivity. When designing the Frame Relay network, managers should review frame length-limiting options. Otherwise, long data frames can interfere with voice frames that need to get through the network in a timely manner. This is a subset of the more fundamental problem of quality of service.
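To see why long data frames matter, it helps to work out how long a single frame occupies the access line. The following Python sketch is illustrative only; the function name, the 64-kbps port speed, and the frame sizes are assumptions chosen for the example, not figures from any vendor.

```python
def serialization_delay_ms(frame_bytes: int, line_rate_bps: int) -> float:
    """Time needed to clock one frame onto the access line, in milliseconds."""
    return frame_bytes * 8 * 1000 / line_rate_bps


if __name__ == "__main__":
    port_speed_bps = 64_000  # assumed 64-kbps port
    for size in (64, 256, 1500):  # illustrative frame sizes in bytes
        delay = serialization_delay_ms(size, port_speed_bps)
        print(f"{size:>5}-byte frame: {delay:.1f} ms")
```

On these assumptions, a 1500-byte data frame holds a 64-kbps line for 187.5 ms, while a 64-byte frame clears in 8 ms. A voice frame queued behind the long frame has to wait out that entire interval.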
If cost were not an issue, QoS would not be relevant, because the CIR could simply be increased to the level that contention would not be important. However, cost is a major driver of VoFR (and VoIP), and CIR is one of the factors that directly affect total line charges, along with the port charges. Some providers, such as MCI, offer varying priorities to their subscribers. A sampling scheme is used to prioritize traffic for a specific customer. Traffic between customers is not prioritized. This is not a true QoS and may require specific router hardware at both ends. Some other techniques/factors for QoS include:
• Ensure that abundant bandwidth exists.
• Reduce superfluous traffic. For example, multicasting that broadcasts to every port, regardless of need, wastes resources. Removing such traffic is sometimes termed pruning. By defining members of a specific group using DLCIs (data link connection identifiers), bandwidth can be conserved. In addition, conference calls can be supported using the same technique.
• Plan bandwidth; make sure that “edge” traffic stays there and only goes on the enterprise backbone if needed.
• Avoid long frames.

Equipment and Configuration Considerations. Real-life implementations of VoFR use equipment with proprietary solutions. Although there is a new standard, the FRF.11 implementation agreement from the Frame Relay Forum, it will take some time for all the vendors to gear up production to meet the new standard for interoperability.
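The relationship between the CIR and the discard eligible (DE) bit, described earlier in this section, can be sketched in a few lines of Python. This is a deliberately simplified model: it assumes a single measurement interval (one second by default) and a hypothetical function name, and it ignores the vendor-specific policing that real switches and FRADs apply.

```python
def mark_discard_eligible(frame_sizes_bytes, cir_bps, interval_s=1.0):
    """Return a DE bit (0 or 1) for each frame sent within one interval.

    Frames are marked DE=0 until the running total exceeds the number of
    bits the CIR commits to for the interval; later frames are marked DE=1.
    """
    committed_bits = cir_bps * interval_s
    sent_bits = 0
    de_bits = []
    for size in frame_sizes_bytes:
        sent_bits += size * 8
        de_bits.append(0 if sent_bits <= committed_bits else 1)
    return de_bits


if __name__ == "__main__":
    # Three 4000-byte bursts against an assumed 64-kbps CIR.
    print(mark_discard_eligible([4000, 4000, 4000], cir_bps=64_000))
```

In this example the first two bursts fit within the committed 64,000 bits and keep DE equal to zero; the third exceeds the commitment and is marked DE equal to one, so it is the first candidate to be dropped if the network is congested.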
There are a number of compression algorithms that are used by Frame Relay vendors. Some of the most common include:
• PCM and ADPCM. These are toll-quality compression algorithms that require large bandwidth (64 or 32 kbps).
• ATC (adaptive transform coding). This is a simple system with a variable digitization rate.
• ACELP (algebraic code-excited linear prediction). One of the latest and most well-researched compression algorithms, ACELP allows near-toll-quality speech over Frame Relay all the way down to 4.8 kbps.

Other factors to be considered when reviewing FRAD (Frame Relay assembler/disassembler) equipment include:
• Congestion management. The device should respond to a traffic load by varying queue size before congestion occurs.
• Jitter buffering. Jitter is the variable latency for packets moving through a network. It disrupts the perception of smooth audio; the user hears irritating pops and cracks. Use of a large jitter buffer allows the incoming packets to be smoothed so that the user hears continuous speech output.
• Fragmentation. To prevent large packets from disrupting voice communications, frames should be fragmented to smaller sizes (see Exhibit 7). For example, ACT Networks limits packets as follows:
  - Voice: 83 bytes per frame maximum
  - Asynchronous data: 71 bytes per frame
  - Synchronous data: 72 bytes per frame
  - Fax: maximum 58 bytes per frame
• Priority service. By putting data into buffer queues first, fax and voice can be given higher priority to smooth delivery to the user.
• Silence detection. Equipment should be able to detect silence and dynamically give that bandwidth to other channels as needed.
• Ability to set the discard bit on. Because the retransmission of a voice packet does not make sense for a real-time environment, any voice packet must be set to “discard eligible” so that the system does not waste resources trying to retransmit a packet when it is too late to be of value to the user.

Voice-over-MPLS. MPLS (Multi-Protocol Label Switching) has many of the characteristics of Frame Relay, but adds two important features:
1. More sophisticated traffic shaping. MPLS can predefine network paths from end to end. For example, a customer could set up one path for large file transfers and data bursts (not delay sensitive), and another path with a larger allocation of sustained bandwidth plus precisely controlled delay variation parameters to support voice. This could also be called Voice-over-IP-over-MPLS.
Exhibit 7. Frame Relay Fragmentation to Reduce Network Delay
2. Easier creation of secure VPNs. MPLS is inherently secure and does not require specialized equipment for encryption. It also does not incur the overhead of encryption.

MPLS moves packets in a manner that is similar to LAN switching. Over a wide area network, it uses short, fixed-length labels to identify the path that packets (datagrams) are supposed to take. Layer 2 switching, used in LANs, has always been very fast but difficult to use in larger public and private networks. MPLS provides this efficiency; also, the ability to dynamically reroute traffic is a significant benefit for congested networks. The net result to customers will be better service levels and faster throughput. Use of MPLS will undoubtedly grow due to its inherent efficiency, lower cost, and ability to self-optimize by rerouting around bottlenecks. Most routers today have built-in support for MPLS.

Roadblocks to IP Telephony

Gatekeepers. Gatekeepers have a completely different function than gateways. By controlling H.323 and SIP (Session Initiation Protocol) endpoints (e.g., for Microsoft’s NetMeeting), gatekeepers control the communications between various IP devices on the network. For example, they perform address translation so that users can call “Bill Smith” rather than “171.144.228.45.” Gateways, which will be discussed later, are workhorses
that handle TDM-to-IP translations and other internal voice processing chores. For a small IP telephony installation, a gatekeeper may not be necessary; however, larger implementations require a gatekeeper. For example, VoIP gateways register at the gatekeeper, and the gatekeeper identifies the correct gateway through which a caller can reach a specific user. The gatekeeper will typically include call registration, endpoint control, call services, and management of IP address directories. Some gatekeepers have a status port (TCP port 7000) that can be used to monitor/control the network. An example gatekeeper is the Solphone GK-3250, which manages 1024 gateways and 250 concurrent calls. The smaller version, the GK-3025, manages 100 gateways and 25 concurrent calls.

Signaling Protocols. SIP (Session Initiation Protocol) and H.323 are often portrayed as contenders for IP telephony signaling leadership. They will coexist for several years, and any network that runs one should at least interface with the other. Some networks run both.
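The gatekeeper functions described above (registration and address translation) can be pictured with a small Python sketch. It is a toy model, not an H.323 or SIP implementation: the class name, the method names, and the longest-prefix rule for choosing a gateway are all assumptions made for illustration, and the gateway addresses are made up.

```python
class Gatekeeper:
    """Toy registry: endpoints and gateways register, callers resolve names."""

    def __init__(self):
        self._endpoints = {}  # user alias -> IP address
        self._gateways = {}   # dialed-number prefix -> gateway IP address

    def register_endpoint(self, alias, ip_address):
        self._endpoints[alias.lower()] = ip_address

    def register_gateway(self, prefix, ip_address):
        self._gateways[prefix] = ip_address

    def resolve(self, target):
        """Return the IP address to signal for an alias or a dialed number."""
        ip_address = self._endpoints.get(target.lower())
        if ip_address is not None:
            return ip_address
        # Assumed policy: the longest matching prefix selects the gateway.
        matches = [p for p in self._gateways if target.startswith(p)]
        if matches:
            return self._gateways[max(matches, key=len)]
        return None


if __name__ == "__main__":
    gk = Gatekeeper()
    gk.register_endpoint("Bill Smith", "171.144.228.45")
    gk.register_gateway("1713", "10.0.0.5")  # hypothetical gateway address
    print(gk.resolve("bill smith"))
    print(gk.resolve("17135550100"))
```

The point of the sketch is the division of labor: endpoints and gateways announce themselves once, and every caller then asks the gatekeeper where to send signaling instead of keeping its own address list.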
H.323, the older of the two protocols, is an umbrella ITU standard that describes packet-based audio, video, and data conferencing (delay-sensitive traffic). Other standards, such as H.245 and Q.931, fall under H.323. Together, these standards determine how a call (or session) is set up, monitored, terminated, etc. across a network. Other protocols, such as RTP, actually carry the IP telephony voice traffic.

Microsoft, by including a complete SIP stack in Windows® XP, is betting that SIP will be key to telephony, instant messaging, and other real-time multimedia communications. SIP has moved from a dark horse to the front of the pack. As a straightforward, text-based protocol, it links IP phones, soft switches, and gateways. SIP lets other protocols perform functions such as security (e.g., IPSec); its mission is to link devices and do session initiations. Because SIP is such a bare-bones protocol, others can add specialty protocols without changing the fundamental session initiations that allow devices to talk to each other. H.323, on the other hand, tries to exhaustively specify many more conditions, and thus is not as flexible or efficient.

Developers are creating add-ons to SIP all the time. For example, SIP-CPL (Call Processing Language) provides an XML-based system to direct call flows. SIP telephones are penetrating the market. Some potential benefits include the following:
• An easy-to-include, built-in smart card reader eliminates manual logins
• Independence from the PBX; SIP phones act more like computers than telephones
• Ability to talk to directory servers and perform more complex queries
• Large number of speed dial numbers (many hundreds) can be programmed into the phone itself
• Easy integration with soft switches, such as Interactive Intelligence’s soft-PBX

Cisco offers a PIX firewall that provides standard security services while allowing outside SIP connections. SIP, because of its simplicity and flexibility, will increasingly enable real-time communications devices to communicate and use network facilities.

Example Applications of IP Telephony. Traditional, TDM-based PBXs and
adjuncts support thousands of applications. In some cases, IP telephony has not caught up yet, but the functionality lag is diminishing quickly. Most of the current “IP telephony” applications are the same ones that have been around for several years: unified messaging, personal assistants, 911 support, speech recognition, etc.

Aside from infrastructure differences, which are not visible to the average user, there are some real differences at the station level. The IP phone is not a digital telephone with specialized, highly proprietary chips in its circuitry. It is a computer! Listed below are just a few applications. Like any other computing system, IP phones can be programmed to do what makes business sense:
• Hotels, room-service ordering. By ordering from an electronic menu shown on an IP phone, customers avoid having to call and potentially get put in a queue, or have a misunderstanding about the order. The hotel avoids card printing and card collection. Other hotel applications include flight information, recreational services available (e.g., golf tee times), and entertainment.
• Airlines. Using on-board IP telephones, customers can make gift purchases, and check weather, stock pricing, and other information.
• Educational institutions. Students can register for class, send targeted e-mails, buy books, and obtain football or basketball tickets.

SUMMARY
IP telephony continues to gain market share; no one now questions the outcome of the TDM-versus-IP telephony competition. The economics of IP telephony/VoIP are compelling because of the enormous bandwidth efficiencies, compared to traditional circuit-switched networks. Many organizations have existing data networks that could easily tolerate the relatively small increase in traffic from voice. Standards, such as SIP and H.323, are
enabling organizations to use multiple vendors and drive a much higher quality of sound to the end user. IP telephony will not be toll quality if the design and implementation of the voice/data network are poor. Fortunately, the algorithms are getting better, along with the underlying edge equipment. After determining if the savings merit the investment and attention (see Note 3), an organization should consider a pilot program. Although the technology can do the job, equipment parameters, QoS, network management, and integration into the LAN/WAN should be carefully tested and reviewed prior to installation.

Notes
1. Contrary to popular belief, nothing is cheaper than a mainframe for routine, heads-down operations. In most cases, however, it is more important for people to be efficient than for the architecture to be efficient.
2. The author has seen several instances where CAT 3 wiring has been incorrectly used to transmit data from a switch to end users (CAT 5e is the minimum grade for data). Users experienced chronic slowdowns in response time as TCP/IP frantically retransmitted packets over inadequate wire. This will not work in IP telephony!
3. Many organizations are still amortizing TDM-based PBXs that were purchased in the late 1990s.
Chapter 3
CTI Concepts and Applications

INTRODUCTION
Computer telephony integration (CTI) is the set of software and hardware components that allows a computer to manage telephone calls and integrate telephony services into desktop computers, servers, PBX devices, and other computing equipment. The generic term CTI is often used to include IVR (interactive voice response), unified messaging, and sometimes, Voice-over-IP (VoIP). CTI is growing at roughly 30 percent per year, with the small office–home office (SOHO) segment accounting for a significant part of that growth. This chapter focuses on the technological underpinnings of the myriad of applications that are being developed for organizations of all sizes. Later in the chapter, examples of CTI implementations are described.

Viewed in the perspective of several decades, the drive toward CTI is similar to the escape of the user from the straitjacket of the mainframe in the 1980s. By using personal computers, far more computing power was brought to knowledge workers. CTI is equally revolutionary; the power in servers and the PBX is now being distributed to many more people than before. The value of telephony is at last being fully realized.

GENERAL FUNCTIONS OF CTI
Later sections in this chapter address CT functions at a detailed level. From a high-level perspective, CTI provides the following:
• First-party call control. Using a PC and modem, a telephone or telephone-like instrument can make calls based on a database or a directly entered phone number. First-party control covers calls handled from a single computer, but not calls going to others in the organization (e.g., it cannot reroute other agents’ work when they are on vacation). The computer is connected to the telephone so that calls can be monitored, initiated, and answered. In this mode, the telephone set can function completely independently of the PC.
• Third-party call control. Ability to view and control calls coming to others in the organization (e.g., pick up a call ringing on another person’s extension). The telephone set is linked to the PBX or telephony server. The PC is connected only to the LAN. Hence, a “third party” or the PBX/communications server links the two. In order for the PC to initiate telephony actions, a signal must be sent via the LAN to the telephony server, which in turn causes actions to take place via the telephone set.
• PC as telephone (softphone). With appropriate circuit boards (e.g., Dialogic boards), the PC can be made into an ersatz telephone, with the standard functions shown as graphical icons. This is often a SOHO solution.
• Automatic recognition of incoming calls based on caller ID or ANI (automatic number identification). Use of that information to retrieve or manipulate information on databases.
• Automatic logging of call activity (via logs with date and time stamps). Fax, e-mail, and voice mail messages associated with a specific customer can be stored together and indexed to provide a complete record of communications with the customer.
• Fax management.
• A series of voice-activated services, such as text-to-speech or speech-to-text.
• Display information about the call on the monitor. Using CTI, previous calls and the current call can be shown in far more detail than is available with the limited LCD display space found on most telephones.
• Multiple call handling facilitated via a computer screen interface.
• Easy setup of multiparty calls.
• Using Internet tools such as JTAPI (Java telephony application programming interface), browsers can be telephony enabled so that customers and employees can use online directories and complete transactions using voice communications after reviewing information on the Internet.
• Allows for innovative functions such as locating key employees in an emergency (e.g., call first goes to office number, then cellular phone, then pager, then to the next person on the emergency notification list, etc.). Personal agents can read the time, date, and topic of a meeting over the telephone to the appropriate party.

BASIC ARCHITECTURE
The three common configurations for computer telephony integration include:
1. Client/server. The connection between the telephone and the desktop is a logical, not physical connection. Telephony connections and
functions are implemented via a telephony server (third party). See Exhibit 1.
2. Desktop. There is a direct, physical link between the telephone and the desktop. Special hardware is required to ensure that the proper dual tone multi-frequency (DTMF) signals, etc., can be sent to the PBX from the computer. This first-party configuration is maintenance intensive for large organizations. See Exhibit 2.
3. PC as telephone (softphone). With the appropriate telephony boards and multimedia hardware, the PC can function as a telephone. Of course, when the LAN or network goes down, the user has no telephone. Given the problems in most organizations’ legacy wiring infrastructure, this option has not been as widely deployed as one would think, considering that there is no physical telephone to buy and maintain. However, call centers and other high-volume environments are using softphones to simplify activities such as conferencing, as well as to add business functions. For example, an agent could have a softphone with buttons for mortgage loans and unsecured loans. By clicking a button, the appropriate application, perhaps a Visual Basic program, could be started as required.
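The client/server (third-party) configuration in item 1 can be sketched as a simple message flow: the PC asks the telephony server to place a call, and the server, not the PC, drives the telephone through the PBX. The Python classes and method names below are hypothetical stand-ins for illustration; they do not correspond to any vendor's product or API.

```python
class Pbx:
    """Stands in for the switch that actually controls the telephone sets."""

    def __init__(self):
        self.events = []

    def originate(self, extension, dialed_number):
        self.events.append(f"ext {extension} dials {dialed_number}")


class TelephonyServer:
    """The 'third party' that links desktop PCs to telephone sets."""

    def __init__(self, pbx):
        self._pbx = pbx
        self._desk_map = {}  # PC name -> extension (a logical link only)

    def associate(self, pc_name, extension):
        self._desk_map[pc_name] = extension

    def make_call(self, pc_name, dialed_number):
        extension = self._desk_map.get(pc_name)
        if extension is None:
            raise KeyError(f"no telephone associated with {pc_name}")
        self._pbx.originate(extension, dialed_number)
        return extension


if __name__ == "__main__":
    pbx = Pbx()
    server = TelephonyServer(pbx)
    server.associate("pc-042", "2301")
    server.make_call("pc-042", "7135550100")  # request arrives over the LAN
    print(pbx.events)
```

Because the PC-to-telephone association lives only in the server's table, no cable runs between the two devices, which is the property that separates this configuration from the desktop (first-party) one.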
Exhibit 1. Third-Party CTI
Exhibit 2. Desktop CTI Architecture
IP telephony provides even more options. If the telephone on the desktop (or wireless unit) is a small “computer,” then it will have capabilities that allow it to perform functions independently. For example, using a JTAPI program (discussed later), the telephone could obtain conference room availability or initiate a directory lookup.

APIS AND CT STANDARDS
CTI would be considerably more difficult without application programming interfaces (APIs). The code to control telephony via the computer would be complex and would vary for each platform. The developer needs the ability to send commands to a software layer that abstracts the hardware layer from the application. APIs also allow code to be ported (with a reasonable level of effort) to older hardware, thus saving the organization considerable expense.

Computer telephony has been around for many years. Before the proliferation of APIs, it was accomplished via proprietary and tedious coding; customers got only what their PBX vendor offered. APIs are used by a program to accomplish a specific activity. For example, a programmer could write a simple dialer in Visual Basic or Java that would dial any number highlighted; the user may keep a name and address list in MS Word and invoke a Visual Basic program that dials the highlighted number via a modem connection. With the growing use of call controls from the desktop, a standard interface is essential to allow developers a straightforward way to provide users with all the telephony options
available on the machine. APIs provide many functions, including ODBC (open database connectivity) database access, MAPI (messaging) applications for e-mail, transport layer interface programs, and other functions. APIs in commercial use today include TAPI (telephony), TSAPI (telephony services), JTAPI (Java telephony), IBM’s CallPath, Apple’s Telephony Manager, CSTA (Computer-Supported Telecommunications Applications), and the Linux telephony API. The most prominent of these APIs are discussed here. The architectures differ significantly and should be reviewed carefully before major telephony development is initiated or purchased.

TAPI
Telephony application programming interface (TAPI) was originally developed by Intel and Microsoft in 1994 and is delivered with all current Windows® architectures. For example, a home user with a modem line and running Windows XP could write a program in Visual Basic to dial a number based on a name in a private directory. It could also be used in an Excel macro to make a phone call to someone if a certain cell has a negative balance. The fact that any Windows-based program can be telephony enabled is an exciting prospect. It means that any perceived event can trigger a telephony action or that a telephony action (e.g., a phone call from the boss) can trigger an event such as launching Excel and displaying the latest summary sales projections. It is easy to be creative with this capability. For example, you could have your stockbroker’s telephone number tied to a trigger that queries the price of ABC stock on the Internet. If the price exceeds your target, the trigger can launch a Word document that says “Call stockbroker and sell ABC.”

TAPI supports both traditional PSTN telephony and IP telephony. Like any other API, TAPI requires that the manufacturer write the low-level code (driver) that allows high-level TAPI commands to communicate with the hardware.
These drivers are formally called TSPs (TAPI service providers). Windows has a built-in TSP, called Unimodem, that allows connectivity to most manufacturers’ modems. All other telephony hardware, such as PBXs and DSPs (voice processing cards), need a TSP to be TAPI addressable. Because not all hardware has the same capabilities, not all TAPI calls will be supported by every TSP. For example, if one telephony card supports caller ID and another one does not, then TAPI can only retrieve the information from the card that supports caller ID.

Programs that are not TAPI compliant cannot share telephony devices. For example, assume a user needs a simple answering machine application via a voice modem. In addition, outgoing calls need to be made on the same voice modem (to perform dial-up networking). These two applications compete for the same resources, and one must be shut down before the other is invoked. Less than elegant.
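The idea that a common API sits above drivers with differing capabilities can be illustrated with a short Python sketch. It does not use or imitate the real TAPI interfaces; the class names, the capability strings, and the sample telephone number are all hypothetical, chosen only to show the application checking what a driver supports before calling it.

```python
from abc import ABC, abstractmethod


class ServiceProvider(ABC):
    """Hypothetical stand-in for a hardware driver behind a common API."""

    capabilities = frozenset()

    @abstractmethod
    def dial(self, number):
        """Place a call; every provider must support this."""


class BasicModemProvider(ServiceProvider):
    capabilities = frozenset({"dial"})

    def dial(self, number):
        return f"modem dialing {number}"


class TelephonyCardProvider(ServiceProvider):
    capabilities = frozenset({"dial", "caller_id"})

    def dial(self, number):
        return f"card dialing {number}"

    def caller_id(self):
        return "7135550100"  # made-up number for the example


def get_caller_id(provider):
    """Return caller ID if the driver supports it, otherwise None."""
    if "caller_id" in provider.capabilities:
        return provider.caller_id()
    return None


if __name__ == "__main__":
    for provider in (BasicModemProvider(), TelephonyCardProvider()):
        print(type(provider).__name__, get_caller_id(provider))
```

The application code is identical for both devices; only the answer differs, which mirrors the caller ID example in the text.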
As an aside, it is interesting to note that voice modems have some limitations compared to their more expensive telephony card cousins:
• Limited audio quality
• Spotty DTMF tone detection, making IVR applications less reliable
• Busy detection that is not always effective
• Lack of support for headset devices
• No ring-back signaling
• Unreliable remote party hang-up signaling
There are other limitations to voice modems not listed; the point is that computer telephony cannot exceed the hardware upon which it depends. Professional-quality applications (except for some limited home use) need high-quality and highly functional telephony boards.

TAPI-compliant telephony devices work best with digital rather than analog connections. Call progress detection is key to telephony applications (it is important to know when the other party picks up!); digital connections provide more detailed and reliable call progress detection. Of course, call progress detection works with analog lines; however, if software is distributed internationally, it often needs to be modified for each country because the characteristics of analog signaling vary (unlike standard ISDN or T1).

When first introduced, this first-party API offered a rather limited set of telephony capabilities. Its intent was to abstract the hardware layer so that developers can create software not dependent on specific hardware (device) configurations. Microsoft has steadily enhanced the capabilities to include support for IP telephony (in other words, the application using TAPI 3.0 does not have to know if it is connecting to the PSTN or an IP network) as well as Intel’s universal serial bus (USB). TAPI is supported by more than 100 CT vendors and appears to be gathering momentum.

Many Windows applications use TAPI. For example, ACT! Contact Manager dials numbers from its database. It pops a screen for incoming calls matching a telephone number in the database. TAPI also includes ActiveX support, which allows developers to use a graphical environment to develop modular programs that can be assembled into robust applications (ActiveX will be discussed in more depth later). With the addition of object-oriented tools, TAPI is a strong competitor for market dominance.
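The screen-pop behavior mentioned above, matching an incoming caller's number against a contact database, comes down to a normalized lookup. The following Python sketch shows one way it might work; the function names, the last-ten-digits matching rule (a North American assumption), and the sample contact are illustrative and are not taken from ACT! or TAPI.

```python
import re


def normalize(number):
    """Keep digits only and compare on the last 10 (assumed NANP format)."""
    digits = re.sub(r"\D", "", number)
    return digits[-10:]


def screen_pop(incoming_number, contacts):
    """Return the contact record matching the caller's number, or None."""
    key = normalize(incoming_number)
    for stored_number, record in contacts.items():
        if normalize(stored_number) == key:
            return record
    return None


if __name__ == "__main__":
    # Made-up contact; the number uses the fictional 555-01xx range.
    contacts = {"(713) 555-0142": {"name": "Pat Lee", "account": "A-1007"}}
    print(screen_pop("+1 713 555 0142", contacts))
    print(screen_pop("7135550199", contacts))
```

Normalizing both sides before comparing matters in practice because the number delivered by caller ID or ANI rarely has the same punctuation as the number a user typed into the database.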
TAPI 3.0 is implemented as a suite of COM (Component Object Model) objects rather than a TAPI/C Windows API set, as was the case with TAPI 2.1. Evolving to the COM model allows TAPI applications to be written in any language, such as C/C++ or Visual Basic. Other standards/functions supported include H.323 conferencing, improved QoS, and IP multicast conferencing. By using the Windows
Active Directory service, deployment of applications is simpler for larger organizations. The Active Directory feature is important. In a typical packet-based network, a user’s network address (IP address) changes frequently between sessions. According to Microsoft, “user-to-IP mapping information is stored and continually refreshed using the Internet Locator Service (ILS) Dynamic Directory, a real-time server component of the Active Directory.” (See Note 1.)

TSAPI
Telephony services application programming interface (TSAPI) was one of the first APIs and was developed by Novell and AT&T (Avaya). It is designed to integrate the PBX with Novell’s Netware LAN. Major functions include call control, call routing, monitoring, directory services, and query. It is server rather than client based and requires a Netware file server to be physically linked to the PBX. TSAPI provides good call control because of the logical link between the telephone and the workstation. Server-based applications have the advantage that no physical link is required between the PC or workstation and the telephone (as is the case, for example, with Siemens’ COM Manager). This makes TSAPI less costly to implement in a large office where hundreds or thousands of legacy telephone sets may be in place. Another advantage of TSAPI is that many operating systems are supported (e.g., Mac OS, UNIX).

At one point, TSAPI seemed to be languishing in the marketplace, but recent efforts by Avaya to continue development may lengthen its life. It appeals to large companies that may have multiple legacy operating systems. For example, Avaya has partnered with CCOM to support Phoneline, an application that allows the user to dial numbers, to conference, and to perform many functions on the PC that would normally have to be done on a handset. Phoneline uses TSAPI to drive directory lookup, logging of calls, and quick call setup. The telephony work is done at the server level, not at the PC.
TSAPI has been a favorite of call center developers because of the rich call-control feature set (e.g., support of agents, skill-based routing, groups of interest). The biggest threat to TSAPI appears to be TAPI, which is supported by Microsoft.

JTAPI
Java telephony application programming interface (JTAPI) is a portable (i.e., can be ported to various operating systems), object-oriented API for Java-based computer telephony applications. Introduced in 1996 by Sun Microsystems in collaboration with a number of telecommunications firms, JTAPI supports first- and third-party development. A Java virtual machine (VM) and a JTAPI-compliant telephony subsystem are the only requirements. Browsers such as Netscape and Internet Explorer are delivered with Java
AU1438_frame_C03 Page 56 Tuesday, November 5, 2002 12:09 PM
VMs. As a result, Web-based applications using JTAPI can run on the majority of platforms, using applets embedded in HTML pages. JTAPI is perceived as more open than the other APIs. Indeed, packages such as IBM’s CallPath, TAPI, and SunXTL (for the SPARC/Solaris environment) can invoke JTAPI modules for telephony functions. Another strength of this young product is its portability: developers can move from one platform to another with little modification to the code.

The Java telephony model includes the common “objects” that are found in real phone calls: ringing, connection, addressing, termination, etc. JTAPI can interface with a telephony card inside a PC as easily as with a traditional PBX. Addresses are, of course, telephone numbers. JTAPI has some ambitious goals:
• Simplify telephony development
• Support both first- and third-party call control
• Provide an API for a wide variety of telephony applications and hardware/software platforms
• Interface with existing TSAPI- and TAPI-based applications
• Support more complex functions such as call center routing, with extensions to the basic set of features
• Supply reusable objects to speed development

JTAPI is seen in the market as an enabler of the “Internet phone,” where Web access and telephony are combined. As an example of JTAPI implementation, consider a typical inbound call center. A customer service group may receive calls directed by an intelligent ACD (automatic call distributor), and then use an intranet or the Internet to access the business application and get screen pops as calls are answered. JTAPI programming (similar to the Netscape plug-ins familiar to most home Internet users) would allow telephone functions (make a call, transfer a call, conference, etc.) in a frame within the browser.
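The telephone functions mentioned above act on the call model objects described earlier (calls, connections, addresses, ringing, termination). The following is a minimal sketch of such a model, written in Python for brevity. The class and state names loosely echo JTAPI's core `Call`, `Connection`, and `Address` concepts, but this is an illustrative model under that assumption, not the JTAPI API, and it omits providers, terminals, and events.

```python
from enum import Enum


class ConnState(Enum):
    IDLE = "idle"
    ALERTING = "alerting"          # the far end is ringing
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"


class Connection:
    """Links one address (a telephone number) to one call."""

    def __init__(self, address):
        self.address = address
        self.state = ConnState.IDLE


class Call:
    def __init__(self):
        self.connections = []

    def connect(self, origin, destination):
        """Place a call: the origin is connected, the destination rings."""
        near, far = Connection(origin), Connection(destination)
        near.state = ConnState.CONNECTED
        far.state = ConnState.ALERTING
        self.connections = [near, far]
        return far

    def answer(self, conn):
        if conn.state is not ConnState.ALERTING:
            raise ValueError("only a ringing connection can be answered")
        conn.state = ConnState.CONNECTED

    def drop(self):
        # Termination: every party leaves the call.
        for conn in self.connections:
            conn.state = ConnState.DISCONNECTED


call = Call()
ringing = call.connect("5550100", "5550199")
call.answer(ringing)
call.drop()
```

A browser frame offering "make a call" or "drop" buttons would, in this picture, simply invoke operations like these on a server-side call object.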
A browser-enabled agent can also use Java Media Framework (JMF) and Java Speech to integrate audio and video, so that the customer and the call center agent can see each other while they are talking. Traditional call center statistics (number in the queue, average wait time, etc.) could be programmed to display on the agent’s browser screen.

Similarly, computer telephony using JTAPI (or ActiveX, for that matter) can help outbound call centers. CT-enabled dialers fall into the following categories:
• Call preview dialers. CT obtains database records on a specific customer before the call is placed so that the agent is fully informed before speaking to the customer. This would typically be implemented for high-value customers (e.g., more than $3000 business annually for a retail customer).
• Predictive dialers. CT does all the hard work of dialing, listening (via tones) for voice mail, ring no answer, and other conditions that would preclude a live conversation, then turns over (most of the time) a “live” customer to the agent. Screen pops are sometimes used for prospective customers (e.g., to go against demographic data and judge the approximate family income of the prospective customer by the zip code or address).

Using an Internet connection, telecommuters can participate in the full functionality of an office-based communications system while still working from home or a remote office. The organization benefits from lower operating costs (in major cities, the cost of office space is frequently $5,000 to $10,000 per person, per year), extended-hour coverage, more robust/quicker response to central disaster occurrences, and better workforce management. JTAPI applications via an ISDN line provide the functionality over a wide range of equipment; the telecommuting agents can answer calls, transfer calls, and participate in ACD routing as appropriate to their skill set.

Developers in Java use the JMF API to synchronize and control time-based data such as audio and video from within a Java application or applet. On the surface, it is analogous to the popular RealAudio Player familiar to home Internet users who listen to their favorite stations over the Internet (via packetization of music). However, this Java media player controls audio, video, and MIDI across all Java-enabled platforms. JMF will support new CODECs. One of Sun Microsystems’ goals for JMF is to support as many media types as possible and to make JMF function on a wide variety of hardware platforms. JMF players can be written in the Java programming language. JMF includes the following audio formats:
• AIFF. Audio interchange file format
• AU. UNIX file format
• DVI. Digital Video Interactive
• G.723. An important standard, it is the ITU-T’s recommendation for compressed audio over standard POTS lines
• GSM. Standard for cellular compression
• IMA4. Interactive Multimedia Association standard for multimedia systems
• MPEG. Moving Picture Experts Group; standard for reduction of storage requirements for video
• PCM. Pulse code modulation; a common algorithm for converting an analog voice signal into a digital representation
• RMF. Rich Music Format; allows high-quality music and sound to be delivered over the Internet or IP circuits
• WAV. Well-established digital representation of sound waves

JMF 2.0 will support sound and video capture. Organizations that need to disseminate audio/video information should consider standards such as JMF when developing their multimedia infrastructure. The advantages and disadvantages of TAPI, TSAPI, and JTAPI are summarized in Exhibit 3.

CSTA

Computer-Supported Telecommunications Applications (CSTA) was initially championed by Siemens and has been adopted by the European Computer Manufacturers’ Association. It is vendor and media independent, and includes media blending of telephony using computer applications. For example, a PC running a softphone application can communicate through a CSTA interface to a telephone system connected to the PSTN to make a phone call. It has not been as popular in the United States as in Europe. TAPI 3.x, with the support of Microsoft, could further marginalize CSTA and other telephony APIs.

Linux Telephony API

The open source movement is here to stay. Computer telephony is included with all the other functions that Linux, the open source operating system, provides. Quicknet Technologies is one of the vendors that provides a full set of drivers for Linux. OhPhone (http://www.openh323.org) is an open source program that developers use as a model for Linux computer telephony systems. As Linux systems continue to proliferate, more tools will undoubtedly be available.

Exhibit 3. Comparison of the Major Computer Telephony APIs

API: TSAPI
Advantages: Most mature product; supports complex routing and other functions needed by call centers; better for environments with hardware/software diversity
Disadvantages: Novell has dropped support; third party is only option (may not be economical for small business)

API: TAPI
Advantages: Most popular by far; suitable for small companies (direct connection between phone and computer)
Disadvantages: Restricted to Windows-based platform

API: JTAPI
Advantages: More open; has more features than TAPI; focus on reusable objects and object libraries
Disadvantages: Has not had the same level of public exposure as TAPI; newer releases of TAPI could become a de facto standard
ActiveX

ActiveX is part of Microsoft’s component technology known as COM (Component Object Model) and now DCOM (distributed COM for communication across networks). ActiveX allows developers to build dynamic, telephony-enabled Web pages by using software objects rather than coding detailed TAPI functions. When the end user views a page that has ActiveX controls, the components of the HTML page are downloaded to the user’s hard drive and stay there for later use. The controls are objects created to perform specific telephony (or perhaps graphical) functions on the Internet, intranet, or telephony network. This is a convenient way to keep an application updated on each user’s desktop. ActiveX controls mesh well with the Windows operating system interface (as one would expect), but are also being ported to Macintosh and UNIX. As a result, ActiveX may further encroach on the market share of Java.

ActiveX operates with a different philosophy than Java. Java and JavaBeans run exclusively in the Java Virtual Machine. This preserves security by preventing the Java executable from getting into the memory of other applications or executing low-level disk-drive functions. This reduces some of the capability of Java (e.g., use of the right mouse button). On the other hand, Java works easily across multiple operating systems. Microsoft is attempting to address the security holes by promoting its Authenticode 2.0 security system.

USING COMPONENT SOFTWARE

Writing code in Visual Basic, using TAPI to direct real-time telephony events, is for those who enjoy programming challenges and have lots of time. Real-world developers generally use packaged software components such as Microsoft’s ActiveX or Sun Microsystems’ JavaBeans to insert telephony features into applications. Productivity is greatly enhanced if real-time events can be handled by a package. The programmer must still understand what happens in a telephony environment (e.g., ring no answer), but the details can be handled by the component technology.

Java offers reusable components called “beans.” There are beans specifically designed for telephony that use JTAPI as their interface standard. Beans can be simple, such as a visual slider bean that allows the user to vary a field between 0 and 100 percent, or a calendar bean that provides date information. Like predecessor packages such as Visual Basic, Powerbuilder, or Delphi, JavaBeans can be assembled to construct applications (including telephony applications).
Using ActiveX, for example, telephony functions can be set up via drag and drop of reusable components in an application like MS Access or Visual Basic. Using these components, the developer can answer calls, select caller data from an IVR repository, generate voice prompts, and initiate screen pops. As the graphical, reusable components become easier to assemble and debug, it is likely that the industry will move toward more customization of software rather than the limited choice, off-the-shelf, hard-to-change software previously available.

There are many packages that customer development staff can use to enhance development of telephony applications. For example, Prime CTI provides middleware that enables Windows-based contact managers, personal information managers (PIMs), and other database programs to interface with traditional PBXs and key systems. Another vendor, Sunny Beach Technology, offers a low-cost, speech-enabled TAPI telephony package (Active Call Center). Automated attendants, answering machines, voice mail-to-e-mail, and other applications can be developed rapidly. Packages such as these reduce the complexity of TAPI and TSAPI into a smaller group of simpler calls that accomplish the same functions.

DISTRIBUTED VERSUS DESKTOP CT

As CTI first began to be deployed in large organizations, the solutions (such as TAPI) were desktop oriented. Telephony boards had to be purchased for each client. Upgrades were expensive, because a technician had to “touch” each workstation; also, telephony boards placed in workstations (such as a fax board) cannot be shared. The solution that large organizations are moving toward is client/server (distributed) telephony. By placing telephony functions on a server, clients can move to a software-only environment (no local telephony hardware). Advantages to this approach include:
• The bulk of the maintenance is at the server rather than at the workstation.
• Telephony resources can be shared throughout the organization.
• Upgrades are greatly simplified.
• Enterprisewide cost is significantly reduced; because resources are shared, economies of scale follow.
• Ports, boards, and other capacity-sensitive resources can be added to the server without affecting the desktop.

SMS Controls

Short messaging service (SMS) applications are growing rapidly. Typically read on a wireless PDA, pager, or cell phone, these short messages are an excellent vehicle for timely alerts or notifications. Telephony events can initiate SMS transmissions. For example, a high-priority voice mail might trigger a message alerting someone of a special trading deal that needs to be handled immediately. Allen-Martin, Inc. offers a package, amSMS, that provides tools for implementing SMS within a TAPI environment. amSMS is written in Visual Basic 6.0 and uses TAPI for modem call control. By using the TAP (Telocator Alphanumeric Protocol) as its communication mechanism, amSMS can communicate with many devices.

INTEROPERABILITY STANDARDS

The telephony world tended to be proprietary and closed until the 1990s. As applications became more complex, and a much larger number of CTI vendors entered the field (as a result of increased user demand), the ability of hardware and software to interoperate became more important. A full discussion of telephony standards would require many tomes. Some of the more important and visible standards are discussed here.

MVIP

Multi-Vendor Integration Protocol (MVIP or MVIP-90) has become the standard for integrating heterogeneous technologies such as telephone circuit boards. Within a single hardware platform (e.g., a PC chassis), multiple vendor cards can coexist as long as they conform to the MVIP standard. MVIP supports 32 Mbps throughput and 256 full-duplex paths on which data can flow (this should be adequate for even enterprise-level CT). In place since 1990, MVIP has been widely adopted for voice, fax, data, and video services by more than 170 vendors manufacturing board-level MVIP products. Consider, for example, the hardware requirements for a large audio conference or traffic originating from a T3 or SONET link. MVIP ensures that there will be adequate capacity to handle all the traffic within a single chassis or across multiple-chassis nodes.
The following quote from Mitel’s Communicating Objects Division provides the flavor of the standard: The MVIP bus consists of communications hardware and software that allows printed circuit cards from multiple vendors to exchange information in a standardized digital format. The MVIP bus consists of eight 2-MB serial highways and clock signals that are routed from one card to another over a ribbon cable. Each of these highways is partitioned into 32 channels for a total capacity of 256 full-duplex voice channels on the MVIP bus. These serial link from one card to another…2
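The capacity figures in the quote can be checked with a few lines of arithmetic. This is a sketch under two assumptions that the quote does not state: each channel is a standard 64 kbit/s PCM voice slot, and a full-duplex channel occupies one such slot in each direction. Under those assumptions the numbers line up with the 256 paths and roughly 32 Mbps cited for MVIP.

```python
HIGHWAYS = 8                # serial highways, per the quote
CHANNELS_PER_HIGHWAY = 32   # channels per highway, per the quote
CHANNEL_RATE_BPS = 64_000   # assumption: one 64 kbit/s PCM voice slot per channel

# 8 highways x 32 channels = 256 full-duplex voice channels
full_duplex_channels = HIGHWAYS * CHANNELS_PER_HIGHWAY

# 32 channels x 64 kbit/s = 2.048 Mbit/s per highway (the "2 Mb" highway)
highway_rate_bps = CHANNELS_PER_HIGHWAY * CHANNEL_RATE_BPS

# Aggregate payload in one direction, then in both (assumed full-duplex reading)
one_way_bps = HIGHWAYS * highway_rate_bps
both_ways_bps = 2 * one_way_bps

print(full_duplex_channels)        # 256
print(highway_rate_bps / 1e6)      # 2.048
print(both_ways_bps / 1e6)         # 32.768
```

Read this way, the "32 Mbps" figure is the sum of both directions of all 256 channels rather than the rate of any single highway.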
H.100 and H.110

Part of a group of interoperability agreements, H.100 specifies hardware configurations at the chassis card slot for a CT bus interface. It allows the industry to concentrate on the more-pressing issue of software compatibility by reducing or eliminating incompatible buses. H.100 supplants both the competing MVIP-90 and SCbus standards. The specification was developed by an ECTF (Enterprise Computer Telephony Forum) working group. H.110 differs from H.100 in that H.110 supports CompactPCI hot swapping.

S.100, S.200, and S.300

Without voice-processing software standards, multiple (different vendor) applications could not run on the same physical box. S.100, also developed by ECTF, specifies how CT applications can be developed in an open environment, independent of supporting software. For example, an organization may want to run fax-on-demand and text-to-speech applications from different vendors on the same NT server. If both are S.100 compliant, then (at least in theory) they should be able to coexist on the same box. According to Call Center News (www.callcenternews.com):

S.200 defines a client/server protocol corresponding to the S.100 APIs. It defines the messages that are exchanged between the client application and the resource server. S.200 will enable the mixing of applications and servers from different vendors. S.300 makes it possible for developers to easily add different vendors’ technologies to a computer telephony server without the need for rewriting applications.

Both S.200 and S.300 are still under development.

H.323

Voice and other time-sensitive traffic are rapidly moving to packet-based networks. Anticipating this trend, the ITU developed a standard in 1996 that specifies how call control, channel setup, and CODEC specifications should work for real-time traffic on networks that do not provide a guaranteed QoS to ensure that the flow of media is smooth. The H.323 protocol supports ITU G.711 and G.723 audio standards. It also supports the Internet Engineering Task Force’s specifications for controlling audio flow to improve voice quality.

Microsoft’s NetMeeting 3.01 is an example of an application that uses the H.323 protocol. NetMeeting enables real-time, point-to-point audioconferencing over the Internet or corporate intranet. It includes features such as half- and full-duplex audio support for real-time conversations, automatic microphone sensitivity-level settings, use of MMX-enabled voice CODECs to improve compression performance, and microphone muting. By operating under the umbrella of the H.323 standard, this package can interoperate with other H.323 audioconferencing packages, such as Netscape Conference. NetMeeting also includes functions beyond audio, such as whiteboarding, videoconferencing, file transfer, shared clipboard, and chat.

SIP

Session Initiation Protocol (SIP) was discussed in Chapter 2. As a straightforward, text-based protocol, it links IP phones, soft switches, and gateways. SIP lets other protocols perform functions such as security (e.g., IPSec); its mission is to link devices and do session initiations. Because SIP is a bare-bones protocol, other specialty protocols can be added without changing the fundamental session initiations that allow devices to talk to each other. H.323, on the other hand, tries to exhaustively specify many more conditions and thus is not as flexible or efficient.

LDAP

Lightweight Directory Access Protocol (LDAP) is often viewed more as a data standard than a telephony standard. However, because it houses the “white pages” for an organization (i.e., it is a directory service), it becomes important for telephony addressing as well. In other words, “John Doe” might have an entry that includes not only his Internet ID, building, and room number, but also his telephone number and fax number. LDAP can function as a directory on its own or can link to a full-featured, X.500-compatible directory service. LDAP runs over TCP/IP networks and is relatively broad in its scope of addressing. For example, it can also store JPEG photographs, µ-law encoded sounds, URLs, and PGP (Pretty Good Privacy) keys.

DEVELOP VERSUS BUY

IT organizations have always faced the “make or buy” decision. With the introduction of JavaBeans and ActiveX development tools, it is much easier now for organizations to telephony-enable existing applications rather than having to buy a product for every need. This both reduces cost and increases application flexibility. Suppose, for example, a company has an intranet “find an employee” application. It would be far better to telephony-enable that application to allow the user to dial the person found via the search engine. ActiveX interfaces nicely with Internet Explorer and could produce the functionality relatively quickly. Otherwise, a separate package must be purchased (usually on a per-seat basis) and installed on an already-crowded desktop (in terms of memory and other resources). It also means yet another package for users to learn.

Microsoft has developed an architecture called WOSA (Windows Open Services Architecture). Using WOSA allows programmers to write a set of APIs that can be used to telephony-enable potentially thousands of applications that are now WOSA-compliant.
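The “find an employee” scenario above can be sketched in a few lines. This is an illustration only: the sample entries are invented, Python stands in for the ActiveX or JavaBeans component a real intranet page would use, and handing the number to the telephony layer as a `tel:` URI is an assumption, not something the text prescribes. The attribute names `cn`, `roomNumber`, `telephoneNumber`, and `facsimileTelephoneNumber` follow common LDAP directory schemas. The number normalization assumes 10-digit North American numbers.

```python
import re

# Invented sample entries shaped like directory records.
DIRECTORY = [
    {"cn": "John Doe", "roomNumber": "4-112",
     "telephoneNumber": "(713) 555-0142",
     "facsimileTelephoneNumber": "(713) 555-0143"},
    {"cn": "Jane Roe", "roomNumber": "2-007",
     "telephoneNumber": "(713) 555-0177",
     "facsimileTelephoneNumber": "(713) 555-0178"},
]


def find_employee(query):
    """Case-insensitive substring search on the common name."""
    q = query.lower()
    return [entry for entry in DIRECTORY if q in entry["cn"].lower()]


def to_tel_uri(number, country_code="1"):
    """Turn a displayed phone number into a dialable tel: URI.

    Assumes a 10-digit national number needs the country code prepended.
    """
    digits = re.sub(r"\D", "", number)
    if len(digits) == 10:
        digits = country_code + digits
    return "tel:+" + digits


for person in find_employee("doe"):
    print(person["cn"], to_tel_uri(person["telephoneNumber"]))
```

The search result already carries the telephone number, so adding a dial link is a small extension of the existing application rather than a separate per-seat product.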
APPLICATION GENERATORS AND CT ARCHITECTURE

One method to quickly generate telephony-enabled applications is to use an application generator, such as DavaViews’ DV-Centro. The best application generators have the following characteristics:
• Generate code in a standard graphical and object-oriented language such as Visual C++; code generation should work both ways: code can be generated from the graphical design, and the graphical design should be reconfigured when code changes are imported back into the design
• Link application to Web pages via ActiveX, Java, or similar architectures
• Interface graphically with the developer
• Maintain an object library for real-world telephony functions
• Limit or eliminate traditional, line-by-line coding
• Generate both source and executable code

CTI architecture is a serious, long-term decision for an organization. Sales presentations and demos, unfortunately, do not reveal all the key elements that will make a platform successful for an organization (whether an end-user organization or a systems developer). Development and long-term maintenance are two sides of the same coin and should be carefully considered or costs will escalate over time. There are somewhere between 50 and 100 application generators on the market now. Following are some characteristics that should be considered when selecting an architecture:
• True client/server architecture is essential; without it, systems cannot be scaled.
• Portability requires Java or a “Java-like” script language. In addition, the application generator must be able to address various PBX and network standards such as E1 and ISDN, function in different languages, and access a plethora of databases.
• The GUI should provide help for new developers, but not restrict the work of experienced developers.
• As in Visual Basic “project” files, all elements of an application should be tagged and saved as a unit.
• The script language should be easy to learn and readily adaptable to the unpredictable nature of caller interactions. Graphical flow of call processing must be obtainable from script code.
• Documentation generation must be automated and part of the package.
• Modular architecture for CTI applications, as was true even for COBOL mainframe programming, is essential for effective, long-term maintenance. As technology changes, only certain modules need to change, not the entire application.
• The hardware platform should have no single point of failure, or at least have the capability of functioning that way, if such robustness is required. For example, if all switching is concentrated on a single component, and it fails, the entire system fails. For critical business applications, this is not acceptable. Nonstop/redundant processing (via “hot-swappable” hardware) also supports 24/7 operations.
• Functions such as text-to-speech and database lookup should function independently in order to increase overall processing efficiency and allow programmers to independently add/change/delete modules as needed.
• Full Web enablement is essential. Resources such as telephony, fax, and access to databases should be available via a Web browser.
• System monitoring via SNMP to a monitoring package such as HP’s OpenView or IBM’s NetView is necessary for remote monitoring.
• Simulation tools, which allow component testing without live calls, are an important consideration. The CT package should mesh with the hardware testers available on the market.

MIDDLEWARE EXAMPLE

The generic definition of middleware is a software package that sits between one system (client) and another system (server) and facilitates the exchange of information between the two. To clarify this rather fuzzy definition, the following describes an illustrative middleware product, Concentric Solutions by Spanlink Communications.

An Example Middleware Product: Concentric Solutions

Concentric Solutions is a graphical development product that adds turnkey CT capabilities to Windows-based programs (including help desk software, personal information managers, and contact managers). It supports telephony enablement via TAPI and TSAPI. According to Spanlink (www.spanlink.com):

Concentric Solutions is easily installed with little or no training and requires no customer programming or setup costs. Concentric Solutions’ advanced CTI functions include inbound screen pops, intelligent call routing and call screening, coordinated voice and data transfers, outbound preview dialing, and robust screen-based telephony.

The critical eye of telecommunications managers who have implemented real-world CTI systems may take issue with the words “easily installed with little or no training.” However, the description is relative: compared to straight TAPI calls out of Visual Basic or Visual C++, middleware is simple to implement. According to Kevin Avery, Vice President of Marketing for Spanlink, “installation accounts for only 7 percent of the lifetime cost of a CTI
application; the rest is spent on maintenance and changes.” Hence, packages such as Concentric Solutions, which provide a simpler maintenance and development interface, allow customers to perform most of the work themselves. Avery noted that his firm was able to develop a comprehensive solution for a large computer manufacturer in less than two months; 9000 agents and 11 contact centers were included in the customization, 15 days were required for the analysis, and a 38-page document was created. The customer can now make modifications to call flow and link applications as needed, without assistance from Spanlink.

The following description of Concentric Solutions is based on a Spanlink white paper available on its Web page. Other firms, such as Aurora Systems, Pronexus, and Teledata Solutions, have competing middleware products.

Concentric Solutions’ telephony enablement of the Remedy Help Desk system will serve to illustrate the process and features of a middleware package. Remedy is a well-established help desk product that automates the tracking and reporting of problem calls (plus many other functions not addressed here). In a typical Remedy implementation, an ACD routes the call to a help desk agent’s desk either via ANI or through the use of an IVR. When Remedy is telephony enabled, screen pops show caller information (from the Remedy database) as the call comes in. Without the use of middleware, a simple screen pop might take weeks or months of expensive programming. That custom code would also require modification if the PBX were changed. Exhibit 4 illustrates Concentric Solutions integration with the Remedy Action Request System, using the client/server model. The mechanics of the screen-pop process are as follows:
• The PBX routes a call to an agent’s station.
• The PBX also communicates with Concentric Solutions (on the CTI server).
• A DDE (dynamic data exchange) process is executed and a Remedy Action Request macro runs on the agent’s desktop.
• Screens on the desktop are opened with relevant caller history, current information, etc.

The call processing flow at a more detailed level is as follows:
• Incoming call arrives. Calling number, called number, caller input, and time of call are identified as parameters.
• Call rules are checked (in priority order) against the incoming call parameters.
Exhibit 4. Concentric Solutions Integration (Client/Server Model)
Components shown: User Call, PBX/ACD, CTI Server, Help Desk PC (AR System Client, DDE), Help Desk Agent, Action Request System Server.
1. User calls the Help Desk; ACD takes caller info, routes call to appropriate agent; info passed to CTI server.
2. CTI server sends message to FastCall on agent desktop.
3. FastCall passes caller info to AR System.
4. Help Desk phone rings; AR System screen "pops open" with caller info filled in.
• A determination is made whether the incoming call should be accepted (answered) or forwarded.
  - If the incoming call is forwarded, determine whether it is forwarded to an attendant, voice mail, or a different extension.
  - If the incoming call is accepted, determine whether any triggers (events that start a programmatic process) should be executed.
• Execute appropriate triggers for screen pop or other processes, based on incoming call parameters. Number information is passed via ANI, DNIS (Dialed Number Identification Service; for toll-free numbers), or via IVR.

Concentric Solutions itself is a garden-variety Windows application (XP or NT) and is stored in its own subdirectory and program group. It supports TSAPI and TAPI on the telephony side and DDE/keystroke macro interfaces on the computer side. Exhibit 5 is a screen print of call control keys that allow a user to access telephony features from a workstation. The second major component of the software is shown in Exhibit 6: the Administration Program menu provides for the logical flow of telephone calls. Call rules, call activity views, application paths (i.e., to reach executable subdirectories), and other preferences are included in this menu.

Exhibit 5. Call Control Keys

Exhibit 6. Administration Program

When Concentric Solutions is first installed, it must be tailored to the existing telephony and network/server environment. Exhibit 7 illustrates the screen that is used to specify telephony rules for the package. The generic process includes the following:
• Define the Concentric Solutions application paths that specify how to launch other Windows applications. This creates a list of application names (e.g., Action Request System, MS Word, etc.) that can be invoked when a call comes in.
• Define the Concentric Solutions application triggers that are the specific actions to be taken when a trigger is fired. Triggers are the specific functions in the other applications that are invoked by Concentric Solutions (e.g., invoke the Action Request System client and pass parameters to a submit macro).
Exhibit 7. Specification of Telephony Rules
• Define lists of calling numbers, called numbers, and caller input. These lists describe who might be calling, who they might be calling for, and the types of input the caller might give to an IVR. These lists are referenced by various Concentric Solutions processes.
• Define other setup parameters, such as default area codes, extension formats, etc. These parameters help Concentric Solutions interpret telephone numbers.
• Define sets of incoming call rules that determine which application triggers to execute for different types of calls and call parameters. These rules provide the essential logic of processing.

To present a more in-depth picture of a CTI implementation, some of the more salient steps of the Concentric Solutions/Remedy installation are described here. Although these steps may seem somewhat lengthy, they are considerably less effort than coding 5000 lines of TAPI calls in C++!
• Flowchart the integration process. See Exhibit 8.
• Create an Action Request System macro (in Remedy) that records the result you want to see when a call is routed to the agent desktop. Typically, this macro includes displaying a query list of the open service requests for the caller and a Submit window to enter a new request. The caller telephone number is recorded into the macro as a parameter that is passed from Concentric Solutions and used as the key to identify the caller and perform the query list display.
• Define a Concentric Solutions application path. This tells Concentric Solutions where to find the application to be run. See Exhibit 9.
Exhibit 8. Flowchart the Integration Process
[Flowchart steps, in order:]
1. Define the FastCall "Application Paths" that specify how to launch other Windows applications. This creates a list of application names (e.g., AR System, MS Word, etc.) that can be invoked when a call comes in.
2. Define the FastCall "Application Triggers" that are the specific actions to be taken when a trigger has "fired." Triggers are the specific functions in the other applications that are invoked by FastCall (e.g., invoke the AR System client and pass parameters to a Submit macro).
3. Define lists of "Calling Numbers," "Called Numbers," and "Caller Input Lists" that describe who might be calling, who they might be calling for, and the types of input the caller might give to an IVR. These lists are referenced by various FastCall processes.
4. Define other set-up parameters, such as default area codes, extension formats, etc. These parameters help FastCall interpret telephone numbers.
5. Define sets of "Incoming Call Rules" which determine which Application Triggers to execute for different types of calls and call parameters. The Rules link all of the pieces together.
Exhibit 9. Define Concentric Solutions Application Path
Exhibit 10. Concentric Solutions Incoming Call Triggers Configuration Screen
Exhibit 11. Configure DDE Client

• Specify an application trigger. This tells Concentric Solutions what process or function to execute when a call is being processed. Application triggers can send a DDE command to another application and can be defined for both incoming and outgoing calls. In the example shown in Exhibit 10, an incoming call trigger is added by clicking the Add button. The application trigger is given the name Pop so that it can be referenced by the call rules to be defined later. The trigger runs the ARSystemLaunch program, which executes the program at C:\Remedy\ARUSER.exe. The trigger type specifies one of three possible values: Keystroke Macro, DDE Client Trigger, or DDE Server Trigger. Concentric Solutions can be either a DDE client telling another application what to do, or a DDE server responding to a command from another application.
• Configure the DDE client. Clicking the Configure button (see Exhibit 10) brings up the client configuration screen (Exhibit 11). This screen specifies the details of the DDE command to run the Remedy AR system and execute the macro. The details of the DDE command structure are outside the scope of this book, but are well documented by Microsoft and many third-party publishers. The data format field in Exhibit 11 contains the specific DDE command structure. The Timeout is set to match the organization’s network environment; launching the macro should take less than two seconds for most organizations.
• Define the telephone number structure. The formatting screen for the PBX telephone number scheme is shown in Exhibit 12. It is used to define how Concentric Solutions interprets telephone numbers passed from the switch. This screen is similar in function to the familiar Windows XP dial-up networking screens (e.g., include or do not include area code).
• Set up incoming call rules. At this point, the rules governing the actions to be taken for incoming calls need to be specified. The previously mentioned application triggers are part of the actions taken if the qualification is true. Multiple incoming-call rules can be defined; for a given call, they are evaluated in priority order until one is found to be true. If an incoming call matches the criteria set up for a rule, Concentric Solutions will take the action selected in the Action Desired field (see Exhibit 13). Also, if a rule is true, Concentric Solutions will execute the alerts that have been defined under the Alert button.

The net result of the steps described is that when a telephone call is routed to a particular desktop, the corresponding Remedy Help Desk Action Request System will pop open a set of windows with relevant information about the caller (e.g., “Mr. Jones, I see you called last week about your hard drive; we will send out a technician within two hours.”).
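The rule evaluation described above (rules checked in priority order, the first true rule selecting an application trigger) can be sketched in a few lines of Python. This is only an illustration of the logic, not the product's implementation: the names IncomingCall, CallRule, and select_trigger, the help desk number, and the convention that a lower priority value is evaluated first are all assumptions made for the example. Only the trigger name Pop comes from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IncomingCall:
    calling_number: str  # caller's number as passed from the switch
    called_number: str   # number the caller dialed


@dataclass
class CallRule:
    priority: int                              # assumption: lower value is evaluated first
    qualifies: Callable[[IncomingCall], bool]  # the rule's qualification
    trigger_name: str                          # application trigger to fire, e.g. "Pop"


def select_trigger(call: IncomingCall, rules: List[CallRule]) -> Optional[str]:
    """Return the trigger of the first rule, in priority order, that is true."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.qualifies(call):
            return rule.trigger_name
    return None  # no rule matched; nothing is launched


HELP_DESK_LINE = "5551000"  # hypothetical help desk number

rules = [
    CallRule(20, lambda c: True, "Default"),  # catch-all, evaluated last
    CallRule(10, lambda c: c.called_number == HELP_DESK_LINE, "Pop"),
]

call = IncomingCall(calling_number="7135550123", called_number=HELP_DESK_LINE)
print(select_trigger(call, rules))
```

In a real installation the selected trigger would then launch the application path and pass the calling number to the Remedy macro; that step is omitted here.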
Exhibit 12. Telephone Number Format Configuration Screen
Exhibit 13. Concentric Solutions Incoming Call Rules Configuration Screen
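To make the purpose of the telephone number format settings concrete, the following Python sketch shows one way a default area code and an extension length could be used to interpret numbers passed from a switch. It assumes North American ten-digit numbering; the function name normalize_number, the default area code 713, the four-digit extension length, and the "x" prefix for extensions are hypothetical choices for the example, not settings taken from the product.

```python
import re


def normalize_number(raw: str, default_area_code: str = "713",
                     extension_length: int = 4) -> str:
    """Reduce a number from the switch to a consistent lookup key."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, parentheses
    if len(digits) == extension_length:
        return "x" + digits  # internal extension
    if len(digits) == 7:
        return default_area_code + digits  # local number: add default area code
    if len(digits) == 11 and digits.startswith("1"):
        return digits[1:]  # strip the long-distance prefix
    return digits  # already ten digits, or an unrecognized format


print(normalize_number("555-0123"))
print(normalize_number("1 (212) 555-0123"))
print(normalize_number("4321"))
```

A consistent key of this kind is what allows the calling number to be matched against calling-number lists and used as the query key in the help desk application.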
Clearly, middleware is a benefit for organizations that want to telephony-enable existing applications using in-house resources. Although there may be niche applications that require lower-level programming, the majority of applications can be well served using middleware, not to mention the savings and significantly shortened completion time.

OTHER EXAMPLES OF CTI APPLICATIONS

The following descriptions of demos from Nuance Corporation (www.nuance.com) illustrate the power that CT applications have developed as a result of better algorithms, faster processors, and newer development models.

Travel Plan Example

The Nuance Travel Plan demo shows how easy it is to get information about fares and schedules using speech. By simply interacting with the system in a normal voice, you can enter an itinerary and get real-time travel information. The demo includes information on 250 airports in the United States and abroad. While you are trying the demo, notice these Nuance features:
• Speaker independence. For a system to be a success, it has to be able to understand a wide variety of regional and even international speakers. Nuance’s software does not need to be trained to recognize your voice, even if you are a nonnative English speaker.
• Natural language. People refer to cities and airports by many different names. Because of this, a speech system has to be robust enough to handle wide variations not only in accents, but also in the way you refer to cities. For example, John F. Kennedy International Airport can also be called New York Kennedy, JFK, Kennedy International, and other variations. Nuance builds these variations into the system so that it is easy to use. When selecting flights from the list offered, try saying phrases like “the second American flight,” or “the seven-thirty-five flight.”
• Barge-in. This demo allows you to speak when you want to. At any time, you can respond to a question even if the system is speaking. If you are already familiar with the system, barge-in enables even faster completion of transactions.
• Fast and easy to use. Getting flight information using touch tones is a slow and frustrating process. Even when you know the flight number before you call, you still have to work through layers of menus and instructions to get the information you want. If you do not know your flight number, you will almost always end up waiting on hold to speak to a customer service agent. With Nuance speech recognition, you get the information you need quickly and easily.
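The idea of building name variations into the system can be illustrated with a simple alias table applied to text that has already been recognized. The Python sketch below is not how Nuance implements it (a production recognizer would typically encode such variants in its recognition grammar); the table, the helper names, and the normalization rules are assumptions for the example. The aliases themselves are the ones listed in the text.

```python
import re

# Variants from the text, mapped to a single airport code.
AIRPORT_ALIASES = {
    "JFK": [
        "John F. Kennedy International Airport",
        "New York Kennedy",
        "JFK",
        "Kennedy International",
    ],
}


def _normalize(phrase: str) -> str:
    """Lowercase, remove punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", phrase.lower())
    return " ".join(cleaned.split())


def build_index(aliases):
    """Invert the alias table so each normalized variant maps to its code."""
    return {_normalize(name): code
            for code, names in aliases.items()
            for name in names}


def resolve_airport(phrase, index):
    """Return the airport code for a spoken variant, or None if unknown."""
    return index.get(_normalize(phrase))


INDEX = build_index(AIRPORT_ALIASES)
print(resolve_airport("New York Kennedy", INDEX))
```

Unknown phrases return None, which an application could use as the cue to reprompt the caller.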
Banking Example

Nuance Communications’ Better Banking demonstration highlights the convenience of self-service banking. The demonstration incorporates personal account management, funds transfer, and bill payment functions. Listen for these Nuance advantages:
• Natural grammars. The demonstration shows Nuance’s ability to allow natural phrasing of dollar amounts, dates, and transaction types. For example:
  - Dollar amounts can be “twenty-five dollars and thirty-two cents” or “twenty-five thirty-two”
  - Dates can be “June third,” “June third 1998,” “today,” or “the first of the month”
• Natural language understanding. Nuance’s sophisticated natural language processing accurately determines the correct meaning of naturally phrased commands such as “transfer from checking to savings” or “transfer three hundred dollars to savings from checking.”
• Continuous speech. All Nuance software supports continuous speech. No artificial pauses are required.
• Accurate recognition. High accuracy and easy transaction modification are essential for user confidence. By combining superior core recognition technology and friendly interface design, you can be confident that your transactions will be submitted correctly.
• Speaker independence. The system does not need to be trained for your voice, so it is easy to use from the very first call.

How to Use the Demo. Perform any of the following transaction types supported in the demo:
• Check Account Balance
  - “What’s my savings account balance?” (or checking or VISA accounts)
  - “Check my checking account balance”
• Review Recent
  - Checks
  - Withdrawals
  - Deposits
  - Transfers
  - Bill payments
  - Transactions
  - Press * to interrupt the list and return to the menu
• Transfer Money
  - “Transfer money from savings to checking” (say “help” to hear a full list)
  - “Transfer $325 to checking from savings”
• Pay Bills
  - The phone company (or Pacific Bell)
  - Bank of America (or BofA, or the mortgage)
  - VISA
  - American Express (or Amex)
  - New York Times (or the paper, or the Times)
  - Pacific Gas and Electric (or the power company, or PG&E)
  - Say “help” to hear a full list of payees

At any point, you can say “help” to get more information. Just say “goodbye” or hang up to end the demo.

Another Help Desk Package

Another example of CTI-enabled help desk software is Focus CT Suite from Decisif, a small Canadian software-development company. It provides the following features:
• Generation of problem identification tickets
• Tracking of support calls from inception to problem resolution
• Escalation of calls based on time elapsed or nature of call
• Built-in knowledge base and inventory module
• Report generator that provides real-time information on activity levels, resource constraints, and agent performance
• Call information that can be accessed over the Web interface, so that agents can update, close, and transfer calls off-site

Decisif describes its CTI products as completely “open.” For example, end users can add applications to the base package purchased from Decisif by programming with:
• ActiveX
• OCX (OLE custom control from Microsoft; allows prewritten packages of code to be downloaded from Web servers)
• OLE (object linking and embedding; allows Windows programs to exchange information)
• DDE (dynamic data exchange; allows two programs running under Windows to share data)
• TSAPI
• TAPI

The company also supports database access via:
• DBF (dBase, FoxPro)
• ODBC (allows various databases to be accessed via a common interface)
• MDB (MS Access)
• SQL (SQL server)
Decisif is only one of hundreds of vendors that have elected to make their CT packages open to changes. In evaluating platforms/packages, the vendor’s tools and ability to interface should be examined carefully to ensure that packages can be linked and modified as needed for the organization’s needs. Telephony applications need to be modified as much as traditional IT systems.

SUMMARY

CTI technology, from the simplest Windows dialer to sophisticated routing/database lookup systems, to advanced speech recognition, is becoming a standard tool of the communications industry’s repertoire. As it becomes easier to telephony-enable existing applications via middleware and standards-based development products, users will begin to appreciate and demand even more telephony functions in day-to-day business applications. The wise organization will develop an infrastructure and conceptual architecture before agents and users begin demanding product on the desktop.

Notes
1. Microsoft Windows 2000 Server white paper.
2. Newton’s Telecom Dictionary, 18th edition.
Chapter 4
Interactive Voice Response

… if I had a voice unwearying.
Homer, The Iliad, 8th Century B.C.E.
WHY IVR?

Interactive voice response (IVR) systems are the unwearying workhorses of the telephone industry. Everyone who uses a phone is familiar with the ubiquitous “For sales, press 1, for service, press 2.” These systems provide answers to taxpayers preparing their tax returns at 2:00 a.m. on April 15th, send brochures via fax-on-demand to potential customers, use text-to-speech to provide the latest natural gas prices, and validate credit card numbers. Without IVR, the business (and nonprofit) world would be slower, less efficient, and far more manpower intensive. Without IVR, many firms would be forced to close.

One reason IVR systems (sometimes called VRUs for voice response units) are so successful is that they are based on a simple input device: the telephone set. While the 12 buttons on the standard analog set may not be the ultimate in user friendliness, they are familiar and constantly used. A well-designed IVR system requires only that the caller have a telephone and a POTS (plain old telephone service) line. IVR applications with speech recognition capabilities allow rotary phone customers to use the system. In short, IVR satisfies the three “A”s: Anyone can use it, Anywhere, Anytime. CTI applications, on the other hand, usually require a PC to visually display information (although an IVR application can start up a CTI application).

Another major driver is cost savings. In a May 2001 study, the Gartner Group reported that the average human agent cost per transaction was $5.50 versus $0.45 for an IVR system. Exhibits 1 and 2 show a similar drastic decrease in the cost per call using voice-enabled IVR and a short payback period, respectively. Even if these exhibits are a bit sanguine, it is clear that IVRs, particularly speech-enabled IVRs, are virtually “no brainers” for organizations that field a significant number of calls.
Exhibit 1. Comparative Cost of Processing a Call for Human Agent versus Speech-Enabled IVR (Courtesy of Nuance Corporation)
[Bar chart comparing a live agent with an IVR speech port. Panels: Calls per Year (labeled values 32,000 and 28,350) and Cost per Call (labeled values $1.28 and $0.10).]
Exhibit 2. Payback Time on Initial Investment for a Speech Enabled IVR (Courtesy of Nuance Corporation)
[Chart of monthly values for January through May. Data labels: $564,000, $406,714, $229,766, $52,819, and -$124,128. Annotations: “$157,020 cost savings per month” and “Months to Payback: 3.6.”]
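The arithmetic behind these cost comparisons is simple enough to check directly. The following Python sketch computes a payback period and an annual saving. It rests on two assumptions: that the $564,000 figure in Exhibit 2 is the initial investment (dividing it by the stated monthly savings reproduces the 3.6 months shown in the exhibit), and that 100,000 transactions per year is a purely hypothetical volume chosen for illustration. The function names are likewise illustrative.

```python
def payback_months(initial_investment: float, monthly_savings: float) -> float:
    """Months needed for cumulative savings to equal the initial investment."""
    if monthly_savings <= 0:
        raise ValueError("monthly savings must be positive")
    return initial_investment / monthly_savings


def annual_savings(transactions_per_year: int, agent_cost: float,
                   ivr_cost: float) -> float:
    """Savings from moving every transaction from an agent to the IVR."""
    return transactions_per_year * (agent_cost - ivr_cost)


# Exhibit 2 figures; treating $564,000 as the initial investment is an assumption.
print(round(payback_months(564_000, 157_020), 1))  # prints 3.6

# Gartner per-transaction costs from the text; the volume is hypothetical.
print(round(annual_savings(100_000, 5.50, 0.45)))
```

In practice not every call can be automated, so a realistic estimate would scale the transaction volume by the fraction of calls the IVR actually completes.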
The next section describes the rich feature sets offered by medium- to high-end IVRs. As with other technologies, high-end capabilities will move inexorably down the cost ladder, possibly culminating in personal IVRs linked to an MS Access database running on an ordinary desktop PC.

IVR FEATURE SETS

As with any long-standing technology, IVR features have converged to the most useful across many applications. With VoiceXML (discussed later) and ubiquitous IP integration, the feature set has entered another phase of innovation and will undoubtedly expand. Following is a representative list of functions that are present in most mid- to high-level systems:
• DNIS (dialed number identification service) recognition. The system can perform functions such as providing selective information based on the number dialed.
• Text-to-speech ability. By providing this service, prerecorded, “canned” messages are not required; dynamic, real-time information can be spoken to the caller.
• Scalable architecture. For example, IBM’s Voice Response for AIX platform can scale from 2 to 480 ports. Some vendors’ top-end units for large call centers may include thousands of ports. Ideally, multiple systems can be networked to form a single, logical unit for higher volumes.
• Multiple development platforms. Whether developers use VoiceXML or Java or both, the platform should support the development effort and use standards-based software.
• Intuitive graphical design. Some organizations will want to use a graphical package to simplify designing call processing and menus. Graphical packages are de rigueur for modern IVR systems (unless they are low-end, “canned” type systems without much development capability). For example, the Avaya Conversant 8.0 software release includes a graphical service creation tool, Voice@Work.
• Integration modules. The world of telephony has suddenly become more complex. With “communications servers” replacing traditional PBXs, IVR systems need to be able to communicate with many different platforms. For example, Spanlink offers a Cisco ICM (intelligent contact manager) integration module that links the Avaya Conversant to the ICM via Cisco’s peripheral gateway. Spanlink’s module runs on the Conversant and includes a library of external functions within Script Builder or Voice@Work (see www.spanlink.com). Another integration enabler is Cleo Communications’ Host Interface version 8.5 for Avaya Conversant. By using screen scrapes and other techniques, the IVR can be easily connected to mainframe applications, thus making available the considerable volume of data that continues to reside on “big iron” platforms.
• Speech recognition (sometimes called automatic speech recognition or ASR). This will be discussed in more depth later, but clearly this technology is ramping up quickly. A key benefit is that it end-runs some of the mind-numbing levels of push-button menus. For example, a customer can say “white, button-down shirt” and go straight to the most relevant part of the voice-enabled catalog. The Gartner Group projects that within a year, 30 percent of all IVR applications will have advanced speech recognition built in.
• Multiple forms of line connectivity. The IVR system should have a variety of interfaces, including standard T1, ISDN, SS7, CAS 2.0 (Communicating Applications Specification, relating to fax), E1, and IP (Voice-over-IP). Unless purchasing a low-end system, the buyer should not settle for an IVR system that does not support IP connectivity, because IVR applications need access to the data network for data retrieval and database updates. Also, for some systems, IP connectivity replaces physical ports, making capacity increases less onerous.
• Database connectivity. Standard SQL access to relational databases such as Oracle, Sybase, Informix, and Microsoft Access is important.
• Monitoring capabilities. Because IVR systems frequently front-end calls, their health and effectiveness should be closely monitored. Call handling statistics should be included: the number and length of calls for each line, points in the menu where callers hang up, the number of times a specific menu is repeated consecutively during the same call (indicating excessive menu complexity), etc.
• Pager messages. Based on defined criteria, pager and other notification capabilities should be available.
• Call transfer to other extensions. A useful addition is the ability to present the caller ID so that the recipient can accept or reject the call.
• Multiple language support.
• Audit trail and logging. Retention of caller details, including selections made during the call, should be available.
• Web site connectivity. Avaya’s Vonetix server allows its Conversant platform to connect directly with Web sites, eliminating the requirement for organizations to create back-end interfaces to applications and information sources for automated telephone access.
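Two of the call handling statistics named in the monitoring item above (the menu points where callers hang up, and menus repeated consecutively within one call) can be derived from a simple call log. The Python sketch below assumes a hypothetical log format in which each call records the ordered list of menus played and how the call ended; the field names, function names, and the repeat threshold of three are illustrative assumptions, not features of any particular IVR product.

```python
from collections import Counter
from itertools import groupby


def hangup_points(calls):
    """Count, per menu, the calls that ended with a hang-up at that menu."""
    return Counter(
        call["menus"][-1]
        for call in calls
        if call["ended_by"] == "hangup" and call["menus"]
    )


def longest_repeat(menus):
    """Length of the longest run of the same menu played back-to-back."""
    return max((len(list(run)) for _, run in groupby(menus)), default=0)


def calls_with_excessive_repeats(calls, threshold=3):
    """IDs of calls in which some menu was repeated at least `threshold` times."""
    return [c["call_id"] for c in calls if longest_repeat(c["menus"]) >= threshold]


# Hypothetical log entries for illustration only.
sample_calls = [
    {"call_id": 1, "menus": ["main", "billing", "billing", "billing"], "ended_by": "hangup"},
    {"call_id": 2, "menus": ["main", "orders"], "ended_by": "transfer"},
    {"call_id": 3, "menus": ["main", "billing"], "ended_by": "hangup"},
]

print(hangup_points(sample_calls))
print(calls_with_excessive_repeats(sample_calls))
```

A menu that dominates both reports, such as the hypothetical billing menu here, is a natural first candidate for redesign.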
REPRESENTATIVE SYSTEMS

To provide a real-world picture of IVR technology, the Avaya Technologies Conversant platform (sometimes called CVIS) will be used as a model to illustrate hardware and software components that may be found in the market. Obviously, the IVR market is varied, in terms of call capacity, application features, database links, compliance with standards, redundancy, costs, etc. The Conversant is a high-end line that competes with Nortel’s Open IVR/Symposium Integrated IVR and similar products.

Hardware

The Conversant platform is linked to the PBX via a digital circuit card connected to a TDM bus cable or via a tip/ring circuit card over a standard telephone line. The Conversant can be used with a non-Avaya PBX by linking with the tip/ring circuit card (analog lines to the PBX).
Other components include a serial port, a video port for a display terminal, a keyboard port, a floppy disk drive, hard disks (connected via SCSI bus cables for higher speed), and a backup cartridge tape drive. The base operating system is UnixWare. Other IVR systems may operate on Windows XP, UNIX, or Linux. The operating system must be able to multitask efficiently when many callers access the system simultaneously, particularly if there are multiple applications running. A later section in this chapter addresses the effects of memory and disk speed, and how application design affects performance.

Much of the functionality in an IVR platform comes from the cards inserted into the chassis. The Conversant, for example, has cards for the following (not a complete list):
• Circuits (both analog and digital, E1, T1)
• External alarms
• Fax
• Speech processing (signal processor)
• IP connections (Ethernet 10/100)
• Host (mainframe) interface
• PBX interface
• Video
Applications (programs) are loaded on the hard drive and run either partially or wholly in memory. All the backup, redundancy, and performance concerns associated with any server apply to the IVR hardware:
• Redundant power supplies
• Backup CPUs
• Fast backplane
• Adequate memory (to prevent swapping/paging from the hard drive)
• Mirrored hard drives (RAID: data is written redundantly, by mirroring or with parity information, so that it can be reconstructed if a single drive fails)
• Multiple, independent cooling fans
• UPS, surge protector, etc.
• Failover capability (the failure of one component does not bring down the entire IVR system)
• Self-monitoring capability, with alerts (low and high priority)
• Hot swappable components (can replace cards on-the-fly)
Development Software Example: Script Builder, Voice@Work, and @Work Studio

Working with voice response applications is much like working with traditional client/server development packages. Actions by the user must be anticipated, data validated, database records updated, and an appropriate response provided to the caller. The following excerpt from Avaya’s Intuity Conversant System documentation demonstrates some of the techniques used to build an application using traditional scripting (less proprietary methods, using VoiceXML, will be discussed later):

After giving a yes/no question, pause to give the caller time to respond, then present the possible answers. The prompt will stop playing as soon as the recognizer detects a spoken “yes” or “no” or a Touch-Tone signal. For example:

“You said 64587. Is this correct?” [a 1.5 second pause] “Please say yes or no.”

Use the prompt-and-collect action to ask the question, play a series of silence phrases, then present the options. The following excerpt shows an example of how part of your Script Builder code will look if you ask the caller for five digits, then confirm the entry within the prompt and collect action.

Prompt & Collect
    Prompt
        Speak with Interrupt
            Phrase: “Please enter your five-digit customer number”
    Input
        Mode: US_DIG
        Min Number of Digits: 05
        Max Number of Digits: 05
    Checklist
        Case: “Input OK”
            Speak with Interrupt
                Phrase: “You said”
                Field $CI_VALUE as C
                Phrase: “Is this correct?”
                Phrase: “sil.500”
                Phrase: “sil.500”
                Phrase: “sil.500”
                Phrase: “Please say yes or no.”
            Confirm
        Case: “Initial Timeout”
            Reprompt
        Case: “Too Few Digits”
            Reprompt
        Case: “No More Tries”
            Quit
End Prompt and Collect
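For readers more familiar with general-purpose languages, the control flow of the prompt-and-collect excerpt can be approximated in Python. This is a sketch of the logic only, not Script Builder syntax or an Avaya API: the function name, the two callback parameters, the limit of three tries, and the choice to reprompt when the caller answers "no" are all assumptions made for the illustration.

```python
from typing import Callable, Optional


def prompt_and_collect(get_input: Callable[[], Optional[str]],
                       confirm: Callable[[str], bool],
                       num_digits: int = 5,
                       max_tries: int = 3) -> Optional[str]:
    """Collect a fixed-length digit string and confirm it with the caller.

    get_input() stands in for playing the prompt and collecting input; it
    returns the digits entered, or None on a timeout. confirm(digits) stands
    in for "You said ... Is this correct?" and returns True on "yes".
    """
    for _ in range(max_tries):
        digits = get_input()
        if digits is None:
            continue  # "Initial Timeout": reprompt
        if not digits.isdigit() or len(digits) != num_digits:
            continue  # "Too Few Digits": reprompt
        if confirm(digits):
            return digits  # "Input OK" and confirmed
        # Caller said "no": reprompt (assumed behavior)
    return None  # "No More Tries": quit


# Simulated caller: one timeout, one short entry, then a valid entry.
entries = iter([None, "123", "64587"])
print(prompt_and_collect(lambda: next(entries), lambda d: True))
```

Note that the three “sil.500” phrases in the script add up to the 1.5 second pause described in the Avaya excerpt; the sketch leaves that timing to the confirm callback.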
Avaya’s development packages, Voice@Work and @Work Studio, automate some of these tasks by using graphical tools. Voice@Work is an object-oriented development toolkit that uses Sun Microsystems’ Java and JavaBeans. With this toolkit (plus another package, @Work Studio), developers can write IVR applications without writing code. They can also port the same objects to several platforms, including IBM’s VisualAge. As discussed in other chapters, Java is platform independent and thus spares developers the need to rewrite applications for different vendors’ hardware/software. Voice@Work also includes modules that allow a team of developers to share speech, database tables, host definitions, speech recognition, and other resources.

Another feature is the ability to develop remotely without impacting production. In practice, with the shortage of CTI programming/analysis skills, the ability to remotely log on and develop applications has become essential. An important consideration (discussed in more depth in another chapter) is the ability to generate code from the visual constructs (objects) developed. @Work Studio provides tools to generate call answer, business rules-based routing, integrated desktop, reporting, and data warehousing.

Given the proliferation of visual and object-oriented tools, the argument for in-house development of telephony applications is getting stronger all the time, and it is often faster and less expensive (after the initial learning curve). Although the development trend is toward VoiceXML, scripting will continue to play an important role. Even VoiceXML version 2.0 is considered somewhat weak in call control, whereas proprietary systems permit a finer level of control.

APPLICATIONS OF IVR

General Benefits

The previous discussion outlined specifications of the Avaya IVR platform in order to give the reader a feel for the hardware and software required to implement IVR systems.
The following sections reflect applications that are implemented by many IVR vendors across a broad range of business and government organizations. The general benefits of IVR include the following:
• Elimination or reduction of operator/agent time. In particular, it avoids peak staffing issues during times of heavy transactions (e.g., vacation season for travel agents).
• Elimination of agent exhaustion from repetitive questions. Retaining agents in call centers is a challenging task. Repetitive questions can be reduced or eliminated for many applications.
• Value added for wait time. A well-designed IVR system can tell callers the amount of waiting time for a live agent and offer alternatives: would they like to hear about the company’s newest products, answer a survey, or order products via Touch-Tone?
• Entertainment to reduce caller frustration.
• Ability to broadcast a repetitive message (to a voice mail system, for example).
• Mechanism for casual users who, for example, would not want to dial into a Web site because they do not have a PC or it is too much trouble to boot up their PC for a simple transaction.
• Provision to supply constantly updated information via voice (especially with a text-to-speech facility).
• Availability 24 hours a day, 365 days a year.
• Capture of customer/caller information that would otherwise be unavailable.
• Streamlined operations.

Specific Applications

Following are applications that apply across a number of business environments:
• Banking by phone, including account balances, cleared checks, and fund transfers
• Validation of credit cards and other financial instruments
• Human resources: vacation and sick information, 401(k) info, savings plan, HR policies, insurance eligibility, payroll information (e.g., year-to-date FICA paid)
• Order entry
• Inventory queries and stock availability
• Educational institutions: transcripts, grades, and course registration
• Sales results
• Payables status
• Investor relations: stock price, dividend reinvestment plans, major announcements
• General bulletin board
• Help desk: automated answers to most common questions
• Survey automation
• Fax-on-demand: request specific fax information based on an indexed menu
• Health care: test results, physician referral
• Tax information
• Employee scheduler
• Medical lab test results
• Refund status
• Catalog sales
• Shipment status
• Physician selection
• Power outage reporting
• Parts availability
• Auto repair status
Some IVR applications process a massive number of calls. For example, the online magazine www.bizjournals.com reports the following statistics for the State of Wisconsin Department of Revenue’s IVR: 80,000 registered users and 660,000 electronic tax payments for 2001; expected growth to 100,000 registered users, processing more than a million payments per year. Many of these applications are available from vendors as industry-tailored, off-the-shelf products. For example, Fujitsu’s Intervoice IVR offers packages for employee benefits, bank processing, credit union, bill payment, 401(k), and health care.

Outbound Messaging and Audiotext

Outbound messaging is a simple IVR function. Organizations as diverse as hotels and political groups use outbound messaging to deliver a voice message inexpensively. Examples include:
• Hotel wake-up calls
• Emergency notifications for businesses and schools, including power outages and safety issues
• Medication reminders for patients
• Notification of specific events
• Political or commercial ads

Outbound messaging is different from predictive dialing; the intent is to get the word out, not necessarily to connect a customer to a live operator. An example system is GetVocal’s VoiceXML power dialer platform (www.getvocal.com).

Audiotext is a straightforward application that provides callers with a spoken message based on a selection. In some PBXs, Audiotext can be on a card rather than within the IVR. Cards will typically have limitations such as the number of messages and the length of those messages. IVR systems will be more flexible and have considerably more capacity (both disk space and available ports). Enron Corporation, prior to its 2001 bankruptcy, developed an innovative Audiotext application called “Enron Radio.” Using menus that were loosely configured as radio stations, various Audiotext messages from management were made available to employees. They could dial in and select topics of interest, such as “How we are doing in India” or “Changes in employee benefits.” This application was run on the Avaya Conversant platform.
Call Center Applications

IVR is the sine qua non of call centers: they live or die by the efficiency and customer satisfaction provided by their voice response systems (as well as their CTI applications). Call centers are the drivers of telephony innovation for large-scale IVR systems. These centers not only need more automated tools than other businesses, but also have a plethora of reporting capabilities that allow them to quantitatively justify the investment. Fortunately, many of the technology advances originating from high-end call centers are quickly promulgated to low-end IVR systems (some costing less than $2000). Some of the applications found in call center IVRs include the following:

• Front-end all calls. Many call centers run all calls through an IVR in order to offload as many calls as possible away from agents. For large call centers, the IVR may front-end the calls but take direction from an integrated contact manager such as Cisco’s ICM.
• Announcements. Callers can listen to a standard statement regarding products, services, etc. The IVR system can also use mathematical algorithms to announce to the caller an estimated wait time. Announcements can be tailored to the caller’s interests. For example, if a caller selects option “3” to order fishing gear, announcements can discuss fishing conditions at a popular local fishing site, or tips on the best lures to use in cloudy weather.
• Bulletin boards. Provide detailed information to callers on specific, selected topics.
• Leave message. Callers in the queue can elect to leave a message for callback, then hang up. The first agent available calls the customer back. The callback can be scheduled or immediate.
• Supervisor observation. Allows supervisors to monitor agents’ performance.
• Advance information collection. While in queue or simply on the front end of a call, the IVR system can collect necessary information (e.g., account number or claim number) so that the agent has the information needed when the call is answered.
• Tracing calls. Calls are recorded for follow-up. This could be for malicious or emergency calls or simply for sales leads. Also, some systems allow selective recording so that only the critical part of the conversation is recorded, usually the actual order or decision by the caller (“Yes, I would like ten blue shirts, 35x17, with the monogram WAY”). Some organizations are beginning to use a PBX firewall, such as SecureLogix’s TeleWall, to augment tracing. For example, the firewall can provide real-time alerts when a specific caller ID is detected.
• Conferencing. One agent can conference with another agent (e.g., a second-tier technical help engineer) while the customer is on the phone. Of course, the second or subsequent conferee need not be physically in the same building.
• Fax-on-demand. Callers can receive fax information either directly or while they are in the queue. Typically, fax-on-demand applications will attempt several times to deliver the fax (the number of tries is a system-level option).
• “Sticky” data. Information obtained in one part of the IVR system is passed to other applications so that the customer does not have to repeat the same information to another agent (having to repeat information is near the top of customer complaints about badly designed IVR systems). To handle this process, the applications must have the ability to return to the original application (or script). Sticky data should be a component of any middleware/systems design evaluation.

FAX-ON-DEMAND

Although fax-on-demand is merely another application like those listed previously, its wide deployment in the IVR world merits a closer look. With approximately 100 million fax machines worldwide, fax is an ideal medium for distributing information to customers, sales reps, etc. It is also easy to use and, unlike e-mail, always looks exactly the way it was sent. PowerPoint presentations and other large images can be faxed more easily than e-mailed to destinations with only dial-up links. With the proliferation of virtual fax services such as Efax (www.efax.com), faxing gets a boost; the outgoing fax relies on traditional dial-up, Group 3 transmission, but the receiving party gets an attachment via e-mail. The attachment is created on an Efax server and is based on a proprietary algorithm that greatly reduces its size.

There are several ways that the application can send out the information.
Using an index, which lists documents by code (e.g., document #100 = claim form, document #101 = application for insurance, document #203 = engineering schematics for widget A, etc.), callers use Touch-Tone to indicate the document number (or say the number). More advanced systems allow the user to “say” the document: “send me information on your Equity 4 mutual fund.” Fax transmissions can be delayed until off-hours to reduce toll charges. Fax-on-demand applications can enter variable information at the time of request. For example, benefits, wholesale electricity prices, mortgage rates, and account balance information can be dynamically updated (from the appropriate database) just prior to transmission. For many simple applications, the user can simply fax to a designated IVR port in order to store the document for later transmission to customers or other callers. Broadcast fax can send to hundreds or thousands of recipients. For volume environments, attention needs to be given to the fax database of recipients. The system can maintain a list of recipients who no longer wish to receive faxes, thus saving on long distance charges. Cover pages can be tailored and retry options can be set to an appropriate value.

APPLICATIONS DEVELOPMENT

Avaya’s Script Builder and Voice@Work were discussed in an earlier section as representative development environments. There are more than a hundred vendors in the growing market for IVR development platforms. As DSP (digital signal processor) chip processing power grows (Moore’s law of doubling processing power every 18 months continues to hold), packages to develop IVR and speech-related applications have become more user friendly, graphical, and effective. The programming legacy of reusable parts, abstraction from lower-level details, application generators, and use of “templates” thrives in the voice processing world. To illustrate the details of IVR application development, two software packages, Apex Voice Communication’s OmniVox and Pronexus’ VBVoice, are presented here.

OmniVox OmniView

Exhibit 3 shows a sample screen from Apex Voice Communication’s Service Creation Environment (an applications generator). It allows the developer to “drag and drop” icons reflecting call functions (Apex calls it a “flowcharting tool”). Various commands include call control, messaging, database access, faxing, and outdialing. C functions (for the more technically inclined) are available for functions not provided in the GUI-based package. In addition to the flowcharting features, system utilities are needed for both development and daily operations. For the Apex OmniView service creation tool, they include:

• Start and stop line commands
• Debugging mode
• Error logs
• Call counts reporting from call detail records
• Online display of all calls
• Fax queue showing jobs to be sent
• Fax append (allows several documents to be merged into one fax document)
• Speech editor (can cut and paste)
• Voice mailbox management
Using a graphical interface to script telephony applications simplifies development significantly. The flowchart allows the developer to see the logical progression of the call and move functions around using cut and paste and other standard Windows tools. With a right-click of the mouse, the supporting commands can be viewed and changed. Messages can be recorded as needed with a speech editor.
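To make the flowchart idea concrete, the following minimal Python sketch models a call flow as a table of named steps, each with an action and a set of outcome-keyed transitions. It is an illustration only, not OmniVox or OmniView code: the step names, the `run_call_flow` function, and the sample account data are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Stand-in for a database lookup; the account data is invented.
ACCOUNTS = {"1001": "250.00"}


def play_welcome(call: dict) -> str:
    call["prompts"].append("Welcome. Please enter your account number.")
    return "ok"


def collect_digits(call: dict) -> str:
    # A real system would read Touch-Tone digits from the line.
    call["account"] = call["keyed_digits"]
    return "ok" if call["account"].isdigit() else "invalid"


def lookup_balance(call: dict) -> str:
    balance = ACCOUNTS.get(call["account"])
    if balance is None:
        return "not_found"
    call["prompts"].append(f"Your balance is {balance} dollars.")
    return "ok"


def transfer_to_agent(call: dict) -> str:
    call["prompts"].append("Transferring you to an agent.")
    return "ok"


# Each step maps to (action, {outcome: next step, or None to end the call}).
FLOW: Dict[str, Tuple[Callable[[dict], str], Dict[str, Optional[str]]]] = {
    "welcome": (play_welcome, {"ok": "collect"}),
    "collect": (collect_digits, {"ok": "lookup", "invalid": "agent"}),
    "lookup": (lookup_balance, {"ok": None, "not_found": "agent"}),
    "agent": (transfer_to_agent, {"ok": None}),
}


def run_call_flow(keyed_digits: str, start: str = "welcome") -> List[str]:
    """Walk the flow table and return the prompts the caller would hear."""
    call = {"keyed_digits": keyed_digits, "prompts": [], "account": ""}
    step: Optional[str] = start
    while step is not None:
        action, transitions = FLOW[step]
        step = transitions[action(call)]
    return call["prompts"]
```

Because the flow is data rather than code, moving a function to another point in the call only requires editing the transition table, which is roughly what a drag-and-drop flowchart tool does on the developer's behalf.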
Exhibit 3. OmniView Development Screen (Courtesy of Apex Voice Communications, www.apexvoice.com)
OmniVox software runs on Windows NT, Windows 2000, UNIX, and Linux. It is designed to work with both TDM (traditional telephony) and IP-based networks, greatly simplifying the porting of applications from one architecture to another. Also, OmniVox runs in mixed TDM and IP environments, a more typical configuration for most organizations for the next few years. Some of the OmniVox modules for the Windows environment include:

• Voice mail
• Speech editor
• CodeBase
• Signaling System 7 interface
• Voice recognition
• IP
• TTS (text-to-speech)
• SNMP (Simple Network Management Protocol)
• ODBC
• Macro and virtual chat
• Dial pulse detection
• Multiparty conferencing
• Debit card processing
Pronexus VBVoice

Pronexus VBVoice is based on Microsoft’s Visual Basic (VB) and relies on a series of specific telephony buttons that are included on the left side of the VB screen. Exhibit 4 shows a sample development screen. The developer selects specific commands, places them on the grid, and then connects them with a line drawing tool to create the application.

Exhibit 4. Pronexus VBVoice Sample Development Screen

The following sample feature descriptions (not all-inclusive) are courtesy of Pronexus (www.pronexus.com):

• Rapid design interface. Drag and drop components with built-in telephony operations, including:
  - Inbound and outbound line management
  - Message recording and playback
  - Database integration to SQL, ODBC, MS Access, and most other data sources
  - Dialing, routing, scheduling, bridging, digit collection
• Programmable components. A VB coding layer within each component allows you to build specialized event handling and even override component functionality. VBVoice contains an extensive set of built-in objects, methods, and properties that are called from VB code.
• Open database connectivity. ODBC allows applications to pull information from virtually any data source, including SQL Server, Oracle, and MS Access databases, leveraging your current database infrastructure.
• Integration with enterprise applications. Integrate with existing enterprise applications using COM+ or DDE standards, and communicate using TCP/IP messaging.
• Composite controls. VBVoice is the only toolkit that has “composite controls” for building component object models that are flexible at runtime and can be reused or called from different applications. This feature allows you to quickly build subsystems, package them into a composite control, and connect to or from any other VBVoice control.
• Support for high-density applications. VBVoice applications can optionally implement a separate thread for each voice-processing channel and do not require separate components for thread management. This results in a highly efficient, multitasking architecture, capable of supporting more voice processing power (240 lines per server).
• Platform interoperability. VBVoice supports leading voice hardware from Dialogic and Brooktrout. It also supports the telephony application programming interface (TAPI) from Microsoft, resulting in interoperability with many voice platforms hosting the required TAPI service providers (including some communications and media servers).
• Workgroup telephony. Communicate over TCP/IP. Improves customer care and workflow efficiency through screen pops, active server pages, text chats, and “call-me” functionality. VBVoice Professional provides ten licenses of AgentX, a messaging object that enables communication between network applications.
• Remote messaging and monitoring. Uses a Web browser to send messages or commands over the Internet. Monitors applications and controls system operations from a remote workstation.
• VBFax. VBFax is integrated into VBVoice Professional to expand voice applications with fax-on-demand, fax broadcasting, network fax service, fax storage and forwarding, e-mail fax gateway, and imaging. VBFax can be used in two different ways: (1) with Telcom Fax Server, and (2) with supporting Dialogic fax cards for same-call fax applications. Gammalink now opens doors for high-density fax on T1 and E1.
• Voice editing. Using Announce!, you can record high-quality voice prompts for your IVR system. Sound studio power is at your fingertips with a complete set of visual editing and special effects tools.
• VBWAP. Build, test, and demonstrate innovative WAP applications with component-based tools in the VB environment.
• Dynamic resource allocation. Prepaid services, conferencing, PC-switching, or “follow-me” can be developed quickly, keeping hardware costs to a minimum.
• PC-PBX functions. Use for call queue, ACD, ring generation, outbound dial, and station interface support.
• Telcom Fax server. Telcom Fax server provides a fax server platform running on Windows NT, supporting any number of desktop clients (available from Pronexus) for complete office fax integration in addition to clients built using the VBFax controls. Each client can be linked to multiple servers. VBFax can be used independently or as an integral part of VBVoice, the award-winning Windows telephony integration toolkit from Pronexus.
VBFax links to a Windows printer driver for creation of fax documents and includes a fax viewer to preview, modify, and maintain inbound and outbound faxes. Features include dynamic grouping for sending and receiving faxes, network fax server control, remote logging, and support for sending and receiving faxes during a call. VBFax comes with sample applications for fax-back and other fax services.

Announce! provides a flexible Windows environment for recording and editing prompts for voice processing systems. Record prompts using a sound card or a voice card. Create individual voice files in VOX or WAV format or create multiple indexed phrase files in a VBASE40 (VAP) format. Use actual script text as the indexed label for each phrase. Drag and drop script labels to easily select phrases for recording and editing. Record hands-free in live sessions. Transfer bulk recordings from tape using unique features such as automatic silence detection and script prompting. With Announce!, recording and managing voice prompts has never been easier!

• VBVoice supports Nuance’s sophisticated natural language speech recognition software. Automatic speech recognition (ASR) enables a computer to understand speaker-independent commands from a telephone for navigation through an automated interactive voice response (IVR) system. It can also use the communication as data input. When the audio input arrives, a speech recognition engine processes the input against a grammar (a collection of predefined words or phrases), translates it, and passes text to the application. Different grammars can be used during the call flow. Speech recognition is enabled through a VBVoice control that routes call flow when a command is recognized. No-match routing is also built in.

In the interest of space, not all available functions were listed here. The key point to remember is that packages for IVR programming significantly reduce the level of effort and expertise required. As larger numbers of industry applications are developed, they are made available as ready-to-use software.

A final word on the development environment. The first contact an end user/customer has with an IVR system is the voice. If it is of poor quality, there is an immediate negative impression: the system is judged unprofessional. Thus, a critical part of the development package is a studio-effects capability, including professional voice editing, music mixing utilities, sequential .WAV (or other format) compilation, and other tools of the voice trade.

SPEECH RECOGNITION

Beyond Touch-Tone

Phase one of telephony automation is to use IVR in place of or along with human agents. Phase two is to use human speech to supplant Touch-Tone as the primary IVR interface.
Before DSP chips achieved the speed they have today, the IVR developer had two choices for automated speech recognition: (1) a few words, such as digits 0 through 9, that could be spoken by any caller and recognized by the speech decoder, or (2) a large vocabulary, but confined to a single speaker (requiring training, as found in packages such as IBM’s ViaVoice). Although English has a foundation of only about 40 phonemes, the mathematical models required to convert speech into discrete words (using hidden Markov models, among other techniques) are demanding indeed. With faster chips and larger memories, today’s IVRs can recognize large vocabularies from a diverse caller population. Ron Croen, CEO of Nuance Communications, uses the following example to illustrate the power and convenience of automated natural speech understanding:

With Touch-Tone:
  Press 1 to buy shares, 2 to sell shares, or 3 for quotes
  “1”
  Enter the stock symbol
  “43936291”
  Enter the number of shares you want to buy
  “500”
  Press 1 for market price, or 2 to specify another price
  “1”

OR

With natural speech:
  What would you like to do?
  “I want to buy 500 shares of Coca Cola at market.”
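The contrast in this example can be sketched in code. The Python fragment below assumes that a speech engine has already converted the caller's utterance to text, and it uses a single regular expression as a stand-in for a recognition grammar. The pattern, the `parse_order` function, and the returned field names are illustrative assumptions, not Nuance's API. An utterance that does not fit the grammar returns `None`, which a real call flow would route to a reprompt or to an agent.

```python
import re
from typing import Optional

# Toy "grammar": one sentence shape for buy and sell orders (an assumption
# made for illustration; production grammars are far richer).
ORDER_GRAMMAR = re.compile(
    r"^i (?:want|would like) to (?P<action>buy|sell) "
    r"(?P<shares>\d+) shares of (?P<company>.+?) "
    r"at (?P<price>market|\d+(?:\.\d+)?)\.?$",
    re.IGNORECASE,
)


def parse_order(utterance: str) -> Optional[dict]:
    """Return the order fields found in the utterance, or None on no match."""
    match = ORDER_GRAMMAR.match(utterance.strip())
    if match is None:
        return None  # no-match: the call flow should reprompt or transfer
    return {
        "action": match.group("action").lower(),
        "shares": int(match.group("shares")),
        "company": match.group("company"),
        "price": match.group("price").lower(),
    }
```

A single spoken sentence fills the four fields that the Touch-Tone dialog collects over four separate prompts, which is the efficiency the example is meant to convey.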
Speech recognition provides a way for users to cut through the time-consuming and frustrating Touch-Tone menus found in some IVR systems. Ultimately, it allows the caller to access more information or make more choices than would be possible with a menu tree. Two factors speed the development of voice recognition: (1) moderately priced chips are fast enough to process the algorithms in real time, and (2) the heuristic approach of learning from millions of live interactions allows developers to better tune systems.

The following sections illustrate some of the applications of speech recognition in the marketplace. With both home PCs and PDAs becoming powerful enough to recognize speech, the general population is becoming more comfortable with machine communication via voice.

Security

The combination of increased concern about worldwide terrorism and an ever-increasing reliance on electronic commerce means that biometric forms of authentication will become more common. Voice verification is one of the less-intrusive technologies and has the added benefit of working from a distance (no thumbprint or uncomfortable feeling looking into a retina scanner¹). The Cahners In-Stat Group estimates that sales of speech recognition packages will reach $2.7 billion by 2005. Voice verification works best in environments where the individual’s voice can be prerecorded for later comparison. Nuance Corporation’s (www.nuance.com) Verifier 3.0 uses voice prints to verify key data such as name, social security number, and PIN. The best option for an IVR system that uses speech recognition is to use a package and development platform that has built-in voice authentication. Most speech verification engines allow the organization to select the acceptable level of tolerance. If pattern matching is too strict, some legitimate users will be blocked out; if too loose, unauthorized individuals could potentially gain access through the IVR. Typically, another piece of information, such as a password, is also required, so that access is granted only after two-factor authentication. In addition to Nuance, other vendors supplying voice verification include Veritel, Buytel, Persay, and Vocent.

Voice Dialer: Parlance

There is a lot to be said for doing one thing extremely well. Voice dialers do nothing more than connect one party to another via a spoken name. Avaya Technologies’ Name Dialer and Parlance’s Name Connector are examples of this technology. Using Parlance’s product as an example, here is how voice dialing works from the end user’s perspective:

• Dial a number (always the same number) that connects the user to the Parlance server.
• After the beep, speak the name of the person (or department, service, etc.) you want to reach.
• The server repeats back the name: “If you stay on the line, you will be connected to John Doe.”
• If the name is wrong, press the star key and say the name again; if it is right, just stay on the line.
• After a slight pause, you are connected to your party.

Parlance uses off-the-shelf, Intel-based hardware, the NT 4.0 operating system, and its own proprietary voice recognition system to make this work. The voice communications department supplies text-based names (no recording of end users’ names required) that are translated by the Parlance engine into phonemes that are compared to the spoken voice. Usually, the text names can be downloaded from an existing directory or phone book. Nicknames are essential and are generated by the software for known common names (Dick for Richard, Bill for William, Janie for Juanita, etc.). However, “Bubba,” “Slim,” and other such nonstandard nicknames must be entered manually.
As of this writing, Parlance has a limit of 50,000 names (nicknames are not counted against the limit) for a single machine/directory. For very large organizations, cascading voice connection systems are required. For example, “West Coast” and “East Coast” servers could be set up. An employee working in the East Coast division could dial the East Coast server, say “West Coast,” and be connected to the West Coast name connector. From there, the employee could say the name of any employee in the West Coast division.
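The directory behavior described in this section (text names, automatically generated nicknames, manually entered aliases, and cascading between servers) can be sketched as follows. This is a simplified illustration that matches on text rather than phonemes; the `NameDirectory` class, its methods, the small nickname table, and the extensions are hypothetical and do not represent Parlance's implementation.

```python
from typing import Dict, List, Optional

# Small sample of common nicknames, taken from the examples in the text.
COMMON_NICKNAMES = {"richard": ["dick"], "william": ["bill"], "juanita": ["janie"]}


class NameDirectory:
    """Toy name connector: maps spoken names to extensions for one region."""

    def __init__(self, region: str) -> None:
        self.region = region
        self.entries: Dict[str, str] = {}            # spoken name -> extension
        self.peers: Dict[str, "NameDirectory"] = {}  # region name -> server

    def add_person(self, first: str, last: str, extension: str) -> None:
        # Register the given name plus any known common nicknames.
        given_names = [first.lower()] + COMMON_NICKNAMES.get(first.lower(), [])
        for given in given_names:
            self.entries[f"{given} {last.lower()}"] = extension

    def add_alias(self, spoken_name: str, extension: str) -> None:
        # Nonstandard nicknames ("Bubba", "Slim") are entered by hand.
        self.entries[spoken_name.lower()] = extension

    def link(self, peer: "NameDirectory") -> None:
        self.peers[peer.region.lower()] = peer

    def dial(self, utterances: List[str]) -> Optional[str]:
        """Follow region names to peer servers, then resolve the final name."""
        directory = self
        for spoken in utterances:
            key = spoken.lower().strip()
            if key in directory.peers:
                directory = directory.peers[key]  # cascade to the other server
            else:
                return directory.entries.get(key)
        return None
```

In this sketch, a caller on the East Coast server who says “West Coast” and then “Dick Roe” is handed to the linked directory, where the nickname resolves to the entry registered for Richard Roe.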
The accuracy of voice dialing is approximately 90 to 95 percent. Users with soft voices, those whose native language is not English, and very fast speakers sometimes have lower hit rates using Parlance. Speaking on a cellular phone or over a speaker phone will also somewhat reduce accuracy. The Parlance name database was developed from both United States and United Kingdom speakers and will be accurate for those populations.

Despite its limited function, voice dialing is extremely popular with users once they grasp what it means: driving down the highway and being able to call or leave a voice mail for anyone in the organization by pressing one button on a cellular phone; not having to take the time to look up a number in the company telephone book; and not looking up and dialing a number only to find out that the phone book is out of date.

Vendors selling these services tout the reduction in number of calls received by human telephone operators, and they imply that cost savings can be obtained by eventually reducing operator head count. In practice, the greatest benefit is efficiency of the workforce. For example, an executive in a car no longer needs to call an administrative assistant and ask to be transferred to another party. Whether operator head count can be reduced depends on a number of factors, such as how many internal versus external calls the organization receives, the aptitude of the workforce for new technology, and the kinds of services the operators offer (simple transfer versus research on what department can best help the caller). One Houston-based energy company installed Parlance and found that the number of calls to the name connector servers exceeded the number of calls received by live operators.

A name connector service can be set up as a network outside the premises of the organization. Key customers, suppliers, government agencies, disaster recovery groups, etc. can be set up to speed access. Of course, the name “Dial Tone” with a number of “9” is not a good idea; hackers will find it quickly. Generally, any pronounceable name going to any specific telephone number (domestic or international) can be safely inserted into the database.

From an operational perspective, the following housekeeping functions are required to keep the name connector up-to-date and efficient:

• Regular (weekly or monthly) feed from the organization’s online or manual telephone directory to a text file that can be imported into Parlance’s text directory.
• Routine examination of names that are difficult to pronounce or are pronounced in a manner that is not obvious from the spelling. Phonetic spellings are entered into the database so that Parlance can recognize the name from the pronunciation. For example, “Fahim Bhakro” might be spelled phonetically as “Fa heem Bac row.”
• Backups of customer-specific files.
• Digital port connection (the software must be set up beforehand for the specific PBX transfer function, which will vary by vendor).

Name Dialer: Avaya

Avaya Technologies, through Bell Labs, has developed a package with capabilities similar to the Parlance Name Connector. The Large Vocabulary Recognizer, run on Avaya’s Conversant IVR server, can store more than 20,000 names in a single directory. USAA in San Antonio, Texas, has installed and used the Large Vocabulary Recognizer Dialer to help employees more easily reach each other and to offload work from the internal telephone operators. In addition, USAA has plans to use the product to offer customers a more-intelligent automated response. For example, rather than the traditional IVR “for loans, press 1,” the USAA Savings Bank will use the Conversant and Large Vocabulary Recognizer to respond to a statement such as “I need a loan.”

TEXT-TO-SPEECH (TTS)

Although the quality of prerecorded phrases is high (“Thank you for ordering from ACME Widget Company”), there are many situations where prerecorded .WAV or other audio files may not be practical. For example:

• When audio responses vary constantly: “Today’s high will be 95 degrees in Houston with a 55 percent probability of precipitation.” IVR reading of e-mail over a cell phone is another example (more on this in the chapter on unified messaging).
• When audio files are too large for convenient storage.
• When keying information in a system, audible verification may be required. For example, when a PC user enters “427,” the response is “four hundred and twenty-seven.” With respect to an IVR system, the feedback could be a voice summary of inputs to a form or application.

Often, text-to-speech is used to speak the results of an IVR database inquiry. This is particularly useful for individuals away from the office, using cell phones, etc.
As of this writing, Microsoft is promoting its SAPI 5.1 (speech application programming interface) TTS engine. This interface provides speech services for the desktop, mobile devices, and IVR applications. It uses COM components. Although it is discussed here in relation to TTS, SAPI is a general application programming interface (API) for many kinds of speech processing. Automation languages such as Visual Basic, Windows Script Host, and JScript are supported. There are, of course, other APIs available, such as Sun Microsystems’ Java Speech API.
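The audible verification example given earlier in this section, where an entry of “427” is read back as “four hundred and twenty-seven,” depends on a text normalization step that turns digits into words before a TTS engine speaks them. A minimal Python sketch of that step is shown below. It covers only whole numbers from 0 through 999, and the `number_to_words` function is a hypothetical helper written for illustration; it is not part of SAPI or any other speech API.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 through 999 for read-back to a caller."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0 through 999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    # Join the remainder with "and", matching the phrasing used in the text.
    return words + (" and " + number_to_words(rest) if rest else "")
```

The resulting string would then be handed to whatever TTS engine the IVR platform provides; a production system would also need rules for currency, dates, and larger numbers.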
VOICEXML

Voice eXtensible Markup Language (VXML or VoiceXML) is an XML-based markup language, comparable in role to HTML (Hypertext Markup Language), that enables input by speech recognition and output by audio. Where HTML relies on typing or pointing with a mouse, VXML accepts spoken input or Touch-Tones and uses available speakers or headphones to provide the required feedback. Applications are no different than one would expect for traditional IVR systems: quoting stock prices, providing driving instructions to corporate headquarters, or listing bank balances. According to the VoiceXML Forum (www.voicexml.org), the following Web changes have contributed to the proliferation of voice technology over the Internet:

• Browsers now support rich text, including graphics, pictures, sound, video, etc.
• Web server power has increased. In particular, Web pages are routinely generated by programs, templates, etc. They are dynamic.
• Web data representation is now more sophisticated, making it easier to transfer XML formatted data into another format.
• The Internet has improved performance, bandwidth, and QoS.
• Web application tools have been steadily enhanced.

The real benefit of VXML is that it supports telephone access to Web services and supports Web site browsing through voice recognition, as well as traditional Touch-Tone. As a result, data can be stored on Web servers and retrieved via VXML, greatly simplifying access to business information. Before VXML, access to data was available only through storage of information directly on the IVR server or using middleware to access legacy systems. Programmers have spent thousands of hours integrating IVRs with back-office systems, using 3270 screen scrapes and other techniques. This hard-core, “tackling down the middle” approach tends to favor the CPE (customer premises equipment) approach to IVR, because an ASP (application service provider) model would not function well in such a customized environment.
Telera, based in Campbell, California, (www.telera.com) is an example of a provider that uses a distributed architecture based on VXML. A “cool” feature of Telera’s offering is that its clients’ toll-free calls are sent to the nearest Telera server, which may answer customers’ questions without incurring long distance charges; only if the customer needs to speak to a live representative does the call go out to the IXC. Exhibit 5 is a simplified diagram of a VoiceXML configuration. The following segment of VXML code shows some sample syntax to obtain a bank customer’s account number: