SAS® 9.2 Language Reference: Dictionary
SAS® Documentation
The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2009. SAS® 9.2 Language Reference: Dictionary. Cary, NC: SAS Institute Inc.

SAS® 9.2 Language Reference: Dictionary
Copyright © 2009, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-59994-592-7
All rights reserved. Produced in the United States of America.

For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.

U.S. Government Restricted Rights Notice. Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227–19 Commercial Computer Software-Restricted Rights (June 1987).

SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.

1st electronic book, February 2009
1st printing, March 2009

SAS® Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.

SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.
Contents

What’s New
  Overview
  SAS System Features
  SAS Language Elements

PART 1  Dictionary of Language Elements

Chapter 1  Introduction to the SAS 9.2 Language Reference: Dictionary
  The SAS Language Reference: Dictionary
  Syntax Conventions for the SAS Language

Chapter 2  SAS Data Set Options
  Definition of Data Set Options
  Syntax
  Using Data Set Options
  Data Set Options by Category
  Dictionary
  Data Set Options Documented in Other SAS Publications

Chapter 3  Formats
  Definition of Formats
  Syntax
  Using Formats
  Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms
  Data Conversions and Encodings
  Working with Packed Decimal and Zoned Decimal Data
  Working with Dates and Times Using the ISO 8601 Basic and Extended Notations
  Formats by Category
  Dictionary
  Formats Documented in Other SAS Publications

Chapter 4  Functions and CALL Routines
  Definitions of Functions and CALL Routines
  Syntax
  Using Functions and CALL Routines
  Function Compatibility with SBCS, DBCS, and MBCS Character Sets
  Using Random-Number Functions and CALL Routines
  Date and Time Intervals
  Pattern Matching Using Perl Regular Expressions (PRX)
  Using Perl Regular Expressions in the DATA Step
  Writing Perl Debug Output to the SAS Log
  Perl Artistic License Compliance
  Base SAS Functions for Web Applications
  Functions and CALL Routines by Category
  Dictionary
  Functions and CALL Routines Documented in Other SAS Publications
  References

Chapter 5  Informats
  Definition of Informats
  Syntax
  Using Informats
  Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms
  Working with Packed Decimal and Zoned Decimal Data
  Reading Dates and Times Using the ISO 8601 Basic and Extended Notations
  Informats by Category
  Dictionary
  Informats Documented in Other Base SAS Publications

Chapter 6  Statements
  Definition of Statements
  DATA Step Statements
  Global Statements
  Dictionary
  SAS Statements Documented in Other SAS Publications

Chapter 7  SAS System Options
  Definition of System Options
  Syntax
  Using SAS System Options
  Comparisons
  SAS System Options by Category
  Dictionary
  SAS System Options Documented in Other SAS Publications

PART 2  Dictionary of Component Object Language Elements

Chapter 8  Component Objects
  DATA Step Component Objects
  The DATA Step Component Interface
  Dot Notation and DATA Step Component Objects
  Rules When Using Component Objects

Chapter 9  Hash and Hash Iterator Object Language Elements

Chapter 10  Java Object Language Elements
  Java Object Methods by Category
  Dictionary

PART 3  Appendixes

Appendix 1  DATA Step Debugger
  Introduction
  Basic Usage
  Advanced Usage: Using the Macro Facility with the Debugger
  Examples
  Commands
  Dictionary

Appendix 2  Perl Regular Expression (PRX) Metacharacters
  Tables of Perl Regular Expression (PRX) Metacharacters

Appendix 3  SAS Utility Macro

Appendix 4  Recommended Reading
  Recommended Reading

Index
What’s New
Overview
New features, new language elements, and enhancements to existing language elements in Base SAS 9.2 continue to expand the capabilities of SAS:
• SAS now supports the next generation Internet Protocol, IPv6, as well as IPv4.
• The DATA step component Java object enables you to instantiate Java classes and to access fields and methods on the resultant objects. See Chapter 10, “Java Object Language Elements.”
• The SAS logging facility is a new logging subsystem that can be used to collect, categorize, and filter log events and write them to various output devices. The logging facility can be used to log SAS server events or events that are initiated from SAS programs. This feature is new for SAS 9.2 Phase 2.
• In addition to SAS Monospace and SAS Monospace Bold TrueType fonts, new TrueType fonts are available when you install SAS.
• Universal Printing now supports Scalable Vector Graphics (SVG), Portable Network Graphics (PNG), and PDF/A-1b print output formats.
• You can access remote files by using the Secure File Transfer Protocol (SFTP) access method.
• SAS now reads and writes ISO 8601 dates, times, and intervals.
• In support of batch programming, if a program terminates without completion, the new checkpoint mode enables programs to be resubmitted in restart mode, resuming with the DATA or PROC step that was executing when the program terminated.
• The “Functions and CALL Routines” section contains several new and enhanced functions, as well as functions that were previously in other products and that are now part of Base SAS. The functions that moved from the Risk Dimensions product calculate the call and put prices for European options, based on various models. The functions that moved from SAS/ETS return information about various date and time intervals. The functions from SAS High-Performance Forecasting return specific dates.
• The documentation for string functions and CALL routines now has a restriction that identifies whether these functions and CALL routines support Single Byte Character Sets (SBCS), Double Byte Character Sets (DBCS), or Multi-Byte Character Sets (MBCS). This distinction is important because improper use of these functions and CALL routines can result in unexpected behavior in programs that are written in a non-English language. The restrictions are described in the “Function Compatibility with SBCS, DBCS, and MBCS Character Sets” section of the documentation.
• In a DATA step, you can track the execution of code within a DO group. The DATA statement has a new optional argument that writes a note to the SAS log when the DO statement begins and ends.
• New SAS system options enable you to set a default record length, specify options for accessing PDF files, specify values for Scalable Vector Graphics, support the checkpoint mode and the restart mode, and support fonts.
• Some of the new DATA step component object attributes, operators, and methods enable you to remove all items from the hash object without deleting the instance of the hash object, consolidate the FIND and ADD methods into a single method call, return the number of items in the hash object, and specify a starting key item for iteration.
• In previous versions of SAS Language Reference: Dictionary, references to language elements in other publications were included in their respective dictionary for each language element type. For example, you could find a reference for the $BIDI format in the format dictionary entries. You can now find references to language elements that are documented in other publications within each section for the language element types. Online, this section appears just before the dictionary entries for each language element type. In the PDF or print copy, this section appears as the last topic for each language element type.
• A section that describes how SAS syntax is written has been added. This section contains examples of how to interpret the syntax.
SAS System Features

Checkpoint Mode and Restart Mode
If a batch program terminates before it completes and it was started in checkpoint mode, the program can be resubmitted in restart mode, resuming with the DATA or PROC step that was executing when the program terminated. DATA and PROC steps that have already completed do not need to be rerun. See “Checkpoint Mode and Restart Mode” in SAS Language Reference: Concepts.
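The checkpoint and restart behavior described above can be pictured with a small conceptual model. The sketch below is written in Python rather than SAS, and it is not how SAS implements the feature: the `run_batch` function, the step names, and the in-memory checkpoint record are all hypothetical. It only illustrates the idea that completed steps are recorded and skipped when the program is resubmitted.

```python
def run_batch(steps, completed=None):
    """Run (name, action) steps in order, skipping steps already completed.

    `completed` is a hypothetical checkpoint record: the names of the steps
    that finished in an earlier submission. Returns the updated record and
    the name of the step that failed (or None if every step finished).
    """
    completed = list(completed or [])
    for name, action in steps:
        if name in completed:
            continue  # finished in an earlier run, so it is not rerun
        try:
            action()
        except Exception:
            return completed, name  # terminate; the record is kept for a restart
        completed.append(name)
    return completed, None


log = []
attempts = {"sort": 0}

def load():
    log.append("load")

def flaky_sort():
    # Fails on the first attempt only, to simulate an interrupted batch job.
    attempts["sort"] += 1
    if attempts["sort"] == 1:
        raise RuntimeError("simulated failure")
    log.append("sort")

def report():
    log.append("report")

steps = [("load", load), ("sort", flaky_sort), ("report", report)]

checkpoint, failed = run_batch(steps)                    # first submission stops at "sort"
checkpoint, failed_again = run_batch(steps, checkpoint)  # restart resumes with "sort"
```

In this model the "load" step runs only once, and the restart resumes with the step that was executing when the first submission terminated.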
Support for ISO 8601 Basic and Extended Time Notations
In SAS 9.1.3, the formats and informats that support the ISO 8601 basic and extended time notations were documented in the SAS 9.1.3 XML LIBNAME: User’s Guide. These formats and informats have been renamed and are now documented in SAS Language Reference: Dictionary. The new names clearly distinguish the basic and extended formats and informats. You can see the renamed formats and informats in their respective sections in the topics that follow. In addition, a new CALL routine, IS8601_CONVERT, converts ISO 8601 intervals to datetime and duration values, and datetime and duration values to an ISO 8601 interval.
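As a brief illustration of how the basic and extended ISO 8601 notations differ, the following sketch uses the Python standard library (not SAS formats) to write the same values both ways. The sample date and time are arbitrary, and fractional seconds and time zone offsets are left out for simplicity.

```python
from datetime import date, datetime, time

d = date(2009, 2, 14)
dt = datetime(2009, 2, 14, 13, 5, 9)
t = time(13, 5, 9)

# Basic notation: no separators between the components.
basic = {
    "date": d.strftime("%Y%m%d"),             # yyyymmdd
    "datetime": dt.strftime("%Y%m%dT%H%M%S"),  # yyyymmddThhmmss
    "time": t.strftime("%H%M%S"),             # hhmmss
}

# Extended notation: hyphens in the date part, colons in the time part.
extended = {
    "date": d.isoformat(),       # yyyy-mm-dd
    "datetime": dt.isoformat(),  # yyyy-mm-ddThh:mm:ss
    "time": t.isoformat(),       # hh:mm:ss
}
```

Removing the hyphens and colons from an extended value yields the corresponding basic value, which is the distinction that the renamed B8601 and E8601 formats and informats reflect in their names.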
Support for IPv6
SAS 9.2 introduces support for the "next generation" of Internet Protocol, IPv6, which is the successor to the current Internet Protocol, IPv4. Rather than replacing IPv4 with IPv6, SAS 9.2 supports both protocols. A primary reason for the new protocol is that the limited supply of 32-bit IPv4 addresses is being depleted. IPv6 uses a 128-bit address scheme, which provides far more IP addresses than IPv4 does. For more information, see “Internet Protocol Version 6 (IPV6)” in SAS Language Reference: Concepts.
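To make the difference between the 32-bit and 128-bit address schemes concrete, the following sketch uses Python's standard `ipaddress` module. It is independent of SAS, and the two sample addresses are taken from the ranges reserved for documentation.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")    # sample IPv4 address (documentation range)
v6 = ipaddress.ip_address("2001:db8::1")  # sample IPv6 address (documentation range)

v4_bits = v4.max_prefixlen  # address width in bits for IPv4
v6_bits = v6.max_prefixlen  # address width in bits for IPv6

v4_total = 2 ** v4_bits     # number of distinct IPv4 addresses
v6_total = 2 ** v6_bits     # number of distinct IPv6 addresses

print(f"IPv4: {v4_bits} bits, {v4_total} addresses")
print(f"IPv6: {v6_bits} bits, {v6_total} addresses")
```

Each additional bit doubles the address space, so moving from 32 to 128 bits multiplies the number of available addresses by 2 to the power of 96.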
Universal Printing and New TrueType Fonts
In SAS 9.2, all Universal Printers and many SAS/GRAPH devices use the FreeType engine to render TrueType fonts for output in all of the operating environments that SAS software supports. In addition, by default, many SAS/GRAPH device drivers and all Universal Printers generate output using ODS styles, and these ODS styles use TrueType fonts. In addition to SAS Monospace and SAS Monospace Bold, 40 additional fonts (TrueType) are available when you install SAS:
• Three Latin fonts compatible with Microsoft
• Ten graphic symbol fonts
• Eight multilingual Unicode fonts
• Nineteen monolingual Asian fonts
New Universal printers include the following:
PDFA
produces an archivable PDF compliant with PDF/A-1b.
PNG
produces Portable Network Graphics, which is a raster image format that is designed to replace the older simple GIF and the more complex TIFF format.
PNGt
produces transparent Portable Network Graphics.
SVG
produces Scalable Vector Graphics, which is a language for describing two-dimensional graphics and graphical applications in XML.
SVGt
produces transparent Scalable Vector Graphics.
SVGnotip
produces Scalable Vector Graphics without tooltips.
SVGView
produces Scalable Vector Graphics with controls to navigate through multi-page SVG documents.
SVGZ
produces compressed Scalable Vector Graphics.
For more information, see Printing with SAS in SAS Language Reference: Concepts.
SAS Logging Facility Language Elements
The SAS logging facility is a flexible, configurable logging subsystem that you can use to collect, categorize, and filter log events and write them to a variety of output devices. The SAS language now includes autocall macros, functions, and DATA step component objects for creating logging facility components that categorize log events. The logging facility and the SAS log are two separate logging systems. For more information, including the reference documentation for the logging facility language elements, see SAS Logging: Configuration and Programming Reference. This feature is new for SAS 9.2 Phase 2.
WHERE-Expression Processing
In a WHERE expression, the LIKE operator now supports an escape character. The escape character enables you to search for the percent sign (%) and the underscore (_) characters in values. For more information, see “Syntax of WHERE Expression” in SAS Language Reference: Concepts.
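The role of an escape character in LIKE-style matching can be sketched as follows. This is an illustration in Python, not SAS code and not the SAS implementation: the `like` and `like_to_regex` helpers are hypothetical, the choice of `!` as the escape character is arbitrary, and details such as the handling of trailing blanks are not modeled. The sketch treats the percent sign as "any number of characters" and the underscore as "exactly one character", unless the escape character precedes them.

```python
import re

def like_to_regex(pattern, escape=None):
    """Translate a LIKE-style pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape and i + 1 < len(pattern):
            # The character after the escape character is matched literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")   # any number of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)

def like(value, pattern, escape=None):
    """Return True if the whole value matches the LIKE-style pattern."""
    return like_to_regex(pattern, escape).fullmatch(value) is not None

# Without an escape character, "_" is a wildcard.
wildcard_hit = like("axb", "a_b")
# With "!" as the escape character, "!_" and "!%" match the literal characters.
literal_underscore = like("a_b", "a!_b", escape="!")
literal_percent = like("50% off", "50!% %", escape="!")
```

Without the escape character there would be no way to distinguish a value such as "50% off" from "500 off" with this kind of pattern, because the percent sign would always act as a wildcard.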
DATA Step Java Object
The DATA step component Java object enables you to instantiate Java classes and access fields and methods on the resultant objects. The documentation for the DATA step component Java object has been available on http://support.sas.com since SAS 9.2 Phase 1. Beginning with SAS 9.2 Phase 2, the documentation is available in SAS Help and Documentation.
Viewing Help and ODS Output in the Remote Browser
In prior releases of SAS, some operating environments used the remote browser to view SAS Help and ODS HTML output. You can now use the remote browser to view SAS Help and ODS HTML, PDF, and RTF output under z/OS, OpenVMS, UNIX, and Windows 64-bit environments. Windows 32-bit environments use the SAS browser to view Help and ODS output. You enable remote browsing by configuring these system options:
HELPBROWSER=
specifies whether you want to use the remote browser or the SAS browser.
HELPHOST=
specifies the name of the computer where the remote browser sends Help and ODS output.
HELPPORT=
specifies the port number for the remote browser client.
For more information about remote browsing, see the Help documentation for your operating environment: OpenVMS, UNIX, Windows, z/OS
SAS Language Elements

Data Set Options
The DLDMGACTION= data set option has a new argument, NOINDEX. The NOINDEX argument automatically repairs the data set without the indexes and integrity constraints, deletes the index file, updates the data file to reflect the disabled indexes and integrity constraints, and limits the data file to be opened only in INPUT mode.
Formats
• The following formats are new:
  $BASE64X
    converts character data to ASCII text using Base 64 encoding.
  $N8601B
    writes ISO 8601 duration, datetime, and interval forms using the basic notations PnYnMnDTnHnMnS and yyyymmddThhmmss.
  $N8601BA
    writes ISO 8601 duration, datetime, and interval forms using the basic notations PyyyymmddThhmmss and yyyymmddThhmmss.
  $N8601E
    writes ISO 8601 duration, datetime, and interval forms using the extended notations PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss.
  $N8601EA
    writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss.
  $N8601EH
    writes ISO 8601 duration, datetime, and interval forms for the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss, using a hyphen (-) for omitted components.
  $N8601EX
    writes ISO 8601 duration, datetime, and interval forms for the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss, using an x for each digit of an omitted component.
  $N8601H
    writes ISO 8601 duration, datetime, and interval forms PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss, dropping omitted components in duration values and using a hyphen (-) for omitted components in datetime values.
  $N8601X
    writes ISO 8601 duration, datetime, and interval forms PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss, dropping omitted components in duration values and using an x for each digit of an omitted component in datetime values.
  B8601DA
    writes date values using the ISO 8601 basic notation yyyymmdd.
  B8601DN
    writes the date from a datetime value using the ISO 8601 basic notation yyyymmdd.
  B8601DT
    writes datetime values in the ISO 8601 basic notation yyyymmddThhmmssffffff.
  B8601DZ
    writes datetime values in the Coordinated Universal Time (UTC) time scale using the ISO 8601 datetime and time zone basic notation yyyymmddThhmmss+|-hhmm.
  B8601LZ
    writes time values as local time by appending a time zone offset difference between the local time and UTC, using the ISO 8601 basic time notation hhmmss+|-hhmm.
  B8601TM
    writes time values using the ISO 8601 basic notation hhmmssffff.
  B8601TZ
    adjusts time values to the Coordinated Universal Time (UTC) and writes them using the ISO 8601 basic time notation hhmmss+|-hhmm.
  BESTD
    prints numeric values, lining up decimal places for values of similar magnitude, and prints integers without decimals.
  E8601DA
    writes date values using the ISO 8601 extended notation yyyy-mm-dd.
  E8601DN
    writes the date from a SAS datetime value using the ISO 8601 extended notation yyyy-mm-dd.
  E8601DT
    writes datetime values in the ISO 8601 extended notation yyyy-mm-ddThh:mm:ss.ffffff.
  E8601DZ
    writes datetime values in the Coordinated Universal Time (UTC) time scale using the ISO 8601 datetime and time zone extended notation yyyy-mm-ddThh:mm:ss+|-hh:mm.
  E8601LZ
    writes time values as local time, appending the Coordinated Universal Time (UTC) offset for the local SAS session, using the ISO 8601 extended time notation hh:mm:ss+|-hh:mm.
  E8601TM
    writes time values using the ISO 8601 extended notation hh:mm:ss.ffffff.
  E8601TZ
    adjusts time values to the Coordinated Universal Time (UTC) and writes the values using the ISO 8601 extended notation hh:mm:ss+|-hh:mm.
  MDYAMPM
    writes datetime values in the form mm/dd/yy hh:mm AM|PM. The year can be either two or four digits. This feature is new for SAS 9.2 Phase 2 and later.
  PERCENTN
    produces percentages, using a minus sign for negative values.
  SIZEK
    writes a numeric value in the form nK for kilobytes. This feature is new for SAS 9.2 Phase 2 and later.
  SIZEKB
    writes a numeric value in the form nKB for kilobytes. This feature is new for SAS 9.2 Phase 2 and later.
  SIZEKMG
    writes a numeric value in the form nKB for kilobytes, nMB for megabytes, or nGB for gigabytes. This feature is new for SAS 9.2 Phase 2 and later.
  VMSZN
    generates VMS and MicroFocus COBOL zoned numeric data.
• The following formats were previously documented in other publications and are now part of this document:
  WEEKUw.
    writes a week number in decimal format by using the U algorithm.
  WEEKVw.
    writes a week number in decimal format by using the V algorithm.
  WEEKWw.
    writes a week number in decimal format by using the W algorithm.
• The following format is enhanced:
  DATEw.
    In addition to writing dates in the form ddmmmyy or ddmmmyyyy, the DATEw. format now writes dates in the form dd-mmm-yyyy.
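The three date forms mentioned for the DATEw. format (ddmmmyy, ddmmmyyyy, and dd-mmm-yyyy) can be visualized with a short sketch. It is written in Python, not SAS, and the `date_forms` helper is hypothetical. It assumes uppercase three-letter English month abbreviations, and it does not model which format width selects which form, because that is not stated in this section.

```python
from datetime import date

# Month abbreviations are spelled out here to avoid locale-dependent output.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

def date_forms(d):
    """Return the three date layouts named in the DATEw. description."""
    mon = MONTHS[d.month - 1]
    return {
        "ddmmmyy": f"{d.day:02d}{mon}{d.year % 100:02d}",
        "ddmmmyyyy": f"{d.day:02d}{mon}{d.year:04d}",
        "dd-mmm-yyyy": f"{d.day:02d}-{mon}-{d.year:04d}",
    }

forms = date_forms(date(2009, 3, 5))
for layout, text in forms.items():
    print(f"{layout:12} {text}")
```

The third layout differs from the second only in the hyphens that separate the day, month, and year.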
Functions and CALL Routines
• The following functions and CALL routines are new:
  ALLCOMB
    generates all combinations of the values of n variables taken k at a time in a minimal change order.
  ALLPERM
    generates all permutations of the values of several variables in a minimal change order.
  ARCOSH
    returns the inverse hyperbolic cosine.
  ARSINH
    returns the inverse hyperbolic sine.
  ARTANH
    returns the inverse hyperbolic tangent.
  CALL ALLCOMB
    generates all combinations of the values of n variables taken k at a time in a minimal change order.
  CALL ALLCOMBI
    generates all combinations of the indices of n objects taken k at a time in a minimal change order.
  CALL GRAYCODE
    generates all subsets of n items in a minimal change order.
  CALL IS8601_CONVERT
    converts an ISO 8601 interval to datetime and duration values, or converts datetime and duration values to an ISO 8601 interval.
  CALL LEXCOMB
    generates all distinct combinations of the non-missing values of n variables taken k at a time in lexicographic order.
  CALL LEXCOMBI
    generates all combinations of the indices of n objects taken k at a time in lexicographic order.
  CALL LEXPERK
    generates all distinct permutations of the non-missing values of n variables taken k at a time in lexicographic order.
  CALL LEXPERM
    generates all distinct permutations of the non-missing values of several variables in lexicographic order.
  CALL SORTC
    sorts the values of character arguments.
  CALL SORTN
    sorts the values of numeric arguments.
  CATQ
    concatenates character or numeric values by using a delimiter to separate items and by adding quotation marks to strings that contain the delimiter.
  CHAR
    returns a single character from a specified position in a character string.
  CMISS
    counts the number of missing arguments.
  COUNTW
    counts the number of words in a character expression.
  DIVIDE
    returns the result of a division that handles special missing values for ODS output.
  ENVLEN
    returns the length of an environment variable.
  EUCLID
    returns the Euclidean norm of the non-missing arguments.
  FINANCE
    computes financial calculations such as depreciation, maturation, accrued interest, net present value, periodic savings, and internal rates of return.
  FINDW
    searches a character string for a word.
  FIRST
    returns the first character in a character string.
  GCD
    returns the greatest common divisor for one or more integers.
  GEODIST
    returns the geodetic distance between two latitude and longitude coordinates.
  GRAYCODE
    generates all subsets of n items in a minimal change order.
  INTFIT
    returns a time interval that is aligned between two dates.
  INTGET
    returns an interval based on three date or datetime values.
  INTSHIFT
    returns the shift interval that corresponds to the base interval.
  INTTEST
    returns 1 if a time interval is valid, and returns 0 if a time interval is invalid.
  LCM
    returns the smallest multiple that is exactly divisible by every number in a set of numbers.
  LCOMB
    computes the logarithm of the COMB function, that is, the logarithm of the number of combinations of n objects taken r at a time.
  LEXCOMB
    generates all distinct combinations of the non-missing values of n variables taken k at a time in lexicographic order.
  LEXCOMBI
    generates all combinations of the indices of n objects taken k at a time in lexicographic order.
  LEXPERK
    generates all distinct permutations of the non-missing values of n variables taken k at a time in lexicographic order.
  LEXPERM
    generates all distinct permutations of the non-missing values of several variables in lexicographic order.
  LFACT
    computes the logarithm of the FACT (factorial) function.
  LOG1PX
    returns the log of 1 plus the argument.
  LPERM
    computes the logarithm of the PERM function, that is, the logarithm of the number of permutations of n objects, with the option of including r number of elements.
  LPNORM
    returns the Lp norm of the second argument and subsequent non-missing arguments.
  MD5
    returns the result of the message digest of a specified string.
  MODEXIST
    determines whether a software image exists in the version of SAS that you have installed.
  MSPLINT
    returns the ordinate of a monotonicity-preserving interpolating spline.
  RENAME
    renames a member of a SAS library, an external file, or a directory.
  SUMABS
    returns the sum of the absolute values of the non-missing arguments.
  TRANSTRN
    removes or replaces all occurrences of a substring in a character string.
  WHICHC
    searches for a character value that is equal to the first argument, and returns the index of the first matching value.
  WHICHN
    searches for a numeric value that is equal to the first argument, and returns the index of the first matching value.
  ZIPCITYDISTANCE
    returns the geodetic distance between two ZIP code locations.
• The descriptions of the arguments in the following functions are enhanced:
  DOPEN
    opens a directory, and returns a directory identifier value.
  EXIST
    verifies the existence of a SAS library member.
  FOPEN
    opens an external file and returns a file identifier value.
  FEXIST
    verifies the existence of an external file that is associated with a fileref.
  FILENAME
    assigns or deassigns a fileref to an external file, a directory, or an output device.
  FILEREF
    verifies whether a fileref has been assigned for the current SAS session.
  LIBNAME
    assigns or deassigns a libref for a SAS library.
  LIBREF
    verifies that a libref has been assigned.
  MOPEN
    opens a file by directory ID and member name, and returns either the file identifier or a 0.
  PATHNAME
    returns the physical name of a SAS library or an external file, or returns a blank.
• The following functions were previously in Risk Dimensions, and are now in Base SAS:
  BLACKCLPRC
    calculates the call price for European options on futures, based on the Black model.
  BLACKPTPRC
    calculates the put price for European options on futures, based on the Black model.
  BLKSHCLPRC
    calculates the call price for European options, based on the Black-Scholes model.
  BLKSHPTPRC
    calculates the put price for European options, based on the Black-Scholes model.
  GARKHCLPRC
    calculates the call price for European options on stocks, based on the Garman-Kohlhagen model.
  GARKHPTPRC
    calculates the put price for European options on stocks, based on the Garman-Kohlhagen model.
  MARGRCLPRC
    calculates the call price for European options on stocks, based on the Margrabe model.
  MARGRPTPRC
    calculates the put price for European options on stocks, based on the Margrabe model.
• The following functions were previously in SAS/ETS, and are now in Base SAS:
  INTCINDEX
    returns the cycle index, given a date, time, or datetime value.
  INTCYCLE
    returns the date, time, or datetime interval at the next higher seasonal cycle, given a date, time, or datetime interval.
  INTFMT
    returns a recommended format, given a date, time, or datetime interval.
  INTINDEX
    returns the seasonal index, given a date, time, or datetime interval and value.
  INTSEAS
    returns the length of the seasonal cycle, given a date, time, or datetime interval.
• The following functions were previously in SAS High-Performance Forecasting, and are now in Base SAS:
  HOLIDAY
    returns the date of the specified holiday for the specified year.
  NWKDOM
    returns the date for the nth occurrence of a weekday for the specified month and year.
• The following functions were moved from SAS Language Reference: Dictionary to the SAS/IML documentation:
  MODULEIC
    calls an external routine and returns a character value (in the IML environment only).
  MODULEIN
    calls an external routine and returns a numeric value (in the IML environment only).
  CALL MODULEI
    calls an external routine without any return code (in the IML environment only).
• The following functions and CALL routines are enhanced:
  CALL POKE
    can now write floating-point numbers directly into memory on a 32-bit platform.
  CALL POKELONG
    can now write floating-point numbers directly into memory on 32-bit and 64-bit platforms.
  CALL SCAN
    returns the position and length of a given word from a character expression.
  DATDIF
    now has new values for the basis argument, and has a reference to a document that is published by the Securities Industry Association.
  FSEP
    now has an optional argument for a hexadecimal character delimiter.
  INDEX
    now has an example that shows how leading and trailing spaces are handled.
  INDEXW
    can now have alternate delimiters. If you use an alternate delimiter, then INDEXW does not recognize the end of the text as the end of data. Another example has also been added to the function.
  INTCK
    now has a fifth argument in the syntax. Retail calendar intervals that are ISO 8601 compliant and custom intervals have been added.
  INTNX
    can now use retail calendar intervals that are ISO 8601 compliant.
  INTCINDEX, INTCYCLE, INTFIT, INTFMT, INTGET, INTINDEX, INTSEAS, INTSHIFT, and INTTEST
    are now able to use retail calendar intervals that are ISO 8601 compliant.
  LAG
    now has more information about memory limits.
  LIBNAME
    now has sections that explain how to use the LIBNAME function with one, two, three, and four arguments.
  OPEN
    has a new fourth argument. This argument specifies whether the first argument is a two-level name (data set name) or a filename.
  SCAN
    returns the nth word from a character expression.
  TRANSTRN
    has been rewritten.
  TRANWRD
    has an updated Comparisons section and a new example.
  WEEK
    now has enhanced documentation for the U, V, and W descriptors.
  ZIPSTATE
    now has information about Army Post Office (APO) and Fleet Post Office (FPO) codes.
• The RX set of functions and CALL routines have been removed from the documentation. They have been replaced by a set of PRX functions and CALL routines, which have been available in previous versions of SAS, and which provide superior functionality. The following RX functions and CALL routines were removed:
  RXMATCH function
  RXPARSE function
  RXCHANGE CALL routine
  RXFREE CALL routine
  RXSUBSTR CALL routine
• The SCANQ function and the CALL SCANQ routine have been removed from the documentation and replaced by the superior functionality of the SCAN function and CALL SCAN routine.
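Several of the new numeric functions listed in this section (GCD, LCM, LCOMB, LFACT, EUCLID, and SUMABS) have simple mathematical definitions. The following sketch restates those definitions in Python. The helper names are hypothetical, the code is not the SAS implementation, and a SAS missing value is modeled as `None`, which is an assumption made only for this illustration.

```python
import math
from functools import reduce

def gcd_all(*values):
    """Greatest common divisor of one or more integers."""
    return reduce(math.gcd, values)

def lcm_all(*values):
    """Smallest positive multiple divisible by every (positive) integer given."""
    return reduce(lambda a, b: a * b // math.gcd(a, b), values)

def lfact(n):
    """Logarithm of n factorial, computed without forming n! itself."""
    return math.lgamma(n + 1)

def lcomb(n, r):
    """Logarithm of the number of combinations of n objects taken r at a time."""
    return lfact(n) - lfact(r) - lfact(n - r)

def euclid(*values):
    """Euclidean norm of the non-missing arguments (None models a missing value)."""
    return math.sqrt(sum(v * v for v in values if v is not None))

def sumabs(*values):
    """Sum of the absolute values of the non-missing arguments."""
    return sum(abs(v) for v in values if v is not None)

print(gcd_all(12, 18, 30), lcm_all(4, 6, 10))
print(math.exp(lcomb(5, 2)), euclid(3, 4, None), sumabs(-1, 2, None, -3))
```

Working with logarithms, as LCOMB and LFACT do, keeps intermediate results small even when the number of combinations or the factorial itself would be too large to represent directly.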
Informats
• The following informats are new:
  $BASE64X
    converts ASCII text to character data by using Base 64 encoding.
  $N8601B
    reads complete, truncated, and omitted forms of ISO 8601 duration, datetime, and interval values that are specified in either the basic or extended notations.
  $N8601E
    reads ISO 8601 duration, datetime, and interval values that are specified in the extended notation.
  B8601DA
    reads date values that are specified in the ISO 8601 basic notation yyyymmdd.
  B8601DN
    reads date values that are specified in the ISO 8601 basic notation yyyymmdd and returns SAS datetime values where the time portion of the value is 000000.
  B8601DT
    reads datetime values that are specified in the ISO 8601 basic notation yyyymmddThhmmssffffff.
  B8601DZ
    reads datetime values that are specified in the Coordinated Universal Time (UTC) time scale using the ISO 8601 datetime basic notation yyyymmddThhmmss+|-hhmm or yyyymmddThhmmssffffffZ.
  B8601TM
    reads time values that are specified in the ISO 8601 basic notation hhmmssffffff.
  B8601TZ
    reads time values that are specified in the ISO 8601 basic time notation hhmmssfffff+|-hhmm or hhmmssffffffZ.
  E8601DA
    reads date values that are specified in the ISO 8601 extended notation yyyy-mm-dd.
  E8601DN
    reads date values that are specified in the ISO 8601 extended notation yyyy-mm-dd and returns SAS datetime values where the time portion of the value is 000000.
  E8601DT
    reads datetime values that are specified in the ISO 8601 extended notation yyyy-mm-ddThh:mm:ss.ffffff.
  E8601DZ
    reads datetime values that are specified in the Coordinated Universal Time (UTC) time scale using the ISO 8601 datetime extended notation hh:mm:ss+|-hh:mm.fffff or hh:mm:ss.fffffZ.
  E8601LZ
    reads Coordinated Universal Time (UTC) values that are specified in the ISO 8601 extended notation hh:mm:ss+|-hh:mm.fffff or hh:mm:ss.fffffZ and converts them to the local time.
  E8601TM
    reads time values that are specified in the ISO 8601 extended notation hh:mm:ss.ffffff.
  E8601TZ
    reads time values that are specified in the ISO 8601 extended time notation hh:mm:ss+|-hh:mm.ffffff or hh:mm:ssZ.
  S3270FZDB
    reads zoned decimal data in which zeros have been left blank. This feature is new for SAS 9.2 Phase 2 and later.
  SIZEKMG
    reads numeric data that has the letter K, M, or G appended. This feature is new for SAS 9.2 Phase 2 and later.
  VMSZN
    reads VMS and MicroFocus COBOL zoned numeric data.
• The following informat is enhanced:
  TRAILSGN
    In addition to reading trailing plus (+) and minus (-) signs, the TRAILSGN informat now reads values that contain commas.
• The following informats were previously documented in other publications and are now part of this document:
  WEEKUw.
    reads the format of the number-of-week value within the year and returns a SAS date value using the U algorithm.
  WEEKVw.
    reads the format of the number-of-week value within the year and returns a SAS date value using the V algorithm.
  WEEKWw.
    reads the format of the number-of-week value within the year and returns a SAS date value using the W algorithm.
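The TRAILSGN enhancement noted in this section (reading values that contain commas as well as a trailing sign) can be pictured with a small parser. This is a Python sketch of the idea only: the `read_trailing_sign` helper is hypothetical, and it does not reproduce the informat's handling of widths, blanks, or invalid data.

```python
def read_trailing_sign(text):
    """Parse a number whose sign, if any, follows the digits.

    Commas are treated as thousands separators and removed before parsing.
    """
    s = text.strip().replace(",", "")
    sign = 1
    if s.endswith("-"):
        sign = -1
        s = s[:-1]
    elif s.endswith("+"):
        s = s[:-1]
    return sign * float(s)

negative = read_trailing_sign("1,234.50-")  # trailing minus sign, with a comma
positive = read_trailing_sign("42+")        # trailing plus sign
unsigned = read_trailing_sign("7")          # no sign at all
```

Stripping the commas before inspecting the last character keeps the sign logic independent of how the digits are grouped.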
Statements

• The following statements are new:

CHECKPOINT EXECUTE_ALWAYS
    enables you to execute the DATA or PROC step that immediately follows without considering the checkpoint-restart data.
FILENAME, SFTP Access Method
    enables you to access remote files by using the SFTP protocol.
SYSECHO
    enables IOM clients to manually track the progress of a segment of a submitted SAS program.

• The following statements are enhanced:

%INCLUDE
    • The filename of a file that is located in an aggregate storage location and does not have a valid SAS name can be used as a fileref if the filename is enclosed in quotation marks.
    • The maximum line limit is now 6K.
ABORT
    Two new optional arguments enable you to do the following:
    • cause the execution of the submitted statements to be canceled.
    • suppress the output of all variables to the SAS log.
ATTRIB
    The TRANSCODE=NO attribute is not supported by some SAS Workspace Server clients. In SAS 9.2, if the attribute is not supported, variables with TRANSCODE=NO are replaced (masked) with asterisks (*). Before SAS 9.2, variables with TRANSCODE=NO were transcoded.
BY
    The BY statement honors the linguistic collation of data that is sorted by using the SORT procedure with the SORTSEQ=LINGUISTIC option.
DATA
    Three new optional arguments enable you to do the following:
    • write a note to the SAS log for the beginning and end of each level of nesting DO statements.
    • specify the maximum number of nested LINK statements.
    • suppress the output of all variables to the SAS log.
DECLARE
    • Data set options can now be used with the dataset: argument tag.
    • Three new argument tags enable you to do the following:
        • maintain a summary count of hash object keys.
        • ignore duplicate keys when loading a data set into the hash object.
        • specify whether multiple data items are allowed for each key.
FILE
    • The filename of a file that is located in an aggregate storage location and does not have a valid SAS name can be used as a fileref if the filename is enclosed in quotation marks.
    • A new option enables you to specify a character string as an alternate delimiter (other than a blank) to be used for LIST output.
FILENAME, CATALOG Access Method
    You can now specify RECFM=S (stream-record format).
FILENAME, EMAIL (SMTP) Access Method
    • You can now specify a file attachment without an extension.
    • A new option enables you to specify the priority of the e-mail message.
FILENAME, FTP Access Method
    Six new FTP options enable you to do the following:
    • specify the name of an authentication domain metadata object that references credentials (user ID and password) in order to connect to the FTP server without your having to explicitly specify the credentials.
    • specify that the member type of DATA is automatically appended to the member name when you use the DIR option.
    • enable autocall macro retrieval of lowercase directory or member names from FTP servers.
    • save the user ID and password after the user ID and password prompt are successfully executed.
    • specify the line delimiter to use for variable-record formats: carriage return followed by a line feed, a line feed only, or a NULL character.
    • specify the length of the FTP server response message.
FILENAME, SFTP Access Method
    In SAS 9.2 Phase 2 and later, two new SFTP options enable you to do the following:
    • specify the fully qualified pathname and the filename of the batch file that contains the SFTP commands. These commands are submitted when the SFTP access method is executed.
    • specify an SFTP response wait time in milliseconds.
FILENAME, URL Access Method
    • N can now be used as an alias for a stream-record format (RECFM=S).
    • Five new URL options enable you to do the following:
        • specify the name of an authentication domain metadata object that references credentials (user ID and password) in order to connect to the proxy or Web server without your having to explicitly specify the credentials.
        • specify a fileref to which the header information is written when a file is opened using the URL access method. The header information is the same information that is written to the SAS log.
        • specify a user name with which you can access the proxy server.
        • specify a password with which you can access the proxy server.
        • specify the line delimiter to use when RECFM=V.
FILENAME, WebDAV Access Method
    • For SAS 9.2 Phase 2 and later, the FILENAME statement, WebDAV Access Method is available for use in the z/OS operating environment.
    • The SASBAMW keyword in the FILENAME statement syntax has been changed to WEBDAV.
    • Three new WebDAV options enable you to do the following:
        • access directory files.
        • specify that a file extension is automatically appended to the filename when you use the DIR option.
        • retry lowercase directory or member names from WebDAV servers by using an autocall macro.
FOOTNOTE
    A new argument enables you to specify formatting options for the ODS HTML, RTF, and PRINTER(PDF) destinations.
INFILE
    • The filename of a file that is located in an aggregate storage location and does not have a valid SAS name can be used as a fileref if the filename is enclosed in quotation marks.
    • A new option enables you to specify a character string as an alternate delimiter (other than a blank) to be used for LIST input.
    • A new optional argument specifies the type of device or the access method that is used if the fileref points to an input or output device or location that is not a physical file.
LIBNAME for WebDAV Server Access
    • When you assign a libref to a file on a WebDAV server, the path (URL location), user ID, and password are associated with that libref. After the first libref is assigned, the user ID and password will be validated on subsequent attempts to assign another libref to the same library.
    • SAS will honor a lock request on a file on a WebDAV server only if the file is already locked by another user.
    • Two new WebDAV options enable you to do the following:
        • specify the name of an authentication domain metadata object that references credentials (user ID and password) in order to connect to the WebDAV server without your having to explicitly specify the credentials.
        • prompt the user for an ID and password.
MERGE
    A new argument enables you to specify at least two existing SAS data sets by using either a numbered range list or a named prefix list.
SET
    • A new argument creates and names a variable that stores the name of the SAS data set from which the current observation is read. The stored name can be a data set name or a physical name. The physical name is the name by which the operating environment recognizes the file.
    • A new argument enables you to specify at least two existing SAS data sets by using either a numbered range list or a named prefix list.
TITLE
    added an argument that enables you to specify formatting options for the ODS HTML, RTF, and PRINTER(PDF) destinations.
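The two SET statement enhancements described above can be sketched as follows. The data set names SALES1 through SALES3 and the variables AMOUNT, DSN, and SOURCE are hypothetical. The example assumes that the argument that names the source-tracking variable is INDSNAME=, which the text above does not name explicitly.

```sas
/* Three small hypothetical data sets that share a numbered prefix */
data sales1; amount = 100; run;
data sales2; amount = 250; run;
data sales3; amount = 75;  run;

data all_sales;
   length source $41;
   /* sales1-sales3 is a numbered range list;              */
   /* "set sales:;" would be the named prefix list form.   */
   set sales1-sales3 indsname=dsn;
   /* The INDSNAME= variable is not written to the output  */
   /* data set, so copy its value to an ordinary variable. */
   source = dsn;
run;

proc print data=all_sales;
run;
```

Copying DSN into SOURCE is a design choice for this sketch: it keeps the name of the contributing data set in the output so that PROC PRINT can display it.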
System Options

• The following system options are new:

CGOPTIMIZE
    specifies the level of optimization to perform during code optimization. This feature is new for SAS 9.2 Phase 2 and later.
CMPMODEL=
    specifies the output model type for the MODEL procedure.
DEFLATION=
    specifies the level of compression for device drivers that support the Deflate compression algorithm.
DMSPGMLINESIZE=
    specifies the maximum number of characters in a Program Editor line.
EMAILFROM
    when sending an e-mail that uses SMTP, specifies whether the e-mail option FROM is required in either the FILE or the FILENAME statement.
FILESYNC=
    specifies when operating system buffers that contain contents of permanent SAS files are written to disk.
FONTEMBEDDING
    specifies whether font embedding is enabled in Universal Printer and SAS/GRAPH printing.
FONTRENDERING=
    specifies whether SAS/GRAPH devices that are based on the SASGDGIF, SASGDTIF, and SASGDIMG modules render fonts by using the operating system or by using the FreeType font engine.
GSTYLE
    specifies whether ODS styles can be used in the generation of graphs that are stored as GRSEG catalog entries.
HELPBROWSER=
    specifies the browser to use for SAS Help and ODS output. This feature is new for SAS 9.2 Phase 2 and later.
HELPHOST=
    specifies the name of the computer where the remote browser is to send Help and ODS output. This feature is new for SAS 9.2 Phase 2 and later.
HELPPORT=
    specifies the port number for the remote browser client. This feature is new for SAS 9.2 Phase 2 and later.
HTTPSERVERPORTMAX=
    specifies the highest port number that can be used by the SAS HTTP server for remote browsing. This feature is new for SAS 9.2 Phase 2 and later.
HTTPSERVERPORTMIN=
    specifies the lowest port number that can be used by the SAS HTTP server for remote browsing. This feature is new for SAS 9.2 Phase 2 and later.
IBUFNO=
    specifies an optional number of extra buffers to be allocated for navigating an index file. SAS automatically allocates a minimal number of buffers in order to navigate the index file. Typically, you do not need to specify extra buffers. However, using IBUFNO= to specify extra buffers could improve execution time by limiting the number of input/output operations that are required for a particular index file.
INTERVALDS=
    specifies a SAS data set that contains user-supplied holidays that can be used by the INTNX and INTCK functions. This feature is new for SAS 9.2 Phase 2 and later.
JPEGQUALITY
    specifies the JPEG quality factor that determines the ratio of image quality to the level of compression for JPEG files processed by the SAS/GRAPH JPEG device driver.
LRECL=
    specifies the default logical record length to use for reading and writing external files.
PDFACCESS
    specifies whether text and graphics from PDF documents can be read by screen readers for the visually impaired.
PDFASSEMBLY
    specifies whether PDF documents can be assembled.
PDFCOMMENT
    specifies whether PDF document comments can be modified.
PDFCONTENT
    specifies whether the contents of a PDF document can be changed.
PDFCOPY
    specifies whether text and graphics from a PDF document can be copied.
PDFFILLIN
    specifies whether PDF forms can be filled in.
PDFPAGELAYOUT
    specifies the page layout for PDF documents.
PDFPAGEVIEW
    specifies the page viewing mode for PDF documents.
PDFPASSWORD
    specifies the password to use to open a PDF document and the password used by a PDF document owner.
PDFPRINT
    specifies the resolution to print PDF documents.
PDFSECURITY
    specifies the printing permissions for PDF documents.
PRIMARYPROVIDERDOMAIN=
    specifies the domain name of the primary authentication provider. This feature is new for SAS 9.2 Phase 2 and later.
S2V
    specifies the starting position to begin reading a file specified in a %INCLUDE statement, an autoexec file, or an autocall macro file with a variable length format.
SORTVALIDATE
    specifies whether the SORT procedure verifies that a data set is sorted according to the variables in the BY statement when the sort indicator metadata indicates a user-specified sort order.
SQLCONSTDATETIME
    specifies whether the SQL procedure replaces references to the DATE, TIME, DATETIME, and TODAY functions in a query with their equivalent constant values before the query executes.
SQLREDUCEPUT
    for the SQL procedure, specifies the engine type that a query uses for which optimization is performed by replacing a PUT function in a query with a logically equivalent expression.
SQLREDUCEPUTOBS
    for the SQL procedure when the SQLREDUCEPUT= system option is set to NONE, specifies the minimum number of observations that must be in a table for PROC SQL to consider optimizing the PUT function in a query.
SQLREDUCEPUTVALUES=
    for the SQL procedure when the SQLREDUCEPUT= system option is set to NONE, specifies the minimum number of SAS format values that can exist in a PUT function expression in order for PROC SQL to consider optimizing the PUT function in a query.
SQLREMERGE
    specifies whether the SQL procedure can process queries that use remerging of data.
SQLUNDOPOLICY=
    specifies whether the SQL procedure keeps or discards updated data if errors occur while the data is being updated.
STEPCHKPT
    specifies whether to run a batch program in checkpoint-restart mode. In checkpoint-restart mode, if a batch program terminates during execution, the program can be restarted beginning with the DATA or PROC step that was executing when the program terminated.
STEPCHKPTLIB
    specifies the libref which identifies the library that contains the checkpoint-restart data.
STEPRESTART
    specifies whether to start a batch program using the checkpoint data.
SVGCONTROLBUTTONS
    specifies whether to display the paging control buttons and an index in a multi-page SVG document.
SVGHEIGHT
    specifies the height of the viewport unless the SVG output is embedded in another SVG output; specifies the value of the HEIGHT attribute of the outermost <svg> element in the SVG file.
SVGPRESERVEASPECTRATIO
    specifies whether to force uniform scaling of SVG output; sets the preserveAspectRatio attribute on the outermost <svg> element.
SVGTITLE
    specifies the title in the title bar of the SVG output; specifies the value of the <title> element in the SVG file.
SVGVIEWBOX
    specifies the coordinates, width, and height that are used to set the viewBox attribute on the outermost <svg> element, which enables SVG output to scale to the viewport.
SVGWIDTH
    specifies the width of the viewport unless the SVG output is embedded in another SVG output; specifies the value of the width attribute of the outermost <svg> element in the SVG file.
SVGX
    specifies the x-axis coordinate of one corner of the rectangular region into which an embedded <svg> element is placed; specifies the x attribute on the outermost <svg> element of the SVG file.
SVGY
    specifies the y-axis coordinate of one corner of the rectangular region into which an embedded <svg> element is placed; specifies the y attribute on the outermost <svg> element of the SVG file.
UPRINTCOMPRESSION
    specifies whether to enable compression of Universal Printer and SAS/GRAPH print files.
VARLENCHK=
    specifies the type of message to write to the SAS log if the length of a variable is increased when the input data set is read using the SET, MERGE, UPDATE, or MODIFY statements. This option is new for SAS 9.2 Phase 2.

• The following system options have a new argument:

DLDMGACTION=NOINDEX
    For data sets, automatically repairs the data set without the indexes and integrity constraints, deletes the index file, updates the data file to reflect the disabled indexes and integrity constraints, and limits the data file to be opened only in INPUT mode.
CMPOPT=FUNCDIFFERENCING
    specifies whether analytic derivatives are computed for user-defined functions.

• The following system options are enhanced:

ECHOAUTO
    SAS writes the autoexec file statements to the SAS log.
EMAILHOST
    You can now specify multiple Simple Mail Transfer Protocol (SMTP) mail servers.
E-mail system options
    All e-mail system options can now be set at any time. They are no longer restricted to being set when SAS starts.
OVP
    The default value for the OVP system option is now NOOVP.
SYSPRINTFONT=
    You can specify the name of a Universal Printer to which the SYSPRINTFONT system option setting applies.
• The syntax for the following system options is different when these system options are used after SAS starts, as compared to the syntax that is used when SAS starts. For the syntax to use when SAS starts, see the documentation for your operating environment. This feature is new for SAS 9.2 Phase 2:

APPEND=
    Appends a value to the existing value of the specified system option.
INSERT=
    Inserts the specified value as the first value of the specified system option.

• The following system options are no longer supported and have been removed from the documentation:

BATCH
    no longer has an impact on the settings for the LINESIZE, OVP, PAGESIZE, and SOURCE system options when SAS executes.
GISMAPS
    SAS 9.2 no longer supplies U.S. Census Tract maps for SAS/GIS.
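The following sketch shows how APPEND= and INSERT= might be used in an OPTIONS statement after SAS starts. It assumes that FMTSEARCH= is one of the system options that accepts appended and inserted values, and it uses the parenthesized form option=(system-option=argument). The librefs MYLIB and FIRST are hypothetical. Check the documentation for these options in your release before relying on this form.

```sas
/* Display the current value of the FMTSEARCH= system option */
proc options option=fmtsearch;
run;

/* Add a hypothetical libref to the end of the existing value */
options append=(fmtsearch=(mylib));

/* Add a hypothetical libref as the first value */
options insert=(fmtsearch=(first));

/* Display the value again to see the effect of both statements */
proc options option=fmtsearch;
run;
```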
DATA Step Object Attributes, Operators, and Methods

• For SAS 9.2 Phase 2 and later, the Java object language elements in Chapter 10, “Java Object Language Elements,” on page 2085 are now documented in SAS Language Reference: Dictionary.

• The following hash and hash iterator methods are new:

CLEAR
    removes all items from the hash object without deleting the hash object instance.
EQUALS
    determines whether two hash objects are equal.
FIND_NEXT
    sets the current list item to the next item in the current key’s multiple item list and sets the data for the corresponding data variables.
FIND_PREV
    sets the current list item to the previous item in the current key’s multiple item list and sets the data for the corresponding data variables.
HAS_NEXT
    determines whether there is a next item in the current key’s multiple data item list.
HAS_PREV
    determines whether there is a previous item in the current key’s multiple data item list.
REF
    consolidates the FIND and ADD methods into a single method call.
REMOVEDUP
    removes the data that is associated with the specified key’s current data item from the hash object.
REPLACEDUP
    replaces the data that is associated with the current key’s current data item with new data.
SETCUR
    specifies a starting key item for iteration.
SUM
    retrieves the summary value for a given key from the hash table and stores the value in a DATA step variable.
SUMDUP
    retrieves the summary value for the current data item of the current key and stores the value in a DATA step variable.

• The following hash object method is enhanced:

DEFINEDONE
    added an optional argument that enables recovery from memory failure when loading a data set into a hash object.

• The following hash object attribute is new:

ITEM_SIZE
    returns the size (in bytes) of an item in the hash object.

• The _NEW_ statement has been reclassified as an operator.

• For SAS 9.2 Phase 2 and later, the items in a multiple data item list are now maintained in the order in which you insert them.
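To illustrate how several of the hash object features above fit together, the following sketch loads a data set in which one key has more than one data item and then walks that key's item list with the FIND and FIND_NEXT methods. The data set ORDERS and the variables CUST and ITEM are hypothetical. The sketch assumes that the argument tag that allows multiple data items for each key is multidata:, which the text above does not name.

```sas
/* Hypothetical input: customer A has two items, customer B has one */
data orders;
   input cust $ item $;
   datalines;
A pen
A ink
B pad
;

data _null_;
   /* Define the key and data variables before the hash object is declared */
   length cust $8 item $8;
   declare hash h(dataset: 'orders', multidata: 'yes');
   h.defineKey('cust');
   h.defineData('item');
   h.defineDone();
   call missing(cust, item);

   /* Find the first item for key A, then step through the rest of its list */
   cust = 'A';
   rc = h.find();
   do while (rc = 0);
      put cust= item=;
      rc = h.find_next();
   end;
run;
```

The loop ends when FIND_NEXT returns a nonzero value, which indicates that the current key has no further items.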
PART 1

Dictionary of Language Elements

Chapter 1  Introduction to the SAS 9.2 Language Reference: Dictionary
Chapter 2  SAS Data Set Options
Chapter 3  Formats
Chapter 4  Functions and CALL Routines
Chapter 5  Informats
Chapter 6  Statements
Chapter 7  SAS System Options
CHAPTER 1

Introduction to the SAS 9.2 Language Reference: Dictionary

The SAS Language Reference: Dictionary
Syntax Conventions for the SAS Language
    Overview of Syntax Conventions for the SAS Language
    Syntax Components
    Style Conventions
    Special Characters
    References to SAS Libraries and External Files
The SAS Language Reference: Dictionary

SAS Language Reference: Dictionary provides detailed reference information for the major language elements of Base SAS software:

• data set options
• formats
• functions and CALL routines
• informats
• statements
• SAS system options
• hash and hash iterator DATA step component object attributes and methods
• Java DATA step component object attributes and methods

It also includes the following four appendixes:

• DATA step debugger
• Perl Regular Expression (PRX) Metacharacters
• SAS utility macro
• Recommended reading
For extensive conceptual information about the SAS System and the SAS language, including the DATA step, see SAS Language Reference: Concepts.
Syntax Conventions for the SAS Language

Overview of Syntax Conventions for the SAS Language

SAS uses standard conventions in the documentation of syntax for SAS language elements. These conventions enable you to easily identify the components of SAS syntax. The conventions can be divided into these parts:

• syntax components
• style conventions
• special characters
• references to SAS libraries and external files
Syntax Components

The components of the syntax for most language elements include a keyword and arguments. For some language elements only a keyword is necessary. For other language elements the keyword is followed by an equal sign (=).

keyword
    specifies the name of the SAS language element that you use when you write your program. Keyword is a literal that is usually the first word in the syntax. In a CALL routine, the first two words are keywords.

    In the following examples of SAS syntax, the keywords are the first words in the syntax:

        CHAR (string, position)
        CALL RANBIN (seed, n, p, x);
        ALTER (alter-password)
        BEST w.
        REMOVE

    In the following example, the first two words of the CALL routine are the keywords:

        CALL RANBIN(seed, n, p, x)

    The syntax of some SAS statements consists of a single keyword without arguments:

        DO;
        ... SAS code ...
        END;

    Some system options require that one of two keyword values be specified:

        DUPLEX | NODUPLEX

argument
    specifies a numeric or character constant, variable, or expression. Arguments follow the keyword or an equal sign after the keyword. The arguments are used by SAS to process the language element. Arguments can be required or optional. In the syntax, optional arguments are enclosed between angle brackets.

    In the following example, string and position follow the keyword CHAR. These arguments are required arguments for the CHAR function:

        CHAR (string, position)

    Each argument has a value. In the following example of SAS code, the argument string has a value of 'summer', and the argument position has a value of 4:

        x=char('summer', 4);

    In the following example, string and substring are required arguments, while modifiers and startpos are optional:

        FIND(string, substring <,modifiers> <,startpos>)

Note: In most cases, example code in SAS documentation is written in lowercase with a monospace font. You can use uppercase, lowercase, or mixed case in the code that you write.
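As a small runnable illustration of required and optional arguments, the following sketch calls the CHAR function with its two required arguments and calls the FIND function both without and with its optional arguments. The sample string, the variable names, the i modifier, and the start position are chosen for this example only.

```sas
data _null_;
   /* CHAR: both arguments are required */
   x = char('summer', 4);

   /* FIND: only the required arguments string and substring */
   pos1 = find('She sells seashells', 'sea');

   /* FIND: the optional modifiers and startpos arguments are also supplied; */
   /* the i modifier requests a search that ignores case                     */
   pos2 = find('She sells seashells', 'SEA', 'i', 5);

   put x= pos1= pos2=;
run;
```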
Style Conventions

The style conventions that are used in documenting SAS syntax include uppercase bold, uppercase, and italic:

UPPERCASE BOLD
    identifies SAS keywords such as the names of functions or statements. In the following example, the keyword ERROR is written in uppercase bold:

        ERROR;

UPPERCASE
    identifies arguments that are literals. In the following example of the CMPMODEL= system option, the literals include BOTH, CATALOG, and XML:

        CMPMODEL = BOTH | CATALOG | XML

italics
    identifies arguments or values that you supply. Items in italics represent user-supplied values that are one of the following:

    • nonliteral arguments

      In the following example of the LINK statement, the argument label is a user-supplied value and is therefore written in italics:

        LINK label;

    • nonliteral values that are assigned to an argument

      In the following example of the FORMAT statement, the argument DEFAULT is assigned the variable default-format:

        FORMAT variable-1 <, ..., variable-n format> <DEFAULT = default-format>;

    Items in italics can also be the generic name for a list of arguments from which you can choose (for example, attribute-list). If more than one of an item in italics can be used, the items are expressed as item-1, ..., item-n.
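The following sketch shows the two kinds of user-supplied values in working code: a statement label supplied to the LINK statement, and a format supplied to the DEFAULT= argument of the FORMAT statement. The label SHOW, the variables COST and QTY, and the particular formats are hypothetical choices for this example.

```sas
data _null_;
   /* DOLLAR8.2 applies to COST; DEFAULT= supplies a format for */
   /* variables that have no format of their own, such as QTY   */
   format cost dollar8.2 default=8.1;
   cost = 12.5;
   qty  = 3;

   link show;     /* "show" is the user-supplied label */
   return;

show:
   put cost= qty=;
   return;        /* returns to the statement that follows LINK */
run;
```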
Special Characters

The syntax of SAS language elements can contain the following special characters:

=
    an equal sign identifies a value for a literal in some language elements such as system options. In the following example of the MAPS system option, the equal sign sets the value of MAPS:

        MAPS = location-of-maps

< >
    angle brackets identify optional arguments. Any argument that is not enclosed in angle brackets is required. In the following example of the CAT function, at least one item is required:

        CAT (item-1 <, ..., item-n>)

|
    a vertical bar indicates that you can choose one value from a group of values. Values that are separated by the vertical bar are mutually exclusive. In the following example of the CMPMODEL= system option, you can choose only one of the arguments:

        CMPMODEL = BOTH | CATALOG | XML

...
    an ellipsis indicates that the argument or group of arguments following the ellipsis can be repeated. If the ellipsis and the following argument are enclosed in angle brackets, then the argument is optional. In the following example of the CAT function, the ellipsis indicates that you can have multiple optional items:

        CAT (item-1 <, ..., item-n>)

’value’ or “value”
    indicates that an argument enclosed in single or double quotation marks must have a value that is also enclosed in single or double quotation marks. In the following example of the FOOTNOTE statement, the argument text is enclosed in quotation marks:

        FOOTNOTE <n> <ods-format-options ’text’ | “text”>;

;
    a semicolon indicates the end of a statement or CALL routine. In the following example each statement ends with a semicolon:

        data namegame;
           length color name $8;
           color = 'black';
           name = 'jack';
           game = trim(color) || name;
        run;
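The conventions for optional, repeated, and quoted arguments can be seen in a short program. In this sketch the CAT function is called first with only its required item and then with additional optional items, and the FOOTNOTE statement is given quoted text. The variable names and text values are made up for the example.

```sas
data _null_;
   one   = cat('a');              /* only the required item-1 */
   three = cat('a', 'b', 'c');    /* item-1 plus two optional items */
   put one= three=;
run;

footnote 'Prepared for illustration only';   /* text enclosed in quotation marks */
footnote;                                    /* cancels the footnote */
```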
References to SAS Libraries and External Files

Many SAS statements and other language elements refer to SAS libraries and external files. You can choose whether to make the reference through a logical name (a libref or fileref) or use the physical filename enclosed in quotation marks. If you use a logical name, you usually have a choice of using a SAS statement (LIBNAME or FILENAME) or the operating environment’s control language to make the association. Several methods of referring to SAS libraries and external files are available, and some of these methods depend on your operating environment.

In the examples that use external files, SAS documentation uses the italicized phrase file-specification. In the examples that use SAS libraries, SAS documentation uses the italicized phrase SAS-library. Note that SAS-library is enclosed in quotation marks:

    infile file-specification obs = 100;
    libname libref 'SAS-library';
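The two ways of making a reference can be sketched as follows. The paths, the libref MYLIB, the fileref RAWIN, and the variables NAME and SCORE are all hypothetical, and the form of a physical name depends on your operating environment, so adjust them before running the code.

```sas
/* Associate logical names with a library and an external file */
/* (hypothetical Windows-style paths)                          */
libname  mylib 'c:\mydata';
filename rawin 'c:\mydata\scores.txt';

/* Reference the library by libref and the file by fileref */
data mylib.scores;
   infile rawin obs=100;
   input name $ score;
run;

/* Reference the same file by its physical name in quotation marks */
data work.scores2;
   infile 'c:\mydata\scores.txt' obs=100;
   input name $ score;
run;
```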
CHAPTER 2

SAS Data Set Options

Definition of Data Set Options
Syntax
Using Data Set Options
    Using Data Set Options with Input or Output SAS Data Sets
    How Data Set Options Interact with System Options
Data Set Options by Category
Dictionary
    ALTER= Data Set Option
    BUFNO= Data Set Option
    BUFSIZE= Data Set Option
    CNTLLEV= Data Set Option
    COMPRESS= Data Set Option
    DLDMGACTION= Data Set Option
    DROP= Data Set Option
    ENCRYPT= Data Set Option
    FILECLOSE= Data Set Option
    FIRSTOBS= Data Set Option
    GENMAX= Data Set Option
    GENNUM= Data Set Option
    IDXNAME= Data Set Option
    IDXWHERE= Data Set Option
    IN= Data Set Option
    INDEX= Data Set Option
    KEEP= Data Set Option
    LABEL= Data Set Option
    OBS= Data Set Option
    OBSBUF= Data Set Option
    OUTREP= Data Set Option
    POINTOBS= Data Set Option
    PW= Data Set Option
    PWREQ= Data Set Option
    READ= Data Set Option
    RENAME= Data Set Option
    REPEMPTY= Data Set Option
    REPLACE= Data Set Option
    REUSE= Data Set Option
    SORTEDBY= Data Set Option
    SPILL= Data Set Option
    TOBSNO= Data Set Option
    TYPE= Data Set Option
    WHERE= Data Set Option
    WHEREUP= Data Set Option
    WRITE= Data Set Option
Data Set Options Documented in Other SAS Publications
    SAS Companion for Windows
    SAS Companion for OpenVMS on HP Integrity Servers
    SAS Companion for UNIX Environments
    SAS Companion for z/OS
    SAS National Language Support: Reference Guide
    SAS Scalable Performance Data Engine: Reference
    SAS/ACCESS for Relational Databases: References
Definition of Data Set Options

Data set options specify actions that apply only to the SAS data set with which they appear. They let you perform the following operations:

• renaming variables
• selecting only the first or last n observations for processing
• dropping variables from processing or from the output data set
• specifying a password for a data set
Syntax

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

    (option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

• data scores(keep=team game1 game2 game3);
• data mydata(index=(b k) label='label for my data set' drop=p read=secret);
• data new(drop=i n index=(j combo=(x1 a1 a20 b1 b50 )));
• data idxdup2(compress=yes index=(ok1 ok2 ssn/unique ok3));
• proc print data=new(drop=year);
• set old(rename=(date=Start_Date));
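The statement fragments above can be placed in complete steps. The following sketch creates a small data set with KEEP= and LABEL= on the output data set and then prints it with DROP= and OBS= on the input data set. The data values, the label text, and the variable PRACTICE are hypothetical.

```sas
/* Output data set options: keep four variables and assign a label */
data scores(keep=team game1 game2 game3 label='Bowling scores');
   input team $ game1 game2 game3 practice;
   datalines;
Reds 120 135 150 99
Blues 110 142 138 87
Golds 131 128 144 92
;

/* Input data set options: several options separated by spaces */
proc print data=scores(drop=game3 obs=2);
run;
```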
Using Data Set Options

Using Data Set Options with Input or Output SAS Data Sets

Most SAS data set options can apply to either input or output SAS data sets in DATA steps or procedure (PROC) steps. If a data set option is associated with an input data set, the action applies to the data set that is being read. If the option appears in the DATA statement or after an output data set specification in a PROC step, SAS applies the action to the output data set. In the DATA step, data set options for output data sets must appear in the DATA statement, not in any OUTPUT statements that might be present.
Some data set options, such as COMPRESS=, are meaningful only when you create a SAS data set because they set attributes that exist for the duration of the data set. To change or cancel most data set options, you must re-create the data set. You can change other options (such as PW= and LABEL=) with PROC DATASETS. For more information, see the “DATASETS Procedure” in Base SAS Procedures Guide.

When data set options appear on both input and output data sets in the same DATA or PROC step, SAS applies data set options to input data sets first, before it evaluates programming statements or applies data set options to output data sets. Likewise, data set options that are specified for the data set being created are applied after programming statements are processed. For example, when you use the RENAME= data set option on an output data set, the new names are not associated with the variables until the DATA step ends.

In some instances, data set options conflict when they are used in the same statement. For example, you cannot specify both the DROP= and KEEP= data set options for the same variable in the same statement. Timing can also be an issue in some cases. For example, if you use KEEP= and RENAME= on a data set specified in the SET statement, KEEP= needs to use the original variable names. SAS processes KEEP= before the data set is read. The new names specified in RENAME= apply to the programming statements that follow the SET statement.
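The timing rules described above can be sketched in a single DATA step. The data sets OLD and NEW and the variables DATE, AMOUNT, REGION, NEXT_DAY, and TOTAL are hypothetical names for this example.

```sas
data old;
   date   = '01JAN2009'd;
   amount = 10;
   region = 'East';
run;

data new(rename=(amount=Total));
   /* KEEP= is processed first, so it lists the original name DATE */
   set old(keep=date amount rename=(date=Start_Date));

   /* Programming statements use the name from the input RENAME=   */
   /* (Start_Date) but still use AMOUNT, because the output        */
   /* RENAME= is not applied until the DATA step ends              */
   next_day = Start_Date + 1;
   amount   = amount * 2;
   format Start_Date next_day date9.;
run;

proc print data=new;
run;
```

In the printed output, the variable that the programming statements called AMOUNT appears under the name Total.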
How Data Set Options Interact with System Options

Many system options and data set options share the same name and have the same function. System options remain in effect for all DATA and PROC steps in a SAS job or session unless they are respecified. The data set option overrides the system option for the data set in the step in which it appears.

In this example, the OBS= system option in the OPTIONS statement specifies that only the first 100 observations are processed from any data set within the SAS job. The OBS= data set option in the SET statement, however, overrides the system option for data set TWO and specifies that only the first five observations are read from data set TWO. The PROC PRINT step prints the data set FINAL. This data set contains the first 5 observations from data set TWO, followed by the first 100 observations from data set THREE:

    options obs=100;

    data final;
       set two(obs=5) three;
    run;

    proc print data=final;
    run;
Data Set Options by Category
Table 2.1 Categories and Descriptions of SAS Data Set Options
Category    SAS Data Set Options    Description
Data Set Control
“ALTER= Data Set Option” on page 14
Assigns an ALTER password to a SAS file that prevents users from replacing or deleting the file, and enables access to a read- and write-protected file.
“BUFNO= Data Set Option” on page 15
Specifies the number of buffers to be allocated for processing a SAS data set.
“BUFSIZE= Data Set Option” on page 16
Specifies the size of a permanent buffer page for an output SAS data set.
“CNTLLEV= Data Set Option” on page 18
Specifies the level of shared access to a SAS data set.
“COMPRESS= Data Set Option” on page 19
Specifies how observations are compressed in a new output SAS data set.
“DLDMGACTION= Data Set Option” on page 21
Specifies the action to take when a SAS data set in a SAS library is detected as damaged.
“ENCRYPT= Data Set Option” on page 23
Specifies whether to encrypt an output SAS data set.
“GENMAX= Data Set Option” on page 27
Requests generations for a new data set, modifies the number of generations for an existing data set, and specifies the maximum number of versions.
“GENNUM= Data Set Option” on page 28
Specifies a particular generation of a SAS data set.
“INDEX= Data Set Option” on page 33
Defines an index for a new output SAS data set.
“LABEL= Data Set Option” on page 36
Specifies a label for a SAS data set.
“OBSBUF= Data Set Option” on page 43
Determines the size of the view buffer for processing a DATA step view.
“OUTREP= Data Set Option” on page 45
Specifies the data representation for the output SAS data set.
“PW= Data Set Option” on page 48
Assigns a READ, WRITE, and ALTER password to a SAS file, and enables access to a password-protected SAS file.
“PWREQ= Data Set Option” on page 49
Specifies whether to display a dialog box to enter a SAS data set password.
“READ= Data Set Option” on page 50
Assigns a READ password to a SAS file that prevents users from reading the file, unless they enter the password.
“REPEMPTY= Data Set Option” on page 53
Specifies whether a new, empty data set can overwrite an existing SAS data set that has the same name.
“REPLACE= Data Set Option” on page 54
Specifies whether a new SAS data set that contains data can overwrite an existing data set that has the same name.
“REUSE= Data Set Option” on page 55
Specifies whether new observations can be written to freed space in compressed SAS data sets.
“SORTEDBY= Data Set Option” on page 56
Specifies how a data set is currently sorted.
“SPILL= Data Set Option” on page 58
Specifies whether to create a spill file for non-sequential processing of a DATA step view.
“TOBSNO= Data Set Option” on page 65
Specifies the number of observations to send in a client/ server transfer.
“TYPE= Data Set Option” on page 65
Specifies the data set type for a specially structured SAS data set.
“WRITE= Data Set Option” on page 70
Assigns a WRITE password to a SAS file that prevents users from writing to a file, unless they enter the password.
Miscellaneous
“FILECLOSE= Data Set Option” on page 24
Specifies how a tape is positioned when a SAS data set is closed.
Observation Control
“FIRSTOBS= Data Set Option” on page 25
Specifies the first observation that SAS processes in a SAS data set.
“IN= Data Set Option” on page 32
Creates a Boolean variable that indicates whether the data set contributed data to the current observation.
“OBS= Data Set Option” on page 38
Specifies the last observation that SAS processes in a data set.
“POINTOBS= Data Set Option” on page 47
Specifies whether SAS creates compressed data sets whose observations can be randomly accessed or sequentially accessed.
“WHERE= Data Set Option” on page 67
Specifies specific conditions to use to select observations from a SAS data set.
“WHEREUP= Data Set Option” on page 68
Specifies whether to evaluate new observations and modified observations against a WHERE expression.
User Control of SAS Index Usage
“IDXNAME= Data Set Option” on page 29
Directs SAS to use a specific index to match the conditions of a WHERE expression.
“IDXWHERE= Data Set Option” on page 31
Specifies whether SAS uses an index search or a sequential search to match the conditions of a WHERE expression.
Variable Control
“DROP= Data Set Option” on page 22
For an input data set, excludes the specified variables from processing; for an output data set, excludes the specified variables from being written to the data set.
“KEEP= Data Set Option” on page 35
For an input data set, specifies the variables to process; for an output data set, specifies the variables to write to the data set.
“RENAME= Data Set Option” on page 51
Changes the name of a variable.
Dictionary
ALTER= Data Set Option
Assigns an ALTER password to a SAS file that prevents users from replacing or deleting the file, and enables access to a read- and write-protected file.
Valid in: DATA step and PROC steps
Category: Data Set Control
See: ALTER= Data Set Option in the documentation for your operating environment.
Syntax
ALTER=alter-password
Syntax Description
alter-password
must be a valid SAS name. See “Rules for Words and Names in the SAS Language” in SAS Language Reference: Concepts.
Details
The ALTER= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a read-protected, write-protected, or alter-protected SAS file.
When replacing a SAS data set that is protected with an ALTER password, the new data set inherits the ALTER password. To change the ALTER password for the new data set, use the MODIFY statement in the DATASETS procedure.
Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.
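As a minimal sketch of the behavior described above, the following steps assign an ALTER password, replace the protected data set, and then change the password. The data set WORK.SALES and the passwords SECRET1 and SECRET2 are hypothetical, and the old/new form in the MODIFY statement is assumed to follow the password syntax documented for the DATASETS procedure.

```sas
/* Assign an ALTER password when the data set is created. */
data work.sales(alter=secret1);
   region = 'East';
   amount = 100;
run;

/* Replacing the protected data set requires the ALTER   */
/* password; the new data set inherits the same password. */
data work.sales(alter=secret1);
   region = 'West';
   amount = 250;
run;

/* Change the password (assumed old/new syntax). */
proc datasets library=work nolist;
   modify sales(alter=secret1/secret2);
quit;
```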
See Also
Data Set Options:
“ENCRYPT= Data Set Option” on page 23
“PW= Data Set Option” on page 48
“READ= Data Set Option” on page 50
“WRITE= Data Set Option” on page 70
“File Protection” in SAS Language Reference: Concepts
“Manipulating Passwords” in “The DATASETS Procedure” in Base SAS Procedures Guide
BUFNO= Data Set Option
Specifies the number of buffers to be allocated for processing a SAS data set.
Valid in: DATA step and PROC steps
Category: Data Set Control
See: BUFNO= Data Set Option in the documentation for your operating environment.
Syntax
BUFNO= n | nK | hexX | MIN | MAX
Syntax Description
n | nK
specifies the number of buffers in multiples of 1 or 1,024 (K). For example, a value of 8 specifies 8 buffers, and a value of 1k specifies 1024 buffers.
hexX
specifies the number of buffers as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the value 2dx sets the number of buffers to 45.
MIN
sets the minimum number of buffers to 0, which causes SAS to use the minimum optimal value for the operating environment. This is the default.
MAX
sets the number of buffers to the maximum possible number in your operating environment, up to the largest four-byte, signed integer, which is 2^31-1, or approximately 2 billion.
Details
The buffer number is not a permanent attribute of the data set; it is valid only for the current SAS session or job.
BUFNO= applies to SAS data sets that are opened for input, output, or update.
A larger number of buffers can speed up execution time by limiting the number of input and output (I/O) operations that are required for a particular SAS data set. However, the improvement in execution time comes at the expense of increased memory consumption.
To reduce I/O operations on a small data set as well as speed execution time, allocate one buffer for each page of data to be processed. This technique is most effective if you read the same observations several times during processing.
Operating Environment Information: The default value for BUFNO= is determined by your operating environment and is set to optimize sequential access. To improve performance for direct (random) access, you should change the value for BUFNO=. For the default setting and possible settings for direct access, see the BUFNO= data set option in the SAS documentation for your operating environment.
Comparisons
- If the BUFNO= data set option is not specified, then the value of the BUFNO= system option is used. If both are specified in the same SAS session, the value specified for the BUFNO= data set option overrides the value specified for the BUFNO= system option.
- To request that SAS allocate the number of buffers based on the number of data set pages and index file pages, use the SASFILE global statement.
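The two comparisons above can be sketched as follows. WORK.SCORES and its variables are hypothetical, and the buffer counts 4 and 16 are arbitrary values picked for illustration, not tuning recommendations.

```sas
/* Session-wide setting from the BUFNO= system option. */
options bufno=4;

data work.scores;
   do id = 1 to 1000;
      score = id * 2;
      output;
   end;
run;

/* The data set option overrides the system option for */
/* this one open of WORK.SCORES.                       */
proc means data=work.scores(bufno=16);
   var score;
run;

/* Alternatively, let SAS size the buffers from the    */
/* number of pages by using the SASFILE statement.     */
sasfile work.scores load;
proc print data=work.scores(obs=5);
run;
sasfile work.scores close;
```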
See Also
Data Set Options:
“BUFSIZE= Data Set Option” on page 16
System Options:
“BUFNO= System Option” on page 1796
Statements:
“SASFILE Statement” on page 1702
BUFSIZE= Data Set Option
Specifies the size of a permanent buffer page for an output SAS data set.
Valid in: DATA step and PROC steps
Category: Data Set Control
Restriction: Use with output data sets only.
See: BUFSIZE= Data Set Option in the documentation for your operating environment.
Syntax
BUFSIZE= n | nK | nM | nG | hexX | MAX
Syntax Description
n | nK | nM | nG
specifies the page size in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); or 1,073,741,824 (gigabytes). For example, a value of 8 specifies a page size of 8 bytes, and a value of 4k specifies a page size of 4096 bytes.
The default is 0, which causes SAS to use the minimum optimal page size for the operating environment.
hexX
specifies the page size as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the value 2dx sets the page size to 45 bytes.
MAX
sets the page size to the maximum possible number in your operating environment, up to the largest four-byte, signed integer, which is 2^31-1, or approximately 2 billion bytes.
Details
The page size is the amount of data that can be transferred for a single I/O operation to one buffer. The page size is a permanent attribute of the data set and is used when the data set is processed.
A larger page size can speed up execution time by reducing the number of times SAS has to read from or write to the storage medium. However, the improvement in execution time comes at the cost of increased memory consumption.
To change the page size, use a DATA step to copy the data set and either specify a new page size or use the SAS default. To reset the page size to the default value in your operating environment, use BUFSIZE=0.
Note: If you use the COPY procedure to copy a data set to another library that is allocated with a different engine, the specified page size of the data set is not retained.
Operating Environment Information: The default value for BUFSIZE= is determined by your operating environment and is set to optimize sequential access. To improve performance for direct (random) access, you should change the value for BUFSIZE=. For the default setting and possible settings for direct access, see the BUFSIZE= data set option in the SAS documentation for your operating environment.
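Because the page size is fixed when a data set is created, changing it means copying the data set. The following sketch shows one way to do that; the data set names and the 64k value are hypothetical, and suitable page sizes depend on the operating environment.

```sas
/* Hypothetical source data set that uses the default page size. */
data work.narrow;
   do id = 1 to 100;
      output;
   end;
run;

/* Copy the data set and request a 64K page size. */
data work.wide(bufsize=64k);
   set work.narrow;
run;

/* Copy again with BUFSIZE=0 to return to the default. */
data work.reset(bufsize=0);
   set work.wide;
run;

/* PROC CONTENTS can be used to inspect the resulting page size. */
proc contents data=work.wide;
run;
```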
See Also
Data Set Options:
“BUFNO= Data Set Option” on page 15
System Options:
“BUFSIZE= System Option” on page 1798
CNTLLEV= Data Set Option
Specifies the level of shared access to a SAS data set.
Valid in: DATA step and PROC steps
Category: Data Set Control
Restriction: Specify for input data sets only.
Syntax
CNTLLEV=LIB | MEM | REC
Syntax Description
LIB
specifies that concurrent access is controlled at the library level. Library-level control restricts concurrent access to only one update process to the library.
MEM
specifies that concurrent access is controlled at the SAS data set (member) level. Member-level control restricts concurrent access to only one update or output process to the SAS data set. If the data set is open for an update or output process, then no other operation can access the data set. If the data set is open for an input process, then other concurrent input processes are allowed but no update or output process is allowed.
REC
specifies that concurrent access is controlled at the observation (record) level. Record-level control allows more than one update access to the same SAS data set, but it denies concurrent update of the same observation.
Details
The CNTLLEV= option specifies the level at which shared update access to a SAS data set is denied. A SAS data set can be opened concurrently by more than one SAS session or by more than one statement, window, or procedure within a single session. By default, SAS procedures permit the greatest degree of concurrent access possible while they guarantee the integrity of the data and the data analysis. Therefore, you do not typically use the CNTLLEV= data set option.
Use this option when
- your application controls the access to the data, such as in SAS Component Language (SCL), SAS/IML software, or DATA step programming
- you access data through an interface engine that does not provide member-level control of the data.
If you use CNTLLEV=REC and the SAS procedure needs member-level control for integrity of the data analysis, SAS prints a warning to the SAS log that inaccurate or unpredictable results can occur if the data are updated by another process during the analysis.
Examples
Example 1: Changing the Shared Access Level
In the following example, the first SET statement includes the CNTLLEV= data set option in order to override the default level of shared access from member-level control to record-level control. The second SET statement opens the SAS data set with the default member-level control.
   set datalib.fuel (cntllev=rec) point=obsnum;
   . . .
   set datalib.fuel;
   by area;
COMPRESS= Data Set Option
Specifies how observations are compressed in a new output SAS data set.
Valid in: DATA step and PROC steps
Category: Data Set Control
Restriction: Use with output data sets only.
Syntax
COMPRESS=NO | YES | CHAR | BINARY
Syntax Description
NO
specifies that the observations in a newly created SAS data set are uncompressed (fixed-length records).
YES | CHAR
specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RLE (Run Length Encoding). RLE compresses observations by reducing repeated consecutive characters (including blanks) to two-byte or three-byte representations.
Alias: ON
Tip: Use this compression algorithm for character data.
Note: COMPRESS=CHAR is accepted by Version 7 and later versions.
BINARY
specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RDC (Ross Data Compression). RDC combines run-length encoding and sliding-window compression to compress the file.
Tip: This method is highly effective for compressing medium to large (several hundred bytes or larger) blocks of binary data (numeric variables). Because the compression function operates on a single record at a time, the record length needs to be several hundred bytes or larger for effective compression.
Details
Compressing a file is a process that reduces the number of bytes required to represent each observation. Advantages of compressing a file include reduced storage requirements for the file and fewer I/O operations necessary to read or write to the data during processing. However, more CPU resources are required to read a compressed file (because of the overhead of uncompressing each observation), and there are situations where the resulting file size might increase rather than decrease.
Use the COMPRESS= data set option to compress an individual file. Specify the option for output data sets only; that is, data sets named in the DATA statement of a DATA step or in the OUT= option of a SAS procedure. Use the COMPRESS= data set option only when you are creating a SAS data file (member type DATA). You cannot compress SAS views, because they contain no data.
After a file is compressed, the setting is a permanent attribute of the file, which means that to change the setting, you must re-create the file. That is, to uncompress a file, specify COMPRESS=NO for a DATA step that copies the compressed file.
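A minimal sketch of creating a compressed file and later uncompressing it by copying, as described above. The data set names and the COMMENT variable are hypothetical; a long character variable that is mostly blank is used only because RLE targets repeated characters.

```sas
/* Create a compressed data file. The long, mostly blank */
/* character variable gives RLE something to shrink.     */
data work.notes(compress=yes);
   length comment $ 200;
   do id = 1 to 1000;
      comment = 'short text';
      output;
   end;
run;

/* The setting is permanent, so uncompressing means      */
/* re-creating the file with COMPRESS=NO.                */
data work.notes_plain(compress=no);
   set work.notes;
run;
```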
Comparisons
The COMPRESS= data set option overrides the COMPRESS= option on the LIBNAME statement and the COMPRESS= system option.
The data set option POINTOBS=YES, which is the default, determines that a compressed data set can be processed with random access (by observation number) rather than sequential access. With random access, you can specify an observation number in the FSEDIT procedure and the POINT= option in the SET and MODIFY statements.
When you create a compressed file, you can also specify REUSE=YES (as a data set option or system option) in order to track and reuse space. With REUSE=YES, new observations are inserted in space freed when other observations are updated or deleted. When the default REUSE=NO is in effect, new observations are appended to the existing file.
POINTOBS=YES and REUSE=YES are mutually exclusive; that is, they cannot be used together. REUSE=YES takes precedence over POINTOBS=YES: if you set REUSE=YES, SAS automatically sets POINTOBS=NO.
The TAPE engine supports the COMPRESS= data set option, but the engine does not support the COMPRESS= system option.
The XPORT engine does not support compression.
See Also
Data Set Options:
“POINTOBS= Data Set Option” on page 47
“REUSE= Data Set Option” on page 55
Statements:
“LIBNAME Statement” on page 1606
System Options:
“COMPRESS= System Option” on page 1816
“REUSE= System Option” on page 1925
“Compressing Data Files” in SAS Language Reference: Concepts
DLDMGACTION= Data Set Option
Specifies the action to take when a SAS data set in a SAS library is detected as damaged.
Valid in: DATA step and PROC steps
Category: Data Set Control
Syntax
DLDMGACTION=FAIL | ABORT | REPAIR | NOINDEX | PROMPT
Syntax Description
FAIL
stops the step and immediately issues an error message to the log. This is the default for batch mode.
ABORT
terminates the step, issues an error message to the log, and terminates the SAS session.
REPAIR
automatically repairs and rebuilds indexes and integrity constraints, unless the data file is truncated, and issues a warning message to the log. To restore a truncated data set, use the REPAIR statement in PROC DATASETS. This is the default for interactive mode.
NOINDEX
automatically repairs the data file without the indexes and integrity constraints, deletes the index file, updates the data file to reflect the disabled indexes and integrity constraints, and limits the data file to be opened only in INPUT mode. A warning is written to the SAS log instructing you to execute the PROC DATASETS REBUILD statement to correct or delete the disabled indexes and integrity constraints.
See also: “REBUILD Statement” in the “DATASETS Procedure” in Base SAS Procedures Guide
“Recovering Disabled Indexes and Integrity Constraints” in SAS Language Reference: Concepts
PROMPT
displays a dialog box that asks you to select the FAIL, ABORT, REPAIR, or NOINDEX action.
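The following sketch shows where the option might be specified. The libref MYLIB, its placeholder path, and the data set SALES are hypothetical, so the steps run only against a real library. The REBUILD step is an assumption about how the follow-up that is described for NOINDEX would look.

```sas
libname mylib 'SAS-library';   /* placeholder path */

/* Stop the step with an error if MYLIB.SALES is damaged. */
proc print data=mylib.sales(dldmgaction=fail);
run;

/* Repair the data file but leave indexes and integrity   */
/* constraints disabled if damage is detected.            */
data work.copy;
   set mylib.sales(dldmgaction=noindex);
run;

/* Afterward, rebuild the disabled indexes and integrity  */
/* constraints (assumed usage of the REBUILD statement).  */
proc datasets library=mylib nolist;
   rebuild sales;
quit;
```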
DROP= Data Set Option
For an input data set, excludes the specified variables from processing; for an output data set, excludes the specified variables from being written to the data set.
Valid in: DATA step and PROC steps
Category: Variable Control
Syntax
DROP=variable-1 <...variable-n>
Syntax Description
variable-1 <...variable-n>
lists one or more variable names. You can list the variables in any form that SAS allows.
Details
If the DROP= data set option is associated with an input data set, the variables are not available for processing. If the DROP= data set option is associated with an output data set, SAS does not write the variables to the output data set, but they are available for processing.
Comparisons
- The DROP= data set option differs from the DROP statement in these ways:
   - In DATA steps, the DROP= data set option can apply to both input and output data sets. The DROP statement applies only to output data sets.
   - In DATA steps, when you create multiple output data sets, use the DROP= data set option to write different variables to different data sets. The DROP statement applies to all output data sets.
   - In PROC steps, you can use only the DROP= data set option, not the DROP statement.
- The KEEP= data set option specifies a list of variables to be included in processing or to be written to the output data set.
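As a sketch of DROP= on output data sets, the following step writes different variables to two data sets while still using a dropped variable in the program logic. The data set names, variables, and values are hypothetical.

```sas
/* Hypothetical input data. */
data work.payroll;
   input name $ salary gender $;
   datalines;
Ann 52000 F
Bob 48000 M
;
run;

/* SALARY is dropped only from WORK.HIGH and GENDER only  */
/* from WORK.LOW. Both variables remain available to the  */
/* IF statement because DROP= is on the output data sets. */
data work.high(drop=salary) work.low(drop=gender);
   set work.payroll;
   if salary > 50000 then output work.high;
   else output work.low;
run;
```

A DROP statement in the same step would instead remove the listed variables from both output data sets.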
Examples
Example 1: Excluding Variables from Input
In this example, the variables SALARY and GENDER are not included in processing and they are not written to either output data set:
   data plan1 plan2;
      set payroll(drop=salary gender);
      if hired)
Syntax Description
old-name
the variable you want to rename.
new-name
the new name of the variable. It must be a valid SAS name.
Details
If you use the RENAME= data set option when you create a data set, the new variable name is included in the output data set. If you use RENAME= on an input data set, the new name is used in DATA step programming statements.
If you use RENAME= on an input data set that is used in a SAS procedure, SAS changes the name of the variable in that procedure.
If you use RENAME= with WHERE processing such as a WHERE statement or a WHERE= data set option, the new name is applied before the data is processed. You must use the new name in the WHERE expression.
If you use RENAME= in the same DATA step with either the DROP= or the KEEP= data set option, the DROP= and the KEEP= data set options are applied before RENAME=. You must use the old name in the DROP= and KEEP= data set options. You cannot drop and rename the same variable in the same statement.
Note: The RENAME= data set option has an effect only on data sets that are opened in output mode.
Comparisons
- The RENAME= data set option differs from the RENAME statement in the following ways:
   - The RENAME= data set option can be used in PROC steps and the RENAME statement cannot.
   - The RENAME statement applies to all output data sets. If you want to rename different variables in different data sets, you must use the RENAME= data set option.
   - To rename variables before processing begins, you must use a RENAME= data set option on the input data set or data sets.
- Use the RENAME statement or the RENAME= data set option when program logic requires that you rename variables such as two input data sets that have variables with the same name. To rename variables as a file management task, use the DATASETS procedure.
Examples
Example 1: Renaming a Variable at Time of Output
This example uses RENAME= in the DATA statement to show that the variable is renamed at the time it is written to the output data set. The variable keeps its original name, X, during the DATA step processing:
   data two(rename=(x=keys));
      set one;
      z=x+y;
   run;
Example 2: Renaming a Variable at Time of Input
This example renames variable X to a variable named KEYS in the SET statement, which is a rename before DATA step processing. KEYS, not X, is the name to use for the variable for DATA step processing.
   data three;
      set one(rename=(x=keys));
      z=keys+y;
   run;
Example 3: Renaming a Variable for a SAS Procedure with WHERE Processing
This example renames variable Score1 to a variable named Score2 for the PRINT procedure. Because the new name is applied before the data is processed, the new name must be specified in the WHERE statement.
   proc print data=test (rename=(score1=score2));
      where score2 gt 75;
   run;
See Also
Data Set Options:
“DROP= Data Set Option” on page 22
“KEEP= Data Set Option” on page 35
Statements:
“RENAME Statement” on page 1690
“The DATASETS Procedure” in Base SAS Procedures Guide
REPEMPTY= Data Set Option
Specifies whether a new, empty data set can overwrite an existing SAS data set that has the same name.
Valid in: DATA step and PROC steps
Category: Data Set Control
Restriction: Use with output data sets only.
Syntax
REPEMPTY=YES | NO
Syntax Description
YES
specifies that a new empty data set with a given name replaces an existing data set with the same name. This is the default.
Interaction: When REPEMPTY=YES and REPLACE=NO, then the data set is not replaced.
NO
specifies that a new empty data set with a given name does not replace an existing data set with the same name.
Tip: Use REPEMPTY=NO to prevent the following syntax error from replacing the existing data set B with the new empty data set B that is created by mistake:
   data mylib.a set b;
Tip: For both the convenience of replacing existing data sets with new ones that contain data and the protection of not overwriting existing data sets with new empty ones that are created by accident, set REPLACE=YES and REPEMPTY=NO.
Comparisons
- For an individual data set, the REPEMPTY= data set option overrides the REPEMPTY= option in the LIBNAME statement.
- The REPEMPTY= and REPLACE= data set options apply to both permanent and temporary SAS data sets. The REPLACE system option, however, only applies to permanent SAS data sets.
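The protection that REPEMPTY=NO provides can be sketched as follows. WORK.MASTER, WORK.UPDATES, and the subsetting condition are hypothetical; the condition is chosen so that the step produces no observations.

```sas
/* Existing data set that should be protected. */
data work.master;
   x = 1;
run;

/* Hypothetical source data. */
data work.updates;
   x = 2;
run;

/* The subsetting IF rejects every observation, so the    */
/* new WORK.MASTER would be empty. With REPEMPTY=NO, the  */
/* existing WORK.MASTER is expected to be left in place.  */
data work.master(repempty=no);
   set work.updates;
   if x > 100;
run;
```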
See Also
Data Set Options:
“REPLACE= Data Set Option” on page 54
Statement Options:
REPEMPTY= in the LIBNAME statement on page 1611
System Options:
“REPLACE System Option” on page 1924
REPLACE= Data Set Option
Specifies whether a new SAS data set that contains data can overwrite an existing data set that has the same name.
Valid in: DATA step and PROC steps
Category: Data Set Control
Restriction: Use with output data sets only.
Restriction: This option is valid only when creating a SAS data set.
Syntax
REPLACE=NO | YES
Syntax Description
NO
specifies that a new data set with a given name does not replace an existing data set with the same name.
YES
specifies that a new data set with a given name replaces an existing data set with the same name.
Comparisons
- The REPLACE= data set option overrides the REPLACE system option for the individual data set.
- The REPLACE system option only applies to permanent SAS data sets.
Example
Using the REPLACE= data set option in this DATA statement prevents SAS from replacing a permanent SAS data set named ONE in a library referenced by MYLIB:
   data mylib.one(replace=no);
SAS writes a message in the log that tells you that the file has not been replaced.
See Also
System Options:
“REPLACE System Option” on page 1924
REUSE= Data Set Option
Specifies whether new observations can be written to freed space in compressed SAS data sets.
Valid in: DATA step and PROC steps
Category: Data Set Control
Restriction: Use with output data sets only.
Syntax
REUSE=NO | YES
Syntax Description
NO
does not track and reuse space in compressed data sets. New observations are appended to the existing data set. Specifying the NO argument results in less efficient data storage if you delete or update many observations in the SAS data set.
YES
tracks and reuses space in compressed SAS data sets. New observations are inserted in the space that is freed when other observations are updated or deleted.
If you plan to use procedures that add observations to the end of SAS data sets (for example, the APPEND and FSEDIT procedures) with compressed data sets, use the REUSE=NO argument. REUSE=YES causes new observations to be added wherever there is space in the file, not necessarily at the end of the file.
Details
By default, new observations are appended to existing compressed data sets. If you want SAS to track and reuse the space that is freed when observations are deleted or updated, use the REUSE= data set option when you create a compressed SAS data set.
REUSE= has meaning only when you are creating new data sets with the COMPRESS=YES data set option or system option. Using the REUSE= data set option when you are accessing an existing SAS data set has no effect.
Comparisons
The REUSE= data set option overrides the REUSE= system option.
REUSE=YES takes precedence over POINTOBS=YES. For example, the following statement results in a data set that has POINTOBS=NO:
   data test(compress=yes pointobs=yes reuse=yes);
Because POINTOBS=YES is the default when you use compression, REUSE=YES causes POINTOBS= to change to NO.
See Also
Data Set Options:
“COMPRESS= Data Set Option” on page 19
System Options:
“REUSE= System Option” on page 1925
SORTEDBY= Data Set Option
Specifies how a data set is currently sorted.
Valid in: DATA step and PROC steps
Category: Data Set Control
Syntax
SORTEDBY=by-clause </ collate-name> | _NULL_
Syntax Description
by-clause </ collate-name>
indicates how the data is currently sorted.
   by-clause
   names the variables and options that you use in a BY statement in a PROC SORT step.
   collate-name
   names the collating sequence that is used for the sort. By default, the collating sequence is that of your operating environment. A slash (/) must precede the collating sequence.
   Operating Environment Information: For details about collating sequences, see the SAS documentation for your operating environment.
_NULL_
removes any existing sort indicator.
Details
SAS determines whether a data set is already sorted by the key variable or variables in ascending order by checking the sort indicator. The sort indicator is stored in the data set descriptor information and is set from a previous sort.
For detailed information on how the sort indicator is used and how it improves performance, see “The Sort Indicator” in SAS Language Reference: Concepts and the “SORTVALIDATE= System Option” in the SAS Language Reference: Dictionary.
In the following Sort Information section from CONTENTS procedure output, the Validated attribute is set to NO, which indicates that the sort indicator was set by using the SORTEDBY= data set option:
   Sort Information
   Sortedby         var1
   Validated        NO
   Character Set    ANSI
Comparisons
- Use the CONTENTS statement in the DATASETS procedure to see how a data set is sorted.
- The SORTEDBY= option indicates how the data is sorted, but does not cause a data set to be sorted.
Examples
This example uses the SORTEDBY= data set option to specify how the data are currently sorted. The data set ORDERS is sorted by PRIORITY and by the descending values of INDATE. Once the data set is created, the sort indicator is stored with it. These statements create the data set ORDERS and record the sort indicator:
   libname mylib 'SAS-library';
   options yearcutoff=1920;
   data mylib.orders(sortedby=priority descending indate);
      input priority 1. +1 indate date7. +1 office $ code $;
      format indate date7.;
      datalines;
   1 03may01 CH J8U
   1 21mar01 LA M91
   1 01dec00 FW L6R
   1 27feb99 FW Q2A
   2 15jan00 FW I9U
   2 09jul99 CH P3Q
   3 08apr99 CH H5T
   3 31jan99 FW D2W
   ;
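As a follow-up sketch, the steps below record a sort indicator, inspect it, and then remove it with _NULL_. The data set WORK.ORDERS and its values are hypothetical, and specifying SORTEDBY=_NULL_ in the MODIFY statement of PROC DATASETS is an assumption about where the option can be applied to an existing data set.

```sas
/* The data lines are already in PRIORITY order, so the   */
/* sort indicator states a fact about the data; it does   */
/* not cause any sorting.                                 */
data work.orders(sortedby=priority);
   input priority office $;
   datalines;
1 CH
1 LA
2 FW
3 CH
;
run;

/* Inspect the Sort Information section of the output.    */
proc contents data=work.orders;
run;

/* Remove the sort indicator (assumed MODIFY usage).      */
proc datasets library=work nolist;
   modify orders(sortedby=_null_);
quit;
```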
See Also
The CONTENTS statement in “The DATASETS Procedure” in Base SAS Procedures Guide
“The SORT Procedure” in Base SAS Procedures Guide
“The SQL Procedure” in Base SAS Procedures Guide
SPILL= Data Set Option
Specifies whether to create a spill file for non-sequential processing of a DATA step view.
Valid in: DATA step and PROC steps
Category: Data Set Control
Restriction: Valid only for a DATA step view
Syntax
SPILL=YES | NO
Syntax Description
YES
creates a spill file for non-sequential processing of a DATA step view. This is the default.
Interaction: A spill file is never created for sequential processing of a DATA step view.
Tip: A DATA step view that generates a large number of observations can result in a very large spill file. You must have enough disk space to accommodate the spill file.
NO
does not create a spill file or reduces the size of a spill file.
Interaction: For direct (random) access, a spill file is always created even if you specify SPILL=NO.
Tip: If you do not have enough disk space to accommodate a resulting spill file from a DATA step view that generates a large amount of data, specify SPILL=NO.
Tip: For SAS procedures that process BY-group data, consider specifying SPILL=NO in order to write only the current BY group to the spill file.
Details
When a DATA step view is opened for non-sequential processing, a spill file is created by default. The spill file contains the observations that are generated by a DATA step view. Subsequent requests for data read the observations from the spill file rather than execute the DATA step view again. The spill file is a temporary file in the WORK library.
Non-sequential processing includes the following access methods, which are supported by several SAS statements and procedures. How the SPILL= data set option operates with each of the access methods is described below:
random access
   retrieves observations directly either by an observation number or by the value of one or more variables through an index without reading all observations sequentially. Whether SPILL=YES or SPILL=NO, a spill file is always created, because the processing time to restart a DATA step view for each observation would be costly.
BY-group access
   uses a BY statement to process observations that are ordered, grouped, or indexed according to the values of one or more variables. SPILL=YES creates a spill file the size of all the data that is requested from the DATA step view. SPILL=NO writes only the current BY group to the spill file, so the spill file never grows larger than the size needed to store the largest BY group.
two-pass access
   performs multiple sequential passes through the data. With SPILL=NO, no spill file is created. Instead, after the first pass through the data, the DATA step view is restarted for each subsequent pass through the data. If small amounts of data are returned by the DATA step view for each restart, the processing time to restart the view might become significant.
   Note: With SPILL=NO, subsequent passes through the data could result in generating different data. Some processing might therefore require using a spill file; for example, a view that uses random functions or computes values that are based on the current time of day can generate different data on each pass.
Examples
Example 1: Using a Spill File for a Small Number of Large BY Groups
This example creates a DATA step view that generates a large amount of random data and uses the UNIVARIATE procedure with a BY statement. The example illustrates the effects of SPILL= with a small number of large BY groups. With SPILL=YES, all observations that are requested from the DATA step view are written to the spill file. With SPILL=NO, only the observations that are in the current BY group are written to the spill file. The information messages that are produced by this example show that the size of the spill file is reduced with SPILL=NO. However, the time to truncate the spill file for each BY group might add to the overall processing time for the DATA step view.

options msglevel=i;
data vw_few_large / view=vw_few_large;
   drop i;
   do byval = 'Group A', 'Group B', 'Group C';
      do i = 1 to 500000;
         r = ranuni(4);
         output;
      end;
   end;
run;
proc univariate data=vw_few_large (spill=yes) noprint;
   var r;
   by byval;
run;
proc univariate data=vw_few_large (spill=no) noprint;
   var r;
   by byval;
run;
Output 2.8 SAS Log Output

1    options msglevel=i;
2    data vw_few_large / view=vw_few_large;
3       drop i;
4       do byval = 'Group A', 'Group B', 'Group C';
5          do i = 1 to 500000;
6             r = ranuni(4);
7             output;
8          end;
9       end;
10   run;

NOTE: DATA STEP view saved on file WORK.VW_FEW_LARGE.
NOTE: A stored DATA STEP view cannot run under a different operating system.
NOTE: DATA statement used (Total process time):
      real time           21.57 seconds
      cpu time            1.31 seconds

11
12   proc univariate data=vw_few_large (spill=yes) noprint;
INFO: View WORK.VW_FEW_LARGE open mode: BY-group rewind.
13      var r;
14      by byval;
15   run;

INFO: View WORK.VW_FEW_LARGE opening spill file for output observations.
INFO: View WORK.VW_FEW_LARGE deleting spill file. File size was 22506120 bytes.
NOTE: View WORK.VW_FEW_LARGE.VIEW used (Total process time):
      real time           40.68 seconds
      cpu time            12.71 seconds
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time           57.63 seconds
      cpu time            13.12 seconds

16
17   proc univariate data=vw_few_large (spill=no) noprint;
INFO: View WORK.VW_FEW_LARGE open mode: BY-group rewind.
18      var r;
19      by byval;
20   run;

INFO: View WORK.VW_FEW_LARGE opening spill file for output observations.
INFO: View WORK.VW_FEW_LARGE truncating spill file. File size was 7502040 bytes.
NOTE: The above message was for the following by-group:
      byval=Group A
INFO: View WORK.VW_FEW_LARGE truncating spill file. File size was 7534800 bytes.
NOTE: The above message was for the following by-group:
      byval=Group B
INFO: View WORK.VW_FEW_LARGE truncating spill file. File size was 7534800 bytes.
NOTE: The above message was for the following by-group:
      byval=Group C
INFO: View WORK.VW_FEW_LARGE deleting spill file. File size was 32760 bytes.
NOTE: View WORK.VW_FEW_LARGE.VIEW used (Total process time):
      real time           11.03 seconds
      cpu time            10.95 seconds
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time           11.04 seconds
      cpu time            10.96 seconds
Example 2: Using a Spill File for a Large Number of Small BY Groups
This example creates a DATA step view that generates a large amount of random data and uses the UNIVARIATE procedure with a BY statement. This example illustrates the effects of SPILL= with a large number of small BY groups. With SPILL=YES, all observations that are requested from the DATA step view are written to the spill file. With SPILL=NO, only the observations that are in the current BY group are written to the spill file. The information messages that are produced by this example show that the size of the spill file is reduced with SPILL=NO, and with small BY groups, this results in a large disk space savings.

options msglevel=i;
data vw_many_small / view=vw_many_small;
   drop i;
   do byval = 1 to 100000;
      do i = 1 to 5;
         r = ranuni(4);
         output;
      end;
   end;
run;
proc univariate data=vw_many_small (spill=yes) noprint;
   var r;
   by byval;
run;
proc univariate data=vw_many_small (spill=no) noprint;
   var r;
   by byval;
run;
Output 2.9 SAS Log Output

1    options msglevel=i;
2    data vw_many_small / view=vw_many_small;
3       drop i;
4       do byval = 1 to 100000;
5          do i = 1 to 5;
6             r = ranuni(4);
7             output;
8          end;
9       end;
10   run;

NOTE: DATA STEP view saved on file WORK.VW_MANY_SMALL.
NOTE: A stored DATA STEP view cannot run under a different operating system.
NOTE: DATA statement used (Total process time):
      real time           0.56 seconds
      cpu time            0.03 seconds

11
12   proc univariate data=vw_many_small (spill=yes) noprint;
INFO: View WORK.VW_MANY_SMALL open mode: BY-group rewind.
13      var r;
14      by byval;
15   run;

INFO: View WORK.VW_MANY_SMALL opening spill file for output observations.
INFO: View WORK.VW_MANY_SMALL deleting spill file. File size was 8024240 bytes.
NOTE: View WORK.VW_MANY_SMALL.VIEW used (Total process time):
      real time           30.73 seconds
      cpu time            29.59 seconds
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time           30.96 seconds
      cpu time            29.68 seconds

16
17   proc univariate data=vw_many_small (spill=no) noprint;
INFO: View WORK.VW_MANY_SMALL open mode: BY-group rewind.
18      var r;
19      by byval;
20   run;

INFO: View WORK.VW_MANY_SMALL opening spill file for output observations.
INFO: View WORK.VW_MANY_SMALL truncating spill file. File size was 65504 bytes.
NOTE: The above message was for the following by-group:
      byval=410
INFO: View WORK.VW_MANY_SMALL truncating spill file. File size was 65504 bytes.
NOTE: The above message was for the following by-group:
      byval=819
INFO: View WORK.VW_MANY_SMALL truncating spill file. File size was 65504 bytes.
NOTE: The above message was for the following by-group:
      byval=1229
.
. Deleted many INFO and NOTE messages for BY groups
.
INFO: View WORK.VW_MANY_SMALL truncating spill file. File size was 65504 bytes.
NOTE: The above message was for the following by-group:
      byval=99894
INFO: View WORK.VW_MANY_SMALL deleting spill file. File size was 32752 bytes.
NOTE: View WORK.VW_MANY_SMALL.VIEW used (Total process time):
      real time           29.43 seconds
      cpu time            28.81 seconds
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time           29.43 seconds
      cpu time            28.81 seconds
Example 3: Using a Spill File with Two-Pass Access
This example creates a DATA step view that generates a large amount of random data and uses the TRANSPOSE procedure. The example illustrates the effects of SPILL= with a procedure that requires two-pass access processing. When PROC TRANSPOSE processes a DATA step view, the procedure must make two passes through the observations that the view generates. The first pass counts the number of observations and the second pass performs the transposition. With SPILL=YES, a spill file is created during the first pass, and the second pass reads the observations from the spill file. With SPILL=NO, a spill file is not created; after the first pass, the DATA step view is restarted.
Note that the first TRANSPOSE procedure does not include the SPILL= data set option. Even though a spill file is used by default, the informative message about the open mode is not displayed. SAS omits the message in order to reduce the number of messages in the SAS log for users who are not using the SPILL= data set option.

options msglevel=i;
data vw_transpose/view=vw_transpose;
   drop i j;
   array x[10000];
   do i = 1 to 10;
      do j = 1 to dim(x);
         x[j] = ranuni(4);
      end;
      output;
   end;
run;
proc transpose data=vw_transpose out=transposed;
run;
proc transpose data=vw_transpose(spill=yes) out=transposed;
run;
proc transpose data=vw_transpose(spill=no) out=transposed;
run;
Output 2.10 SAS Log Output

1    options msglevel=i;
2    data vw_transpose/view=vw_transpose;
3       drop i j;
4       array x[10000];
5       do i = 1 to 10;
6          do j = 1 to dim(x);
7             x[j] = ranuni(4);
8          end;
9          output;
10      end;
11   run;

NOTE: DATA STEP view saved on file WORK.VW_TRANSPOSE.
NOTE: A stored DATA STEP view cannot run under a different operating system.
NOTE: DATA statement used (Total process time):
      real time           0.68 seconds
      cpu time            0.18 seconds

12   proc transpose data=vw_transpose out=transposed;
13   run;

INFO: View WORK.VW_TRANSPOSE opening spill file for output observations.
INFO: View WORK.VW_TRANSPOSE deleting spill file. File size was 880000 bytes.
NOTE: View WORK.VW_TRANSPOSE.VIEW used (Total process time):
      real time           2.37 seconds
      cpu time            1.17 seconds
NOTE: There were 10 observations read from the data set WORK.VW_TRANSPOSE.
NOTE: The data set WORK.TRANSPOSED has 10000 observations and 11 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time           4.17 seconds
      cpu time            1.51 seconds

14   proc transpose data=vw_transpose (spill=yes) out=transposed;
INFO: View WORK.VW_TRANSPOSE open mode: sequential.
15   run;

INFO: View WORK.VW_TRANSPOSE reopen mode: two-pass.
INFO: View WORK.VW_TRANSPOSE opening spill file for output observations.
INFO: View WORK.VW_TRANSPOSE deleting spill file. File size was 880000 bytes.
NOTE: View WORK.VW_TRANSPOSE.VIEW used (Total process time):
      real time           0.95 seconds
      cpu time            0.92 seconds
NOTE: There were 10 observations read from the data set WORK.VW_TRANSPOSE.
NOTE: The data set WORK.TRANSPOSED has 10000 observations and 11 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time           1.01 seconds
      cpu time            0.98 seconds

16   proc transpose data=vw_transpose (spill=no) out=transposed;
INFO: View WORK.VW_TRANSPOSE open mode: sequential.
17   run;

INFO: View WORK.VW_TRANSPOSE reopen mode: two-pass.
INFO: View WORK.VW_TRANSPOSE restarting for another pass through the data.
NOTE: View WORK.VW_TRANSPOSE.VIEW used (Total process time):
      real time           1.34 seconds
      cpu time            1.32 seconds
NOTE: The View WORK.VW_TRANSPOSE was restarted 1 times. The following view statistics only apply to the last view restart.
NOTE: There were 10 observations read from the data set WORK.VW_TRANSPOSE.
NOTE: The data set WORK.TRANSPOSED has 10000 observations and 11 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time           1.42 seconds
      cpu time            1.40 seconds
See Also
Data Set Options:
“OBSBUF= Data Set Option” on page 43
TOBSNO= Data Set Option
Specifies the number of observations to send in a client/server transfer.
Valid in: DATA step and PROC steps
Category: Data Set Control
Restriction: The TOBSNO= option is valid only for data sets that are accessed through a SAS server via the REMOTE engine.

Syntax
TOBSNO=n

Syntax Description
n
   specifies the number of observations to be transmitted.
Details
If the TOBSNO= option is not specified, its value is calculated based on the observation length and the size of the server’s transmission buffers, as specified by the PROC SERVER statement TBUFSIZE= option.
The TOBSNO= option is valid only for data sets that are accessed through a SAS server via the REMOTE engine. If this option is specified for a data set opened for update or accessed via another engine, it is ignored.
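The following is a minimal sketch of how the option might be specified; it is not taken from the original text. It assumes that a SAS server is already running, that SHR1 is its server ID, and that a data set named ORDERS exists in the served library. The libref, server ID, library path, data set name, and the value 50 are all hypothetical.

```sas
/* Sketch under stated assumptions: SHR1 is a hypothetical server ID */
/* and 'SAS-library' stands for a library that the server can access. */
libname sales 'SAS-library' server=shr1;

/* Ask for 50 observations per client/server transfer when the       */
/* data set is read through the server. The data set is opened for   */
/* input here; the option is ignored for data sets opened for update. */
data work.local_copy;
   set sales.orders(tobsno=50);
run;
```

A value chosen this way overrides the value that would otherwise be calculated from the observation length and the server's transmission buffer size.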
See Also
“FOPEN Function” in SAS Component Language: Reference
TYPE= Data Set Option
Specifies the data set type for a specially structured SAS data set.
Valid in: DATA step and PROC steps
Category: Data Set Control

Syntax
TYPE=data-set-type

Syntax Description
data-set-type
   specifies the special type of the data set.
Details
Use the TYPE= data set option in a DATA step to create a special SAS data set in the proper format, or to identify the special type of the SAS data set in a procedure statement. You can use the CONTENTS procedure to determine the type of a data set.
Most SAS data sets do not have a specified type. However, there are several specially structured SAS data sets that are used by some SAS/STAT procedures. These SAS data sets contain special variables and observations, and they are usually created by SAS statistical procedures. Because most of the special SAS data sets are used with SAS/STAT software, they are described in the SAS/STAT User’s Guide. Some of the special data set types are CORR, COV, SSPC, EST, and FACTOR. Other values are available in other SAS software products and are described in the appropriate documentation.
Note: If you use a DATA step with a SET statement to modify a special SAS data set, you must specify the TYPE= option in the DATA statement. The data-set-type is not automatically copied to the data set that is created.
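The note above can be illustrated with a short sketch. This example is an illustration only and is not part of the original text: it assumes that the SASHELP.CLASS sample data set is available, and the output data set names CORR_OUT and CORR_COPY are hypothetical. The CORR procedure with the OUTP= option is used here as one way to obtain a TYPE=CORR data set.

```sas
/* Create a specially structured data set. The OUTP= data set that */
/* PROC CORR writes is a TYPE=CORR data set.                       */
proc corr data=sashelp.class outp=work.corr_out noprint;
   var height weight;
run;

/* The type is not copied automatically by a DATA step with a SET  */
/* statement, so it is specified again in the DATA statement.      */
data work.corr_copy(type=corr);
   set work.corr_out;
run;

/* PROC CONTENTS can be used to confirm the type of the copy.      */
proc contents data=work.corr_copy;
run;
```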
See Also
“Special SAS Data Sets” in the SAS/STAT User’s Guide
“The CONTENTS Procedure” in the Base SAS Procedures Guide
WHERE= Data Set Option
Specifies conditions to use to select observations from a SAS data set.
Valid in: DATA step and PROC steps
Category: Observation Control
Restriction: Cannot be used with the POINT= option in the SET and MODIFY statements.

Syntax
WHERE=(where-expression-1<logical-operator where-expression-n>)

Syntax Description
where-expression
   is an arithmetic or logical expression that consists of a sequence of operators, operands, and SAS functions. An operand is a variable, a SAS function, or a constant. An operator is a symbol that requests a comparison, logical operation, or arithmetic calculation. The expression must be enclosed in parentheses.
logical-operator
   can be AND, AND NOT, OR, or OR NOT.
Details
• Use the WHERE= data set option with an input data set to select observations that meet the condition specified in the WHERE expression before SAS brings them into the DATA or PROC step for processing. Selecting observations that meet the conditions of the WHERE expression is the first operation SAS performs in each iteration of the DATA step. You can also select observations that are written to an output data set. In general, selecting observations at the point of input is more efficient than selecting them at the point of output; however, there are some cases when selecting observations at the point of input is not practical or not possible.
• You can apply OBS= and FIRSTOBS= processing to WHERE processing. For more information, see “Processing a Segment of Data That is Conditionally Selected” in SAS Language Reference: Concepts.
• You cannot use the WHERE= data set option with the POINT= option in the SET and MODIFY statements.
• If you use both the WHERE= data set option and the WHERE statement in the same DATA step, SAS ignores the WHERE statement for data sets with the WHERE= data set option. However, you can use the WHERE= data set option with the WHERE command in SAS/FSP software.
Note: Using indexed SAS data sets can improve performance significantly when you are using WHERE expressions to access a subset of the observations in a SAS data set. See “Understanding SAS Indexes” in SAS Language Reference: Concepts for a complete discussion of WHERE expression processing with indexed data sets and a list of guidelines to consider before indexing your SAS data sets.
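The combination of WHERE= with OBS= that is described above can be sketched as follows. This is an illustration only, not part of the original text; it assumes that the SASHELP.CLASS sample data set is available and that the variable SEX holds the values 'F' and 'M'.

```sas
/* Sketch: OBS= is applied to the observations that the WHERE      */
/* expression selects, so this step is expected to print the first */
/* three observations for which SEX is 'F'.                        */
proc print data=sashelp.class(where=(sex='F') obs=3);
run;
```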
Comparisons
• The WHERE statement applies to all input data sets, whereas the WHERE= data set option selects observations only from the data set for which it is specified.
• Do not confuse the WHERE= data set option with the DROP= and KEEP= data set options. The DROP= and KEEP= data set options select variables for processing, while the WHERE= data set option selects observations.
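The first comparison can be made concrete with a small sketch. The data set names ONE, TWO, OPT_ONLY, and STMT_ALL are hypothetical and are used only for illustration.

```sas
/* Two small input data sets that share the variable X. */
data one;
   x=1; output;
   x=5; output;
run;
data two;
   x=2; output;
   x=6; output;
run;

/* WHERE= data set option: only ONE is subset.       */
/* OPT_ONLY contains the X values 5, 2, and 6.       */
data opt_only;
   set one(where=(x>3)) two;
run;

/* WHERE statement: both input data sets are subset. */
/* STMT_ALL contains the X values 5 and 6.           */
data stmt_all;
   set one two;
   where x>3;
run;
```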
Examples
Example 1: Selecting Observations from an Input Data Set
This example uses the WHERE= data set option to subset the SALES data set as it is read into another data set:

data whizmo;
   set sales(where=(product='whizmo'));
run;

Example 2: Selecting Observations from an Output Data Set
This example uses the WHERE= data set option to subset the SALES output data set:

data whizmo(where=(product='whizmo'));
   set sales;
run;
See Also
Statements:
“WHERE Statement” on page 1738
“WHERE-Expression Processing” in SAS Language Reference: Concepts
WHEREUP= Data Set Option
Specifies whether to evaluate new observations and modified observations against a WHERE expression.
Valid in: DATA step and PROC steps
Category: Observation Control

Syntax
WHEREUP=NO | YES

Syntax Description
NO
   does not evaluate added observations and modified observations against a WHERE expression.
YES
   evaluates added observations and modified observations against a WHERE expression.

Details
Specify WHEREUP=YES when you want any added observations or modified observations to match a specified WHERE expression.
Examples
Example 1: Accepting Updates That Do Not Match the WHERE Expression
This example shows how WHEREUP= permits observations to be updated and added even though the modified observation does not match the WHERE expression:

data a;
   x=1;
   output;
   x=2;
   output;
run;
data a;
   modify a(where=(x=1) whereup=no);
   x=3;
   replace; /* Update does not match WHERE expression */
   output;  /* Add does not match WHERE expression */
run;

In this example, SAS updates the observation and adds the new observation to the data set.

Example 2: Rejecting Updates That Do Not Match the WHERE Expression
In this example, WHEREUP= does not permit observations to be updated or added when the update and the add do not match the WHERE expression:

data a;
   x=1;
   output;
   x=2;
   output;
run;
data a;
   modify a(where=(x=1) whereup=yes);
   x=3;
   replace; /* Update does not match WHERE expression */
   output;  /* Add does not match WHERE expression */
run;

In this example, SAS neither updates the observation nor adds the new observation to the data set.

See Also
Data Set Option:
“WHERE= Data Set Option” on page 67
WRITE= Data Set Option
Assigns a WRITE password to a SAS file that prevents users from writing to the file unless they enter the password.
Valid in: DATA step and PROC steps
Category: Data Set Control

Syntax
WRITE=write-password

Syntax Description
write-password
   must be a valid SAS name.
   See: “Rules for Words and Names in the SAS Language” in SAS Language Reference: Concepts
Details
The WRITE= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a write-protected SAS file.
Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.
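Both uses of the option, assigning a password and accessing a write-protected file, can be sketched as follows. This is an illustration only and is not part of the original text; the data set names PROTECTED and MORE and the password GREEN are hypothetical.

```sas
/* Assign a WRITE password when the data set is created. */
data work.protected(write=green);
   x=1;
run;

/* A second, unprotected data set to append. */
data work.more;
   x=2;
run;

/* Writing to the protected data set: the password is supplied */
/* with the WRITE= data set option on the BASE= data set.      */
proc append base=work.protected(write=green) data=work.more;
run;

/* Reading is not restricted by a WRITE password, so no        */
/* password is supplied here.                                  */
proc print data=work.protected;
run;
```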
See Also
Data Set Options:
“ALTER= Data Set Option” on page 14
“ENCRYPT= Data Set Option” on page 23
“PW= Data Set Option” on page 48
“READ= Data Set Option” on page 50
“Manipulating Passwords” in “The DATASETS Procedure” in Base SAS Procedures Guide
Data Set Options Documented in Other SAS Publications
In addition to data set options documented in SAS Language Reference: Dictionary, data set options are also documented in the following publications:

SAS Companion for Windows
Data Set Option
Description
SGIO=
Activates the Scatter/Gather I/O feature for a data set.
SAS Companion for OpenVMS on HP Integrity Servers
The data set options listed here are documented only in SAS Companion for OpenVMS on HP Integrity Servers. Other data set options in SAS Companion for OpenVMS on HP Integrity Servers contain information specific to the OpenVMS operating environment, where the main documentation is in SAS Language Reference: Dictionary. These latter data set options are not listed here.
Data Set Option
Description
ALQ=
Specifies how many disk blocks to initially allocate to a new SAS data set.
ALQMULT=
Specifies the number of pages that are preallocated to a file.
BKS=
Specifies the bucket size for a new data set.
CACHENUM=
Specifies the number of I/O data caches used per SAS file.
CACHESIZE=
Controls the size of the I/O data cache that is allocated for a SAS file.
DEQ=
Specifies how many disk blocks to add when OpenVMS automatically extends a SAS data set during a write operation.
DEQMULT=
Specifies the number of pages to extend a SAS file.
LOCKREAD
Specifies whether to read a record if a lock cannot be obtained for the record.
LOCKWAIT
Indicates whether SAS should wait for a locked record.
MBF
Specifies the multibuffer count for a data set.
SAS Companion for UNIX Environments
The data set options listed here are documented only in SAS Companion for UNIX Environments. Other data set options in SAS Companion for UNIX Environments contain information specific to the UNIX operating environment, where the main documentation is in SAS Language Reference: Dictionary. These latter data set options are not listed here.
Data Set Option
Description
ALTER=
Specifies a password for a SAS file that prevents users from replacing or deleting the file, but permits read and write access.
BUFNO=
Specifies the number of buffers to be allocated for processing a SAS data set.
BUFSIZE=
Specifies the size of a permanent buffer page for an output SAS data set.
FILECLOSE=
Specifies how a tape is positioned when a SAS data set is closed.
PW=
Assigns a READ, WRITE, or ALTER password to a SAS file, and enables access to a password-protected SAS file.
USEDIRECTIO
Turns on direct I/O for a library that contains the file to which the ENABLEDIRECTIO option has been applied.
SAS Companion for z/OS
The data set options listed here are documented only in SAS Companion for z/OS. Other data set options in SAS Companion for z/OS contain information specific to the z/OS operating environment, where the main documentation is in SAS Language Reference: Dictionary. These latter data set options are not listed here.
Data Set Option
Description
ALTER=
Assigns an alter password to a SAS file and enables access to a password-protected SAS file.
BUFSIZE=
Specifies the permanent buffer page size for an output SAS data set.
FILEDISP=
Specifies the initial disposition for a sequential access bound SAS data library.
SAS National Language Support: Reference Guide
Data Set Option
Description
ENCODING=
Overrides the encoding to use for reading or writing a SAS data set.
SAS Scalable Performance Data Engine: Reference
Data Set Option
Description
ASYNCINDEX=
Specifies to create the indexes in parallel when creating multiple indexes on an SPD Engine data set.
BYNOEQUALS=
Specifies whether the output order of data set observations with identical values for the BY variable is guaranteed to be in data set order.
BYSORT=
Specifies for the SPD Engine to perform an automatic sort when it encounters a BY statement.
COMPRESS=
Specifies to compress SPD Engine data sets on disk as they are being created.
ENCRYPT=
Specifies whether to encrypt an output SPD Engine data set.
ENDOBS=
Specifies the end observation number in a user-defined range of observations to be processed.
IDXWHERE=
Specifies to use indexes when processing WHERE expressions in the SPD Engine.
IOBLOCKSIZE=
Specifies the number of observations in a block to be stored in or read from an SPD Engine data component file that is compressed.
LISTFILES=
Specifies whether the CONTENTS procedure lists the complete pathnames of all the component files.
PADCOMPRESS=
Specifies a number of bytes to add to compression blocks in a data set opened for UPDATE.
PARTSIZE=
When an SPD Engine data set is created, specifies the largest size (in megabytes) that the data component partitions can be. This is a fixed size. This specification applies only to the data component files.
STARTOBS=
Specifies the starting observation number in a user-defined range of observations to be processed.
SYNCADD=
Specifies to process one observation at a time or multiple observations at a time.
THREADNUM=
Specifies the number of I/O threads the SPD Engine can spawn for processing an SPD Engine data set.
UNIQUESAVE=
Specifies to save observations with non-unique key values (the rejected observations) to a separate data set when appending or inserting observations to data sets with unique indexes.
WHERENOINDEX=
Specifies, when making WHERE expression evaluations, a list of indexes to exclude.
SAS/ACCESS for Relational Databases: References
Data Set Option
Description
AUTHID=
Enables you to qualify the specified table with an authorization ID, user ID, or group ID.
AUTOCOMMIT=
Specifies whether to enable the DBMS autocommit capability.
BL_ALLOW_READ_ACCESS=
Specifies that the original table data is still visible to readers during bulk load.
BL_ALLOW_WRITE_ACCESS=
Specifies that table data is still accessible to readers and writers while import is in progress.
BL_BADDATA_FILE=
Specifies where to put records that failed to process internally.
BL_BADFILE=
Identifies a file that contains records that were rejected during a bulk load.
BL_CODEPAGE=
Identifies the codepage that the DBMS engine uses to convert SAS character data to the current database codepage during a bulk load.
BL_CONTROL=
Identifies a file containing SQLLDR control statements that describe the data to be included in a bulk load.
BL_COPY_LOCATION=
Specifies the directory to which DB2 saves a copy of the loaded data. This option is valid only when used in conjunction with BL_RECOVERABLE=YES.
BL_CPU_PARALLELISM=
Specifies the number of processes or threads that are used when building table objects.
BL_DATA_BUFFER_SIZE=
Specifies the total amount of memory that is allocated for the bulk load utility to use as a buffer for transferring data.
BL_DATAFILE=
Identifies the file that contains the data that is loaded or appended into a DBMS table during a bulk load.
BL_DB2CURSOR=
Specifies a string that contains a valid DB2 SELECT statement that points to either local or remote objects (tables or views).
BL_DB2DEVT_PERM=
Specifies the unit address or generic device type that is used for the permanent data sets created by the LOAD utility, as well as SYSIN, SYSREC, and SYSPRINT when they are allocated by SAS.
BL_DB2DEVT_TEMP=
Specifies the unit address or generic device type that is used for the temporary data sets created by the LOAD utility (PNCH, COPY1, COPY2, RCPY1, RCPY2, WORK1, WORK2).
BL_DB2DISC=
Specifies the SYSDISC data set name for the LOAD utility.
BL_DB2ERR=
Specifies the SYSERR data set name for the LOAD utility.
BL_DB2IN=
Specifies the SYSIN data set name for the LOAD utility.
BL_DB2LDCT1=
Specifies a string in the LOAD utility control statement, between LOAD DATA and INTO TABLE.
BL_DB2LDCT2=
Specifies a string in the LOAD utility control statement, between INTO TABLE table-name and (field-specification).
BL_DB2LDCT3=
Specifies a string in the LOAD utility control statement, after (field-specification).
BL_DB2LDEXT=
Specifies the mode of execution for the DB2 LOAD utility.
BL_DB2MAP=
Specifies the SYSMAP data set name for the LOAD utility.
BL_DB2PRINT=
Specifies the SYSPRINT data set name for the LOAD utility.
BL_DB2PRNLOG=
Determines whether the SYSPRINT output is written to the SAS log.
BL_DB2REC=
Specifies the SYSREC data set name for the LOAD utility.
BL_DB2RECSP=
Determines the number of cylinders to specify as the primary allocation for the SYSREC data set when it is created.
BL_DB2RSTRT=
Tells the LOAD utility whether the current load is a restart and, for a restart, indicates where to begin.
BL_DB2SPC_PERM=
Determines the number of cylinders to specify as the primary allocation for the permanent data sets that are created by the LOAD utility.
BL_DB2SPC_TEMP=
Determines the number of cylinders to specify as the primary allocation for the temporary data sets that are created by the LOAD utility.
BL_DB2TBLXST=
Indicates whether the LOAD utility runs against an existing table.
BL_DB2UTID=
Specifies a unique identifier for a given run of the DB2 LOAD utility.
BL_DELETE_DATAFILE=
Deletes the data file that is created for the DBMS bulk load facility.
BL_DELIMITER=
Overrides the default delimiter character for separating columns of data during data transfer or retrieval during bulk load or bulk unload.
BL_DIRECT_PATH=
Sets the Oracle SQL*Loader DIRECT option.
BL_DISCARDFILE=
Identifies the file that contains the records that were filtered out of a bulk load because they did not match the criteria specified in the CONTROL file.
BL_DISCARDS=
Specifies whether and when to stop processing a job, based on the number of discarded records.
BL_DISK_PARALLELISM=
Specifies the number of processes or threads that are used when writing data to disk.
BL_ERRORS=
Specifies whether and when to stop processing a job based on the number of failed records.
BL_EXCEPTION=
Specifies the exception table into which rows in error are copied.
BL_FAILEDDATA=
Specifies where to put records that could not be written to the database.
BL_INDEX_OPTIONS=
Enables you to specify SQL*Loader Index options with bulk loading.
BL_INDEXING_MODE=
Indicates which scheme the DB2 LOAD utility should use with respect to index maintenance.
BL_KEEPIDENTITY=
Determines whether the identity column that is created during a bulk load is populated with values generated by Microsoft SQL Server or with values provided by the user.
BL_KEEPNULLS=
Indicates how NULL values in Microsoft SQL Server columns that accept NULL are handled during a bulk load.
BL_LOAD_METHOD=
Specifies the method by which data is loaded into an Oracle table during bulk loading.
BL_LOAD_REPLACE=
Specifies whether DB2 appends or replaces rows during bulk loading.
BL_LOG=
Identifies a log file that contains information such as statistics and error information for a bulk load.
BL_METHOD=
Specifies which bulk loading method to use for DB2.
BL_OPTIONS=
Passes options to the DBMS bulk load facility, affecting how it loads and processes data.
BL_PARFILE=
Creates a file that contains the SQL*Loader command line options.
BL_PORT_MAX=
Sets the highest available port number for concurrent uploads.
BL_PORT_MIN=
Sets the lowest available port number for concurrent uploads.
BL_PRESERVE_BLANKS=
Determines how the SQL*Loader handles requests to insert blank spaces into CHAR/VARCHAR2 columns with the NOT NULL constraint.
BL_RECOVERABLE=
Determines whether the LOAD process is recoverable.
BL_REMOTE_FILE=
Specifies the base filename and location of DB2 LOAD temporary files.
BL_RETRIES=
Specifies the number of attempts to make for a job.
BL_RETURN_WARNINGS_AS_ERRORS=
Specifies whether SQL*Loader (bulk-load) warnings should surface in SAS through the SYSERR macro as warnings or as errors.
BL_SERVER_DATAFILE=
Specifies the name and location of the data file as seen by the DB2 server instance.
BL_SQLLDR_PATH=
Specifies the location of the SQLLDR executable file.
BL_SUPPRESS_NULLIF=
Indicates whether to suppress the NULLIF clause for the specified columns when a table is created in order to increase performance.
BL_USE_PIPE=
Specifies a named pipe for data transfer.
BL_WARNING_COUNT=
Specifies the maximum number of row warnings to allow before you abort the load operation.
BUFFERS=
Specifies the number of shared memory buffers to be used for transferring data from SAS to Teradata.
BULK_BUFFER=
Specifies the number of bulk rows that the SAS/ACCESS engine can buffer for output.
BULKLOAD=
Loads rows of data as one unit.
BULKUNLOAD=
Rapidly retrieves (fetches) a large number of rows from a data set.
CAST=
Specifies whether data conversions should be performed on the Teradata DBMS server or by SAS.
CAST_OVERHEAD_MAXPERCENT=
Specifies the overhead limit for data conversions to be performed in Teradata instead of SAS.
COMMAND_TIMEOUT=
Specifies the number of seconds to wait before a command times out.
CURSOR_TYPE=
Specifies the cursor type for read only and updatable cursors.
DBCOMMIT=
Causes an automatic COMMIT (a permanent writing of data to the DBMS) after a specified number of rows have been processed.
DBCONDITION=
Specifies criteria for subsetting and ordering DBMS data.
DBCREATE_TABLE_OPTS=
Specifies DBMS-specific syntax to be added to the CREATE TABLE statement.
DBFORCE=
Specifies whether to force the truncation of data during insert processing.
DBGEN_NAME=
Specifies how SAS renames columns automatically when they contain characters that SAS does not allow.
DBINDEX=
Detects and verifies that indexes exist on a DBMS table. If they do exist and are of the correct type, a join query that is passed to the DBMS might improve performance.
DBKEY=
Specifies a key column to optimize DBMS retrieval. Can improve performance when you are processing a join that involves a large DBMS table and a small SAS data set or DBMS table.
DBLABEL=
Specifies whether to use SAS variable labels or SAS variable names as the DBMS column names during output processing.
DBLINK=
Specifies a link from your default database to another database on the server to which you are connected in the Sybase interface; and specifies a link from your local database to database objects on another server in the Oracle interface.
DBMASTER=
Designates which table is the larger table when you are processing a join that involves tables from two different types of databases.
DBMAX_TEXT=
Determines the length of any very long DBMS character data type that is read into SAS or written from SAS when you are using a SAS/ACCESS engine.
DBNULL=
Indicates whether NULL is a valid value for the specified columns when a table is created.
DBNULLKEYS=
Controls the format of the WHERE clause with regard to NULL values when you use the DBKEY= data set option.
DBPROMPT=
Specifies whether SAS displays a window that prompts you to enter DBMS connection information.
DBSASLABEL=
Specifies how the engine returns column labels.
DBSASTYPE=
Specifies data types to override the default SAS data types during input processing.
DBSLICE=
Specifies user-supplied WHERE clauses to partition a DBMS query for threaded reads.
DBSLICEPARM=
Controls the scope of DBMS threaded reads and the number of DBMS connections.
DBTYPE=
Specifies a data type to use instead of the default DBMS data type when SAS creates a DBMS table.
DEGREE=
Determines whether DB2 uses parallelism.
DISTRIBUTE_ON=
Specifies a column name to use in the DISTRIBUTE ON clause of the CREATE TABLE statement.
ERRLIMIT=
Specifies the number of errors that are allowed before SAS stops processing and issues a rollback.
ESCAPE_BACKSLASH=
Specifies whether backslashes in literals are preserved during data copy from a SAS data set to a table.
IGNORE_READ_ONLY_COLUMNS=
Specifies whether to ignore or include columns whose data types are read-only when generating an SQL statement for inserts or updates.
IN=
Enables you to specify the database or tablespace in which you want to create a new table.
INSERT_SQL=
Determines the method that is used to insert rows into a data source.
INSERTBUFF=
Specifies the number of rows in a single DBMS insert.
KEYSET_SIZE=
Specifies the number of rows in the cursor that are keyset driven.
LOCATION=
Enables you to further specify exactly where a table resides.
LOCKTABLE=
Places exclusive or shared locks on tables.
MBUFFSIZE=
Specifies the size of the shared memory buffers to be used for transferring data from SAS to Teradata.
ML_CHECKPOINT=
Specifies the interval between checkpoint operations, in minutes.
ML_ERROR1=
Specifies the name of a temporary table that MultiLoad uses to track errors that were generated during the acquisition phase of a bulk-load operation.
ML_ERROR2=
Specifies the name of a temporary table that MultiLoad uses to track errors that were generated during the application phase of a bulk-load operation.
ML_LOG=
Specifies a prefix for the names of the temporary tables that MultiLoad uses during a bulk-load operation.
ML_RESTART=
Specifies the name of a temporary table that is used by MultiLoad to track checkpoint information.
ML_WORK=
Specifies the name of a temporary table that MultiLoad uses to store intermediate data.
MULTILOAD=
Specifies whether Teradata insert and append operations should use the Teradata MultiLoad utility.
MULTISTMT=
Specifies whether insert statements are to be sent to Teradata one at a time or in a group.
NULLCHAR=
Indicates how missing SAS character values are handled during insert, update, DBINDEX=, and DBKEY= processing.
NULLCHARVAL=
Defines the character string that replaces missing SAS character values during insert, update, DBINDEX=, and DBKEY= processing.
OR_PARTITION=
Allows reading, updating, and deleting from a particular partition in a partitioned table, as well as inserting and bulk loading into a particular partition in a partitioned table.
OR_UPD_NOWHERE=
Specifies whether SAS uses an extra WHERE clause when updating rows with no locking.
ORHINTS=
Specifies Oracle hints to pass to Oracle from a SAS statement or SQL procedure.
PRESERVE_COL_NAMES=
Preserves spaces, special characters, and case-sensitivity in DBMS column names when you create DBMS tables.
QUALIFIER=
Specifies the qualifier to use when you are reading database objects, such as DBMS tables and views.
QUERY_TIMEOUT=
Specifies the number of seconds of inactivity to wait before canceling a query.
READ_ISOLATION_LEVEL=
Specifies which level of read isolation locking to use when you are reading data.
READ_LOCK_TYPE=
Specifies how data in a DBMS table is locked during a read transaction.
READ_MODE_WAIT=
Specifies during SAS/ACCESS read operations whether Teradata waits to acquire a lock or fails your request when the DBMS resource is locked by a different user.
READBUFF=
Specifies the number of rows of DBMS data to read into the buffer.
SASDATEFMT=
Changes the SAS date format of a DBMS column.
SCHEMA=
Enables you to read a data source, such as a DBMS table and view, in the specified schema.
SEGMENT_NAME=
Enables you to control the segment in which you create a table.
SET=
Specifies whether duplicate rows are allowed when creating a table.
SLEEP=
Specifies the number of minutes that MultiLoad waits before it retries logging in to Teradata.
TENACITY=
Specifies how many hours MultiLoad continues to retry logging on to Teradata if the maximum number of Teradata utilities are already running.
TRAP151=
Enables columns that cannot be updated to be removed from a FOR UPDATE OF clause so updating of columns can proceed as normal.
UPDATE_ISOLATION_LEVEL=
Defines the degree of isolation of the current application process from other concurrently running application processes.
UPDATE_LOCK_TYPE=
Specifies how data in a DBMS table is locked during an update transaction.
UPDATE_MODE_WAIT=
Specifies during SAS/ACCESS update operations whether the DBMS waits to acquire a lock or fails your request when the DBMS resource is locked by a different user.
UPDATE_SQL=
Determines the method that is used to update and delete rows in a data source.
UPDATEBUFF=
Specifies the number of rows that are processed in a single DBMS update or delete operation.
CHAPTER
3 Formats Definition of Formats 84 Syntax 84 Using Formats 85 Ways to Specify Formats 85 PUT Statement 85 PUT Function 86 %SYSFUNC 86 FORMAT Statement 86 ATTRIB Statement 86 Permanent versus Temporary Association 87 User-Defined Formats 87 Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms 88 Definitions 88 How Bytes are Ordered Differently 88 Writing Data Generated on Big Endian or Little Endian Platforms 88 Integer Binary Notation and Different Programming Languages 89 Data Conversions and Encodings 89 Working with Packed Decimal and Zoned Decimal Data 90 Definitions 90 Types of Data 90 Packed Decimal Data 90 Zoned Decimal Data 91 Packed Julian Dates 91 Platforms Supporting Packed Decimal and Zoned Decimal Data 91 Languages Supporting Packed Decimal and Zoned Decimal Data 91 Summary of Packed Decimal and Zoned Decimal Formats and Informats 92 Working with Dates and Times Using the ISO 8601 Basic and Extended Notations 94 ISO 8601 Formatting Symbols 94 Writing ISO 8601 Date, Time, and Datetime Values 95 Writing ISO 8601 Duration, Datetime, and Interval Values 96 Duration, Datetime, and Interval Formats 96 Writing Omitted Components 97 Writing Truncated Duration, Datetime, and Interval Values 98 Normalizing Duration Components 98 Fractions in Durations, Datetime, and Interval Values 98 Formats by Category 99 Dictionary 108 $ASCIIw. Format 108 $BASE64Xw. Format 109 $BINARYw. Format 110 $CHARw. Format 111
$EBCDICw. Format 112 $HEXw. Format 113 $MSGCASEw. Format 114 $N8601Bw.d Format 115 $N8601BAw.d Format 116 $N8601Ew.d Format 117 $N8601EAw.d Format 119 $N8601EHw.d Format 120 $N8601EXw.d Format 121 $N8601Hw.d Format 123 $N8601Xw.d Format 124 $OCTALw. Format 125 $QUOTEw. Format 126 $REVERJw. Format 128 $REVERSw. Format 128 $UPCASEw. Format 129 $VARYINGw. Format 130 $w. Format 132 BESTw. Format 133 BESTDw.p Format 134 BINARYw. Format 136 B8601DAw. Format 137 B8601DNw. Format 138 B8601DTw.d Format 139 B8601DZw. Format 140 B8601LZw. Format 141 B8601TMw.d Format 143 B8601TZw. Format 144 COMMAw.d Format 146 COMMAXw.d Format 147 Dw.p Format 148 DATEw. Format 149 DATEAMPMw.d Format 150 DATETIMEw.d Format 152 DAYw. Format 154 DDMMYYw. Format 154 DDMMYYxw. Format 156 DOLLARw.d Format 158 DOLLARXw.d Format 159 DOWNAMEw. Format 161 DTDATEw. Format 162 DTMONYYw. Format 163 DTWKDATXw. Format 164 DTYEARw. Format 165 DTYYQCw. Format 166 Ew. Format 167 E8601DAw. Format 168 E8601DNw. Format 170 E8601DTw.d Format 171 E8601DZw. Format 172 E8601LZw. Format 173 E8601TMw.d Format 175 E8601TZw.d Format 176 FLOATw.d Format 178
FRACTw. Format 179 HEXw. Format 180 HHMMw.d Format 181 HOURw.d Format 183 IBw.d Format 184 IBRw.d Format 186 IEEEw.d Format 187 JULDAYw. Format 188 JULIANw. Format 189 MDYAMPMw.d Format 190 MMDDYYw. Format 192 MMDDYYxw. Format 194 MMSSw.d Format 196 MMYYw. Format 197 MMYYxw. Format 198 MONNAMEw. Format 200 MONTHw. Format 201 MONYYw. Format 202 NEGPARENw.d Format 203 NUMXw.d Format 204 OCTALw. Format 205 PDw.d Format 206 PDJULGw. Format 207 PDJULIw. Format 208 PERCENTw.d Format 210 PERCENTNw.d Format 211 PIBw.d Format 212 PIBRw.d Format 214 PKw.d Format 215 PVALUEw.d Format 216 QTRw. Format 217 QTRRw. Format 218 RBw.d Format 219 ROMANw. Format 220 S370FFw.d Format 221 S370FIBw.d Format 222 S370FIBUw.d Format 224 S370FPDw.d Format 225 S370FPDUw.d Format 226 S370FPIBw.d Format 227 S370FRBw.d Format 229 S370FZDw.d Format 230 S370FZDLw.d Format 231 S370FZDSw.d Format 232 S370FZDTw.d Format 233 S370FZDUw.d Format 234 SIZEKw.d Format 235 SIZEKBw.d Format 236 SIZEKMGw.d Format 238 SSNw. Format 239 TIMEw.d Format 240 TIMEAMPMw.d Format 241 TODw.d Format 243 VAXRBw.d Format 244
VMSZNw.d Format 245 w.d Format 247 WEEKDATEw. Format 248 WEEKDATXw. Format 249 WEEKDAYw. Format 250 WEEKUw. Format 251 WEEKVw. Format 253 WEEKWw. Format 255 WORDDATEw. Format 257 WORDDATXw. Format 258 WORDFw. Format 259 WORDSw. Format 260 YEARw. Format 261 YYMMw. Format 262 YYMMxw. Format 263 YYMMDDw. Format 265 YYMMDDxw. Format 266 YYMONw. Format 268 YYQw. Format 269 YYQxw. Format 270 YYQRw. Format 272 YYQRxw. Format 273 Zw.d Format 275 ZDw.d Format 276 Formats Documented in Other SAS Publications 277 SAS National Language Support (NLS): Reference Guide
277
Definition of Formats
A format is an instruction that SAS uses to write data values. You use formats to control the written appearance of data values, or, in some cases, to group data values together for analysis. For example, the WORDS22. format, which converts numeric values to their equivalent in words, writes the numeric value 692 as six hundred ninety-two.
Syntax
SAS formats have the following form:
<$>format<w>.<d>
where
$
indicates a character format; its absence indicates a numeric format.
format
names the format. The format is a SAS format or a user-defined format that was previously defined with the VALUE statement in PROC FORMAT. For more information on user-defined formats, see “The FORMAT Procedure” in Base SAS Procedures Guide.
w
specifies the format width, which for most formats is the number of columns in the output data.
d
specifies an optional decimal scaling factor in the numeric formats.
Formats always contain a period (.) as a part of the name. If you omit the w and the d values from the format, SAS uses default values. The d value that you specify with a format tells SAS to display that many decimal places. Formats never change or truncate the internally stored data values.
For example, in DOLLAR10.2, the w value of 10 specifies a maximum of 10 columns for the value. The d value of 2 specifies that two of these columns are for the decimal part of the value, which leaves eight columns for all the remaining characters in the value. The remaining columns include the decimal point, the remaining numeric value, a minus sign if the value is negative, the dollar sign, and commas, if any.
If the format width is too narrow to represent a value, SAS tries to squeeze the value into the space available. Character formats truncate values on the right. Numeric formats sometimes revert to the BESTw.d format. SAS prints asterisks if you do not specify an adequate width. In the following example, the result is x=**.
x=123;
put x= 2.;
If you use an incompatible format, such as using a numeric format to write character values, SAS first attempts to use an analogous format of the other type. If this attempt fails, an error message that describes the problem appears in the SAS log.
When the value of d is greater than fifteen, the precision of the decimal value after the 15th decimal place might not be accurate.
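The width and decimal rules above can be tried in a short DATA step. This is a minimal sketch; the variable names and values are chosen only for illustration.

```sas
data _null_;
   x = 1234.567;
   /* w=10 columns in total, d=2 of them for the decimal part */
   put x dollar10.2;      /* writes $1,234.57 */
   /* the format changed only the written value, not the stored one */
   put x best12.;         /* writes 1234.567 */
   /* a width of 2 cannot hold 123, so asterisks are written */
   y = 123;
   put y= 2.;             /* writes y=** */
run;
```

Rerunning the last PUT statement with a width of 3 or more writes the value itself.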
Using Formats
Ways to Specify Formats
You can use formats in the following ways:
- in a PUT statement
- with the PUT, PUTC, or PUTN functions
- with the %SYSFUNC macro function
- in a FORMAT statement in a DATA step or a PROC step
- in an ATTRIB statement in a DATA step or a PROC step.
PUT Statement
The PUT statement with a format after the variable name uses a format to write data values in a DATA step. For example, this PUT statement uses the DOLLARw.d format to write the numeric value for AMOUNT as a dollar amount:
amount=1145.32;
put amount dollar10.2;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
See “PUT Statement” on page 1656 for more information.
PUT Function
The PUT function converts a numeric variable, a character variable, or a constant using any valid format and returns the resulting character value. For example, the following statement converts the value of a numeric variable into a two-character hexadecimal representation:
num=15;
char=put(num,hex2.);
The PUT function returns a value of 0F, which is assigned to the variable CHAR. The PUT function is useful for converting a numeric value to a character value. See “PUT Function” on page 1021 for more information.
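The PUTN and PUTC functions named in the list of ways to specify formats differ from PUT in that the format is supplied as a character value at run time. The following sketch contrasts the three; the variable names are illustrative only.

```sas
data _null_;
   num = 15;
   /* PUT: the format is written directly in the function call */
   char1 = put(num, hex2.);
   /* PUTN: a numeric format whose name is held in a character variable */
   fmt = 'hex2.';
   char2 = putn(num, fmt);
   /* PUTC: the same idea for a character format */
   char3 = putc('abc', '$upcase3.');
   put char1= char2= char3=;   /* writes char1=0F char2=0F char3=ABC */
run;
```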
%SYSFUNC
The %SYSFUNC (or %QSYSFUNC) macro function executes SAS functions or user-defined functions and applies an optional format to the result of the function outside a DATA step. For example, the following program writes a numeric value in a macro variable as a dollar amount.
%macro tst(amount);
   %put %sysfunc(putn(&amount,dollar10.2));
%mend tst;
%tst (1154.23);
For more information, see SAS Macro Language: Reference.
FORMAT Statement
The FORMAT statement permanently associates a format with a variable. SAS uses the format to write the values of the variable that you specify. For example, the following statement in a DATA step associates the COMMAw.d numeric format with the variables SALES1 through SALES3:
format sales1-sales3 comma10.2;
Because the FORMAT statement permanently associates a format with a variable, any subsequent DATA step or PROC step uses COMMA10.2 to write the values of SALES1, SALES2, and SALES3. See “FORMAT Statement” on page 1526 for more information.
Note: If you assign formats with a FORMAT statement before a PUT statement, all leading blanks are trimmed. Formats that are associated with variables by using a FORMAT statement behave like formats that are used with a colon (:) modifier in a subsequent PUT statement. For details about using the colon format modifier, see “PUT Statement, List” on page 1678.
ATTRIB Statement
The ATTRIB statement can also associate a format, as well as other attributes, with one or more variables. For example, the following ATTRIB statement permanently associates the COMMAw.d format with the variables SALES1 through SALES3:
attrib sales1-sales3 format=comma10.2;
Because the ATTRIB statement permanently associates a format with a variable, any subsequent DATA step or PROC step uses COMMA10.2 to write the values of SALES1, SALES2, and SALES3. See “ATTRIB Statement” on page 1400 for more information.
Permanent versus Temporary Association
When you specify a format in a PUT statement, SAS uses the format to write data values during the DATA step but does not permanently associate the format with a variable. To permanently associate a format with a variable, use a FORMAT statement or an ATTRIB statement in a DATA step. SAS permanently associates a format with the variable by modifying the descriptor information in the SAS data set.
Using a FORMAT statement or an ATTRIB statement in a PROC step associates a format with a variable for that PROC step, as well as for any output data sets that the procedure creates that contain formatted variables. For more information on using formats in SAS procedures, see Base SAS Procedures Guide.
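The difference between the two kinds of association can be seen in a small program. This is a sketch only; the data set name WORK.SALES and its single variable are assumptions made for the example.

```sas
data work.sales;
   sales1 = 1145.32;
   /* temporary: DOLLAR10.2 applies only to this PUT statement */
   put sales1 dollar10.2;
   /* permanent: COMMA10.2 is stored in the descriptor of WORK.SALES */
   format sales1 comma10.2;
run;

/* the descriptor information now lists COMMA10.2 for SALES1 */
proc contents data=work.sales;
run;

/* a FORMAT statement in a PROC step applies to that step only */
proc print data=work.sales;
   format sales1 dollar12.2;
run;
```

A later PROC PRINT without a FORMAT statement would again use COMMA10.2, because that is the format stored with the data set.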
User-Defined Formats
In addition to the formats that are supplied with Base SAS software, you can create your own formats. In Base SAS software, PROC FORMAT allows you to create your own formats for both character and numeric variables. For more information, see “The FORMAT Procedure” in Base SAS Procedures Guide.
When you execute a SAS program that uses user-defined formats, these formats should be available. The two ways to make these formats available are
- to create permanent, not temporary, formats with PROC FORMAT
- to store the source code that creates the formats (the PROC FORMAT step) with the SAS program that uses them.
To create permanent SAS formats, see “The FORMAT Procedure” in Base SAS Procedures Guide.
If you execute a program that cannot locate a user-defined format, the result depends on the setting of the FMTERR system option. If the user-defined format is not found, then these system options produce these results:
| System Option | Results |
|---|---|
| FMTERR | SAS produces an error that causes the current DATA or PROC step to stop. |
| NOFMTERR | SAS continues processing and substitutes a default format, usually the BESTw. or $w. format. |
Although using NOFMTERR enables SAS to process a variable, you lose the information that the user-defined format supplies. To avoid problems, make sure that your program has access to all user-defined formats that are used.
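As an illustration of the first approach, the following sketch stores a user-defined format in a permanent library and makes it available to a later step. The libref MYFMTS, its path, and the AGEGRP format are hypothetical names introduced for this example; substitute a location that exists at your site.

```sas
/* hypothetical library location for the permanent format catalog */
libname myfmts '/path/to/formats';

/* LIBRARY= stores the format permanently instead of in WORK */
proc format library=myfmts;
   value agegrp
      low -< 18 = 'Minor'
      18 - high = 'Adult';
run;

/* tell SAS to search MYFMTS when it looks for formats */
options fmtsearch=(myfmts work);

data _null_;
   age = 42;
   put age agegrp.;   /* writes Adult */
run;

/* optional: continue with a default format if a format cannot be found */
options nofmterr;
```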
Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms
Definitions
Integer values for binary integer data are typically stored in one of three sizes: one-byte, two-byte, or four-byte. The ordering of the bytes for the integer varies depending on the platform (operating environment) on which the integers were produced.
The ordering of bytes differs between the “big endian” and “little endian” platforms. These colloquial terms are used to describe byte ordering for IBM mainframes (big endian) and for Intel-based platforms (little endian). In the SAS System, the following platforms are considered big endian: AIX, HP-UX, IBM mainframe, Macintosh, and Solaris on SPARC. The following platforms are considered little endian: Intel ABI, Linux, OpenVMS Alpha, OpenVMS Integrity, Solaris on x64, Tru64 UNIX, and Windows.
How Bytes are Ordered Differently
On big endian platforms, the value 1 is stored in binary and is represented here in hexadecimal notation. One byte is stored as 01, two bytes as 00 01, and four bytes as 00 00 00 01. On little endian platforms, the value 1 is stored in one byte as 01 (the same as big endian), in two bytes as 01 00, and in four bytes as 01 00 00 00.
If an integer is negative, the “two’s complement” representation is used. The high-order bit of the most significant byte of the integer will be set on. For example, -2 would be represented in one, two, and four bytes on big endian platforms as FE, FF FE, and FF FF FF FE respectively. On little endian platforms, the representation would be FE, FE FF, and FE FF FF FF. These representations result from the output of the integer binary value -2 expressed in hexadecimal representation.
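The two-byte representations described above can be reproduced on any platform by writing the value with a format that fixes the byte order and then displaying the bytes with $HEXw. The following is a minimal sketch; the variable names BE and LE are illustrative.

```sas
data _null_;
   length be le $2;
   do value = 1, -2;
      be = put(value, s370fib2.);   /* big endian, two bytes    */
      le = put(value, ibr2.);       /* little endian, two bytes */
      put value= be= $hex4. le= $hex4.;
   end;
run;
/* writes:
   value=1 be=0001 le=0100
   value=-2 be=FFFE le=FEFF
*/
```

S370FIBw.d and IBRw.d are used here because their byte order does not depend on the platform that runs the program.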
Writing Data Generated on Big Endian or Little Endian Platforms
SAS can read signed and unsigned integers regardless of whether they were generated on a big endian or a little endian system. Likewise, SAS can write signed and unsigned integers in both big endian and little endian format. The length of these integers can be up to eight bytes.
The following table shows which format to use for various combinations of platforms. In the Signed Integer column, “no” indicates that the number is unsigned and cannot be negative. “Yes” indicates that the number can be either negative or positive.
Table 3.1 SAS Formats and Byte Ordering
| Platform For Which the Data Was Created | Platform That Writes the Data | Signed Integer | Format |
|---|---|---|---|
| big endian | big endian | yes | IB or S370FIB |
| big endian | big endian | no | PIB, S370FPIB, S370FIBU |
| big endian | little endian | yes | S370FIB |
| big endian | little endian | no | S370FPIB |
| little endian | big endian | yes | IBR |
| little endian | big endian | no | PIBR |
| little endian | little endian | yes | IB or IBR |
| little endian | little endian | no | PIB or PIBR |
| big endian | either | yes | S370FIB |
| big endian | either | no | S370FPIB |
| little endian | either | yes | IBR |
| little endian | either | no | PIBR |
Integer Binary Notation and Different Programming Languages
The following table compares integer binary notation according to programming language.
Table 3.2 Integer Binary Notation and Programming Languages
| Language | 2 Bytes | 4 Bytes |
|---|---|---|
| SAS | IB2., IBR2., PIB2., PIBR2., S370FIB2., S370FIBU2., S370FPIB2. | IB4., IBR4., PIB4., PIBR4., S370FIB4., S370FIBU4., S370FPIB4. |
| PL/I | FIXED BIN(15) | FIXED BIN(31) |
| FORTRAN | INTEGER*2 | INTEGER*4 |
| COBOL | COMP PIC 9(4) | COMP PIC 9(8) |
| IBM assembler | H | F |
| C | short | long |
Data Conversions and Encodings
An encoding maps each character in a character set to a unique numeric representation, resulting in a table of all code points. A single character can have different numeric representations in different encodings. For example, the ASCII encoding for the dollar symbol $ is 24 hexadecimal. The Danish EBCDIC encoding for the dollar symbol $ is 67 hexadecimal. In order for a version of SAS that normally uses ASCII to properly interpret a data set that is encoded in Danish EBCDIC, the data must be transcoded.
Transcoding is the process of moving data from one encoding to another. When transcoding the ASCII dollar sign to the Danish EBCDIC dollar sign, the hexadecimal representation for the character is converted from the value 24 to a 67.
If you want to know the encoding of a particular SAS data set, for SAS 9 and above follow these steps:
1. Locate the data set with SAS Explorer.
2. Right-click the data set.
3. Select Properties from the menu.
4. Click the Details tab.
5. The encoding of the data set is listed, along with other information.
Some situations where data might commonly be transcoded are:
- when you share data between two different SAS sessions that are running in different locales or in different operating environments,
- when you perform text-string operations, such as converting to uppercase or lowercase,
- when you display or print characters from another language,
- when you copy and paste data between SAS sessions running in different locales.
For more information on SAS features designed to handle data conversions from different encodings or operating environments, see SAS National Language Support (NLS): Reference Guide.
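The encoding information discussed in this section can also be inspected from a program rather than through SAS Explorer. The following sketch assumes that the SYSENCODING automatic macro variable and the SASHELP.CLASS sample data set are available at your site.

```sas
/* encoding of the current SAS session */
%put Session encoding: &sysencoding;

/* PROC CONTENTS reports the encoding of the data set with its other attributes */
proc contents data=sashelp.class;
run;

/* code point of the dollar sign in the session encoding, in hexadecimal */
data _null_;
   code = put('$', $hex2.);
   put code=;   /* 24 in an ASCII-based session, as described above */
run;
```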
Working with Packed Decimal and Zoned Decimal Data
Definitions
Packed decimal
specifies a method of encoding decimal numbers by using each byte to represent two decimal digits. Packed decimal representation stores decimal data with exact precision. The fractional part of the number is determined by the informat or format because there is no separate mantissa and exponent. An advantage of using packed decimal data is that exact precision can be maintained. However, computations involving decimal data might become inexact due to the lack of native instructions.
Zoned decimal
specifies a method of encoding decimal numbers in which each digit requires one byte of storage. The last byte contains the number’s sign as well as the last digit. Zoned decimal data produces a printable representation.
Nibble
specifies 1/2 of a byte.
Types of Data
Packed Decimal Data
A packed decimal representation stores decimal digits in each “nibble” of a byte. Each byte has two nibbles, and each nibble is indicated by a hexadecimal character. For example, the value 15 is stored in two nibbles, using the hexadecimal characters 1 and 5.
The sign indication is dependent on your operating environment. On IBM mainframes, the sign is indicated by the last nibble. With formats, C indicates a positive value, and D indicates a negative value. With informats, A, C, E, and F indicate positive values, and B and D indicate negative values. Any other nibble is invalid for signed packed decimal data. In all other operating environments, the sign is indicated in its own byte. If the high-order bit is 1, then the number is negative. Otherwise, it is positive.
The following applies to packed decimal data representation:
- You can use the S370FPD format on all platforms to obtain the IBM mainframe configuration.
- You can have unsigned packed data with no sign indicator. The packed decimal format and informat handle the representation. It is consistent between ASCII and EBCDIC platforms.
- Note that the S370FPDU format and informat expect to have an F in the last nibble, while packed decimal expects no sign nibble.
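The sign nibbles described above can be made visible by writing a value with the S370 packed decimal formats and displaying the resulting bytes in hexadecimal. This is a small sketch; the variable names are illustrative.

```sas
data _null_;
   length pd pdu $2;
   do value = 15, -15;
      pd = put(value, s370fpd2.);   /* last nibble C (positive) or D (negative) */
      put value= pd= $hex4.;
   end;
   pdu = put(15, s370fpdu2.);       /* unsigned: last nibble F */
   put pdu= $hex4.;
run;
/* writes:
   value=15 pd=015C
   value=-15 pd=015D
   pdu=015F
*/
```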
Zoned Decimal Data
The following applies to zoned decimal data representation:
- A zoned decimal representation stores a decimal digit in the low order nibble of each byte. For all but the byte containing the sign, the high-order nibble is the numeric zone nibble (F on EBCDIC and 3 on ASCII).
- The sign can be merged into a byte with a digit, or it can be separate, depending on the representation. But the standard zoned decimal format and informat expects the sign to be merged into the last byte.
- The EBCDIC and ASCII zoned decimal formats produce the same printable representation of numbers. There are two nibbles per byte, each indicated by a hexadecimal character. For example, the value 15 is stored in two bytes. The first byte contains the hexadecimal value F1 and the second byte contains the hexadecimal value C5.
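The F1 C5 example above corresponds to the IBM mainframe (EBCDIC) zoned decimal representation, which the S370FZD format writes on any platform. The following sketch displays those bytes for a positive and a negative value; the variable name ZD is illustrative.

```sas
data _null_;
   length zd $2;
   do value = 15, -15;
      zd = put(value, s370fzd2.);   /* sign merged into the upper nibble of the last byte */
      put value= zd= $hex4.;
   end;
run;
/* writes:
   value=15 zd=F1C5
   value=-15 zd=F1D5
*/
```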
Packed Julian Dates
The following applies to packed Julian dates:
- The two formats and informats that handle Julian dates in packed decimal representation are PDJULI and PDJULG. PDJULI uses the IBM mainframe year computation, while PDJULG uses the Gregorian computation.
- The IBM mainframe computation considers 1900 to be the base year, and the year values in the data indicate the offset from 1900. For example, 98 means 1998, 100 means 2000, and 102 means 2002. 1998 would mean 3898.
- The Gregorian computation allows for 2-digit or 4-digit years. If you use 2-digit years, SAS uses the setting of the YEARCUTOFF= system option to determine the true year.
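The two year computations can be compared by writing the same SAS date value with both formats. This is a sketch under the rules stated above: March 17, 2005 is day 076 of the year, and 2005 is offset 105 from the base year 1900. The variable names are illustrative.

```sas
data _null_;
   length ibm greg $4;
   date = '17mar2005'd;
   ibm  = put(date, pdjuli4.);   /* IBM computation: year stored as an offset from 1900 */
   greg = put(date, pdjulg4.);   /* Gregorian computation: four-digit year              */
   put ibm= $hex8. greg= $hex8.;
run;
/* expected under the rules above:
   ibm=0105076F greg=2005076F
*/
```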
Platforms Supporting Packed Decimal and Zoned Decimal Data
Some platforms have native instructions to support packed and zoned decimal data, while others must use software to emulate the computations. For example, the IBM mainframe has an Add Pack instruction to add packed decimal data, but the Intel-based platforms have no such instruction and must convert the decimal data into some other format.
Languages Supporting Packed Decimal and Zoned Decimal Data
Several languages support packed decimal and zoned decimal data. The following table shows how COBOL picture clauses correspond to SAS formats and informats.
| IBM VS COBOL II clauses | Corresponding S370Fxxx formats/informats |
|---|---|
| PIC S9(X) PACKED-DECIMAL | S370FPDw. |
| PIC 9(X) PACKED-DECIMAL | S370FPDUw. |
| PIC S9(W) DISPLAY | S370FZDw. |
| PIC 9(W) DISPLAY | S370FZDUw. |
| PIC S9(W) DISPLAY SIGN LEADING | S370FZDLw. |
| PIC S9(W) DISPLAY SIGN LEADING SEPARATE | S370FZDSw. |
| PIC S9(W) DISPLAY SIGN TRAILING SEPARATE | S370FZDTw. |
For the packed decimal representation listed above, X indicates the number of digits represented, and W is the number of bytes. For PIC S9(X) PACKED-DECIMAL, W is ceil((X+1)/2). For PIC 9(X) PACKED-DECIMAL, W is ceil(X/2). For example, PIC S9(5) PACKED-DECIMAL represents five digits. If a sign is included, six nibbles are needed. ceil((5+1)/2) is 3, so the field has a length of three bytes and the value of W is 3. Note that you can substitute COMP-3 for PACKED-DECIMAL.
In IBM assembly language, the P directive indicates packed decimal, and the Z directive indicates zoned decimal. The following shows an excerpt from an assembly language listing, showing the offset, the value, and the DC statement:
offset    value (in hex)   inst   label   directive
+000000   00001C           2      PEX1    DC PL3'1'
+000003   00001D           3      PEX2    DC PL3'-1'
+000006   F0F0C1           4      ZEX1    DC ZL3'1'
+000009   F0F0D1           5      ZEX2    DC ZL3'-1'
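The width rule and the hexadecimal values in the listing can be reproduced with a short Python sketch. This is an illustration only, not SAS or assembler code; the function names are hypothetical, and the zoned encoding assumes EBCDIC digits (zone F) with the sign carried in the zone of the last byte.

```python
import math


def packed_width(digits: int, signed: bool = True) -> int:
    """Bytes needed for PIC S9(X) (signed) or PIC 9(X) (unsigned) PACKED-DECIMAL."""
    return math.ceil((digits + 1) / 2) if signed else math.ceil(digits / 2)


def pack_decimal(value: int, width: int) -> bytes:
    """Encode an integer as signed packed decimal: digit nibbles, then C or D."""
    sign = "D" if value < 0 else "C"
    digits = str(abs(value)).rjust(2 * width - 1, "0")
    if len(digits) > 2 * width - 1:
        raise ValueError("value does not fit in the requested width")
    return bytes.fromhex(digits + sign)


def zone_decimal(value: int, width: int) -> bytes:
    """Encode an integer as zoned decimal with the sign in the last zone nibble."""
    digits = str(abs(value)).rjust(width, "0")
    if len(digits) > width:
        raise ValueError("value does not fit in the requested width")
    zones = ["F"] * width
    zones[-1] = "D" if value < 0 else "C"
    return bytes.fromhex("".join(z + d for z, d in zip(zones, digits)))


print(packed_width(5))                      # 3
print(pack_decimal(1, 3).hex().upper())     # 00001C
print(pack_decimal(-1, 3).hex().upper())    # 00001D
print(zone_decimal(1, 3).hex().upper())     # F0F0C1
print(zone_decimal(-1, 3).hex().upper())    # F0F0D1
```

The four encoded values correspond to the PL3 and ZL3 constants for 1 and -1.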
In PL/I, the FIXED DECIMAL attribute is used in conjunction with packed decimal data. You must use the PICTURE specification to represent zoned decimal data. There is no standardized representation of decimal data for the FORTRAN or the C languages.
Summary of Packed Decimal and Zoned Decimal Formats and Informats
SAS uses a group of formats and informats to handle packed and zoned decimal data. The following table lists the type of data representation for these formats and informats. Note that the formats and informats that begin with S370 refer to IBM mainframe representation.
Format     Type of data representation   Corresponding informat   Comments
PD         Packed decimal                PD                       Local signed packed decimal
PK         Packed decimal                PK                       Unsigned packed decimal; not specific to your operating environment
ZD         Zoned decimal                 ZD                       Local zoned decimal
none       Zoned decimal                 ZDB                      Translates EBCDIC blank (hexadecimal 40) to EBCDIC zero (hexadecimal F0); corresponds to the informat as zoned decimal
none       Zoned decimal                 ZDV                      Non-IBM zoned decimal representation
S370FPD    Packed decimal                S370FPD                  Last nibble C (positive) or D (negative)
S370FPDU   Packed decimal                S370FPDU                 Last nibble always F (positive)
S370FZD    Zoned decimal                 S370FZD                  Last byte contains sign in upper nibble: C (positive) or D (negative)
S370FZDU   Zoned decimal                 S370FZDU                 Unsigned; sign nibble always F
S370FZDL   Zoned decimal                 S370FZDL                 Sign nibble in first byte in informat; separate leading sign byte of hexadecimal C0 (positive) or D0 (negative) in format
S370FZDS   Zoned decimal                 S370FZDS                 Leading sign of - (hexadecimal 60) or + (hexadecimal 4E)
S370FZDT   Zoned decimal                 S370FZDT                 Trailing sign of - (hexadecimal 60) or + (hexadecimal 4E)
PDJULI     Packed decimal                PDJULI                   Julian date in packed representation - IBM computation
PDJULG     Packed decimal                PDJULG                   Julian date in packed representation - Gregorian computation
none       Packed decimal                RMFDUR                   Input layout is: mmsstttF
none       Packed decimal                SHRSTAMP                 Input layout is: yyyydddFhhmmssth, where yyyydddF is the packed Julian date; yyyy is a 0-based year from 1900
none       Packed decimal                SMFSTAMP                 Input layout is: xxxxxxxxyyyydddF, where yyyydddF is the packed Julian date; yyyy is a 0-based year from 1900
none       Packed decimal                PDTIME                   Input layout is: 0hhmmssF
none       Packed decimal                RMFSTAMP                 Input layout is: 0hhmmssFyyyydddF, where yyyydddF is the packed Julian date; yyyy is a 0-based year from 1900
Working with Dates and Times Using the ISO 8601 Basic and Extended Notations
ISO 8601 Formatting Symbols
The following list explains the formatting symbols that are used to notate the ISO 8601 date, time, datetime, duration, and interval values:
n
specifies a number that represents the number of years, months, or days
P
indicates that the duration that follows is specified by the number of years, months, days, hours, minutes, and seconds
T
indicates that a time value follows. Any value with a time must begin with T.
Requirement: Time values that are read by the extended notation informats that begin with the characters E8601 must use an uppercase T.
W
indicates that the duration is specified in weeks.
Z
indicates that the time value is the time in Greenwich, England, or UTC time.
+|-
the + indicates the time zone offset to the east of Greenwich, England. The - indicates the time zone offset to the west of Greenwich, England.
yyyy
specifies a four-digit year
mm
as part of a date, specifies a two-digit month, 01 - 12
dd
specifies a two-digit day, 01 - 31
hh
specifies a two-digit hour, 00 - 24
mm
as part of a time, specifies a two-digit minute, 00 - 59
ss
specifies a two-digit second, 00 - 59
fff | ffffff
specifies an optional fraction of a second using the digits 0 - 9:
fff
use 1 - 3 digits for values read by the $N8601B informat and the $N8601E informat
ffffff
use 1 - 6 digits for informats other than the $N8601B informat and the $N8601E informat
Y
indicates that a year value precedes this character in a duration
M
as part of a date, indicates that a month value precedes this character in a duration
D
indicates that a day value precedes this character in a duration
H
indicates that an hour value precedes this character in a duration
M
as part of a time, indicates that a minute value precedes this character in a duration
S
indicates that a seconds value precedes this character in a duration
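To show how the duration symbols P, T, W, Y, M, D, H, and S combine, the following Python sketch splits a duration in the form PnYnMnDTnHnMnS or PnW into its components. It is an illustration under stated assumptions, not the parsing that SAS informats perform: the pattern and the function name are hypothetical, letters are matched without regard to case, and fractions are not handled.

```python
import re

# An M before the T is a month count; an M after the T is a minute count.
DURATION = re.compile(
    r"^(?P<sign>-)?P"
    r"(?:(?P<weeks>\d+)W"
    r"|(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?)$",
    re.IGNORECASE,
)


def parse_duration(text: str) -> dict:
    """Return the components that are present in a PnYnMnDTnHnMnS or PnW duration."""
    match = DURATION.match(text)
    if match is None:
        raise ValueError(f"not a recognized duration: {text!r}")
    parts = {
        name: int(value)
        for name, value in match.groupdict().items()
        if name != "sign" and value is not None
    }
    if not parts:
        raise ValueError(f"duration has no components: {text!r}")
    parts["negative"] = match.group("sign") is not None
    return parts


print(parse_duration("P2y10m14dT20h13m45s"))
print(parse_duration("P6w"))
print(parse_duration("-P2mT4H"))
```

Components that are omitted from the input are simply absent from the result, which mirrors the way omitted components are dropped in this duration form.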
Writing ISO 8601 Date, Time, and Datetime Values
SAS uses the formats in the following table to write date, time, and datetime values in the ISO 8601 basic and extended notations from SAS date, time, and datetime values.
Table 3.3 Formats for Writing ISO 8601 Dates, Times, and Datetimes
Date, Time, or Datetime                ISO 8601 Notation                    Example                     Format
Basic Notations
Date                                   yyyymmdd                             20080915                    B8601DAw.
Time                                   hhmmssffffff                         155300322348                B8601TMw.d
Time with time zone                    hhmmss+|-hhmm                        155300+0500                 B8601TZw.d
                                       hhmmssZ                              155300Z                     B8601TZw.d
Convert to local time with time zone   hhmmss+|-hhmm                        155300+0500                 B8601LZw.d
Datetime                               yyyymmddThhmmssffffff                20080915T155300             B8601DTw.d
Datetime with time zone                yyyymmddThhmmss+|-hhmm               20080915T155300+0500        B8601DZw.d
                                       yyyymmddThhmmssZ                     20080915T155300Z            B8601DZw.d
Write the date from a datetime         yyyymmdd                             20080915                    B8601DNw.
Extended Notations
Date                                   yyyy-mm-dd                           2008-09-15                  E8601DAw.
Time                                   hh:mm:ss.ffffff                      15:53:00.322348             E8601TMw.d
Time with time zone                    hh:mm:ss.ffffff+|-hh:mm              15:53:00+05:00              E8601TZw.d
Convert to local time with time zone   hh:mm:ss.ffffff+|-hh:mm              15:53:00+05:00              E8601LZw.d
Datetime                               yyyy-mm-ddThh:mm:ss.ffffff           2008-09-15T15:53:00         E8601DTw.d
Datetime with time zone                yyyy-mm-ddThh:mm:ss.ffffff+|-hh:mm   2008-09-15T15:53:00+05:00   E8601DZw.d
Write the date from a datetime         yyyy-mm-dd                           2008-09-15                  E8601DNw.
An asterisk ( * ) is used in place of a formatted date or time value that is out of range.
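The difference between the basic and extended notations can be seen by writing the same value both ways. The following Python sketch uses the standard datetime module to reproduce the notations in the examples; it is not SAS code, it does not perform the UTC or local-time adjustments that the SAS formats make, and the helper names are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone


def iso_basic(dt: datetime) -> str:
    """Write a datetime in the basic notation yyyymmddThhmmss."""
    return dt.strftime("%Y%m%dT%H%M%S")


def iso_extended(dt: datetime) -> str:
    """Write a datetime in the extended notation yyyy-mm-ddThh:mm:ss."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S")


def offset_text(dt: datetime, extended: bool) -> str:
    """Write the time zone offset as +|-hhmm (basic) or +|-hh:mm (extended)."""
    if dt.tzinfo is None:
        raise ValueError("a time zone offset requires an aware datetime")
    raw = dt.strftime("%z")                  # for example "+0500"
    return raw[:3] + ":" + raw[3:] if extended else raw


stamp = datetime(2008, 9, 15, 15, 53, 0, tzinfo=timezone(timedelta(hours=5)))

print(iso_basic(stamp))                                   # 20080915T155300
print(iso_extended(stamp))                                # 2008-09-15T15:53:00
print(iso_basic(stamp) + offset_text(stamp, False))       # 20080915T155300+0500
print(iso_extended(stamp) + offset_text(stamp, True))     # 2008-09-15T15:53:00+05:00
print(time(15, 53, 0, 322348).strftime("%H%M%S%f"))       # 155300322348
print(time(15, 53, 0, 322348).strftime("%H:%M:%S.%f"))    # 15:53:00.322348
```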
Writing ISO 8601 Duration, Datetime, and Interval Values
Duration, Datetime, and Interval Formats
SAS writes duration, datetime, and interval values from character data using these formats:
Time Component and ISO 8601 Notation              Example                                     Format
Duration - Basic Notation
  PyyyymmddThhmmssfff                             P20080915T155300                            $N8601BA
  -PyyyymmddThhmmssfff                            -P20080915T155300                           $N8601BA
Duration - Extended Notation
  Pyyyy-mm-ddThh:mm:ss.fff                        P2008-09-15T15:53:00                        $N8601EA
  -Pyyyy-mm-ddThh:mm:ss.fff                       -P2008-09-15T15:53:00                       $N8601EA
Duration - Basic and Extended Notation
  PnYnMnDTnHnMnS                                  P2y10m14dT20h13m45s                         $N8601B $N8601E
  -PnYnMnDTnHnMnS                                 -P2y10m14dT20h13m45s                        $N8601B $N8601E
  PnW (weeks)                                     P6w                                         $N8601B $N8601E
Interval - Basic Notation (formats: $N8601B, $N8601BA)
  yyyymmddThhmmssfff/yyyymmddThhmmssfff           20080915T155300/20101113T000000
  PnYnMnDTnHnMnS/yyyymmddThhmmssfff               P2y10M14dT20h13m45s/20080915T155300
  yyyymmddThhmmssfff/PnYnMnDTnHnMnS               20080915T155300/P2y10M14dT20h13m45s
Interval - Extended Notation (formats: $N8601E, $N8601EA)
  yyyy-mm-ddThh:mm:ss.fff/yyyy-mm-ddThh:mm:ss.fff 2008-09-15T15:53:00/2010-11-13T00:00:00
  PnYnMnDTnHnMnS/yyyy-mm-ddThh:mm:ss.fff          P2y10M14dT20h13m45s/2008-09-15T15:53:00
  yyyy-mm-ddThh:mm:ss.fff/PnYnMnDTnHnMnS          2008-09-15T15:53:00/P2y10M14dT20h13m45s
Datetime - Basic Notation
  yyyymmddThhmmss.fff+|-hhmm                      20080915T155300                             $N8601BA
  (all blank)                                                                                 $N8601B $N8601BA $N8601E $N8601EA
Datetime - Extended Notation
  yyyy-mm-ddThh:mm:ss.fff+|-hhmm                  2008-09-15T15:53:00+04:30                   $N8601EA
  (all blank)                                                                                 $N8601B $N8601BA $N8601E $N8601EA
Writing Omitted Components
An omitted component can be represented by a hyphen ( - ) or an x in the extended datetime form yyyy-mm-ddThh:mm:ss and in the extended duration form Pyyyy-mm-ddThh:mm:ss. Omitted components in the duration form PnYnMnDTnHnMnS are dropped; they do not contain a hyphen or an x. For example, P2mT4H. The following formats write omitted components that use the hyphen and the x:
Format     Datetime Form         Duration Form          Examples
$N8601H    yyyy-mm-ddThh:mm:ss   PnYnMnDTnHnMnS         --09-15T15:-:53
                                                        P2Y2DT4H5M6S/--09-15T15:-:00
$N8601EH   yyyy-mm-ddThh:mm:ss   Pyyyy-mm-ddThh:mm:ss   P0003---02T02:55:20/2008---15T-:-:45
$N8601X    yyyy-mm-ddThh:mm:ss   PnYnMnDTnHnMnS         P2Y2DT4H5M6S/x-09-15T15:x:00
$N8601EX   yyyy-mm-ddThh:mm:ss   Pyyyy-mm-ddThh:mm:ss   P0003-x-02T02:55:20/2008-x-15Tx:x:45
Datetime values with omitted components that are formatted with either the $N8601B format or the $N8601BA format are formatted in the extended notation using the hyphen for omitted components to ensure accurate data. For example, when the month is an omitted component, the value 2008---15 is written and not 2008-15. The extended notation with hyphens is also used in place of the basic notation if a duration is formatted by using the $N8601BA format. Using the same date, P2008---15 is written and not P2008-15.
Writing Truncated Duration, Datetime, and Interval Values
Duration, datetime, or interval values can be truncated when one or more lower order values are 0 or are not significant. When SAS writes a truncated value using the formats $N8601B, $N8601BA, $N8601E, and $N8601EA, the display of the value stops at the last non-missing component. When you format a truncated value by using either the $N8601H format or the $N8601EH format, the lower order components are written with a hyphen. When you format a truncated value by using the $N8601X format or the $N8601EX format, the lower order components are written with an x. The following examples show truncated values:
p00030202T1031
2008-09-15T15/2010-09-15T15:53
-p0003-03-03T-:-:-
P2y3m4dT5h6m
2008-09-xTx:x:x
2008
Normalizing Duration Components
When a value for a duration component is greater than the largest standard value for a component, SAS normalizes the component except when the duration component is a single component. The following table shows examples of normalized duration components:
Duration               Extended Normalized Duration
p3y13m                 p0004-01
pt24h24m65s            P----01T-:25:05
p3y13mT24h61m          P0004-01-01T01:01
p0004-13               p0005-01
p0003-02-61T15:61:61   P0003-04-01T16:02:01
p13m                   P13M
If a component contains the largest value, such as 60 for minutes or seconds, SAS normalizes the value and replaces the value with a hyphen. For example, pT12:60:13 becomes PT13:-:13. Thirty days is used to normalize a month. Dates and times in a datetime value that are greater than the standard value for the component are not normalized. They produce an error.
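The carrying arithmetic behind these examples can be sketched in Python. This is an illustration, not the SAS implementation: the function name is hypothetical, and it assumes that excess values carry upward by integer division (60 seconds, 60 minutes, 24 hours, 30 days, 12 months) and that a duration with only one nonzero component is left unchanged. It returns numbers only and does not reproduce how SAS writes a component with a hyphen.

```python
def normalize_duration(years=0, months=0, days=0, hours=0, minutes=0, seconds=0):
    """Carry excess values from each component into the next higher one."""
    parts = (years, months, days, hours, minutes, seconds)
    if sum(1 for part in parts if part) <= 1:
        return parts                          # a single component is not normalized
    carry, seconds = divmod(seconds, 60)
    carry, minutes = divmod(minutes + carry, 60)
    carry, hours = divmod(hours + carry, 24)
    carry, days = divmod(days + carry, 30)    # thirty days normalize to a month
    carry, months = divmod(months + carry, 12)
    return (years + carry, months, days, hours, minutes, seconds)


print(normalize_duration(years=3, months=13))                 # (4, 1, 0, 0, 0, 0)
print(normalize_duration(hours=24, minutes=24, seconds=65))   # (0, 0, 1, 0, 25, 5)
print(normalize_duration(3, 2, 61, 15, 61, 61))               # (3, 4, 1, 16, 2, 1)
print(normalize_duration(months=13))                          # (0, 13, 0, 0, 0, 0)
```

The results correspond to the table rows p3y13m, pt24h24m65s, p0003-02-61T15:61:61, and p13m.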
Fractions in Durations, Datetime, and Interval Values
Ending components can contain a fraction that consists of a period or a comma, followed by one to three digits. The following examples show the use of fractions in duration, datetime, and interval values:
200809.5
P2008-09-15T10.33
2008-09-15/P0003-03-03,333
Formats by Category
There are four categories of formats in this list:
Category        Description
Character       instructs SAS to write character data values from character variables.
Date and Time   instructs SAS to write data values from variables that represent dates, times, and datetimes.
ISO 8601        instructs SAS to write date, time, and datetime values using the ISO 8601 standard.
Numeric         instructs SAS to write numeric data values from numeric variables.
Formats that support national languages can be found in SAS National Language Support (NLS): Reference Guide. A listing of national language formats is provided in .
Storing user-defined formats is an important consideration if you associate these formats with variables in permanent SAS data sets, especially those data sets shared with other users. For information on creating and storing user-defined formats, see “The FORMAT Procedure” in Base SAS Procedures Guide.
The following table provides brief descriptions of the SAS formats. For more detailed descriptions, see the dictionary entry for each format.
Table 3.4 Categories and Descriptions of Formats
Category
Formats
Description
Character
“$ASCIIw. Format” on page 108
Converts native format character data to ASCII representation.
“$BASE64Xw. Format” on page 109
Converts character data into ASCII text by using Base 64 encoding.
“$BINARYw. Format” on page 110
Converts character data to binary representation.
“$CHARw. Format” on page 111
Writes standard character data.
“$EBCDICw. Format” on page 112
Converts native format character data to EBCDIC representation.
“$HEXw. Format” on page 113
Converts character data to hexadecimal representation.
“$MSGCASEw. Format” on page 114
Writes character data in uppercase when the MSGCASE system option is in effect.
“$OCTALw. Format” on page 125
Converts character data to octal representation.
“$QUOTEw. Format” on page 126
Writes data values that are enclosed in double quotation marks.
“$REVERJw. Format” on page 128
Writes character data in reverse order and preserves blanks.
“$REVERSw. Format” on page 128
Writes character data in reverse order and left aligns.
“$UPCASEw. Format” on page 129
Converts character data to uppercase.
“$VARYINGw. Format” on page 130
Writes character data of varying length.
“$w. Format” on page 132
Writes standard character data.
Date and Time
“$N8601Bw.d Format” on page 115
Writes ISO 8601 duration, datetime, and interval forms using the basic notations PnYnMnDTnHnMnS and yyyymmddThhmmss.
“$N8601BAw.d Format” on page 116
Writes ISO 8601 duration, datetime, and interval forms using the basic notations PyyyymmddThhmmss and yyyymmddThhmmss.
“$N8601Ew.d Format” on page 117
Writes ISO 8601 duration, datetime, and interval forms using the extended notations PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss.
“$N8601EAw.d Format” on page 119
Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss.
“$N8601EHw.d Format” on page 120
Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss, using a hyphen ( - ) for omitted components.
“$N8601EXw.d Format” on page 121
Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss, using an x for each digit of an omitted component.
“$N8601Hw.d Format” on page 123
Writes ISO 8601 duration, datetime, and interval forms PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss, dropping omitted components in duration values and using a hyphen ( - ) for omitted components in datetime values.
“$N8601Xw.d Format” on page 124
Writes ISO 8601 duration, datetime, and interval forms PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss, dropping omitted components in duration values and using an x for each digit of an omitted component in datetime values.
“B8601DAw. Format” on page 137
Writes date values using the ISO 8601 basic notation yyyymmdd.
“B8601DNw. Format” on page 138
Writes the date from a datetime value using the ISO 8601 basic notation yyyymmdd.
“B8601DTw.d Format” on page 139
Writes datetime values in the ISO 8601 basic notation yyyymmddThhmmssffffff.
“B8601DZw. Format” on page 140
Writes datetime values in the Coordinated Universal Time (UTC) time scale using ISO 8601 datetime and time zone basic notation yyyymmddThhmmss+|-hhmm.
“B8601LZw. Format” on page 141
Writes time values as local time by appending a time zone offset difference between the local time and UTC, using the ISO 8601 basic time notation hhmmss+|-hhmm.
“B8601TMw.d Format” on page 143
Writes time values using the ISO 8601 basic notation hhmmssffff.
“B8601TZw. Format” on page 144
Adjusts time values to the Coordinated Universal Time (UTC) and writes them using the ISO 8601 basic time notation hhmmss+|-hhmm.
“DATEw. Format” on page 149
Writes date values in the form ddmmmyy, ddmmmyyyy, or dd-mmm-yyyy.
“DATEAMPMw.d Format” on page 150
Writes datetime values in the form ddmmmyy:hh:mm:ss.ss with AM or PM.
“DATETIMEw.d Format” on page 152
Writes datetime values in the form ddmmmyy:hh:mm:ss.ss.
“DAYw. Format” on page 154
Writes date values as the day of the month.
“DDMMYYw. Format” on page 154
Writes date values in the form ddmm<yy>yy or dd/mm/<yy>yy, where a forward slash is the separator and the year appears as either 2 or 4 digits.
“DDMMYYxw. Format” on page 156
Writes date values in the form ddmm<yy>yy or dd-mm-yy, where the x in the format name is a character that represents the special character that separates the day, month, and year, which can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
“DOWNAMEw. Format” on page 161
Writes date values as the name of the day of the week.
“DTDATEw. Format” on page 162
Expects a datetime value as input and writes date values in the form ddmmmyy or ddmmmyyyy.
“DTMONYYw. Format” on page 163
Writes the date part of a datetime value as the month and year in the form mmmyy or mmmyyyy.
“DTWKDATXw. Format” on page 164
Writes the date part of a datetime value as the day of the week and the date in the form day-of-week, dd month-name yy (or yyyy).
“DTYEARw. Format” on page 165
Writes the date part of a datetime value as the year in the form yy or yyyy.
“DTYYQCw. Format” on page 166
Writes the date part of a datetime value as the year and the quarter and separates them with a colon (:).
“E8601DAw. Format” on page 168
Writes date values using the ISO 8601 extended notation yyyy-mm-dd.
“E8601DNw. Format” on page 170
Writes the date from a SAS datetime value using the ISO 8601 extended notation yyyy-mm-dd.
“E8601DTw.d Format” on page 171
Writes datetime values in the ISO 8601 extended notation yyyy-mm-ddThh:mm:ss.ffffff.
“E8601DZw. Format” on page 172
Writes datetime values in the Coordinated Universal Time (UTC) time scale using ISO 8601 datetime and time zone extended notations yyyy-mm-ddThh:mm:ss+|-hh:mm.
“E8601LZw. Format” on page 173
Writes time values as local time, appending the Coordinated Universal Time (UTC) offset for the local SAS session, using the ISO 8601 extended time notation hh:mm:ss+|-hh:mm.
“E8601TMw.d Format” on page 175
Writes time values using the ISO 8601 extended notation hh:mm:ss.ffffff.
“E8601TZw.d Format” on page 176
Adjusts time values to the Coordinated Universal Time (UTC) and writes them using the ISO 8601 extended notation hh:mm:ss+|-hh:mm.
“HHMMw.d Format” on page 181
Writes time values as hours and minutes in the form hh:mm.
“HOURw.d Format” on page 183
Writes time values as hours and decimal fractions of hours.
“JULDAYw. Format” on page 188
Writes date values as the Julian day of the year.
“JULIANw. Format” on page 189
Writes date values as Julian dates in the form yyddd or yyyyddd.
“MMDDYYw. Format” on page 192
Writes date values in the form mmdd<yy>yy or mm/dd/<yy>yy, where a forward slash is the separator and the year appears as either 2 or 4 digits.
“MMDDYYxw. Format” on page 194
Writes date values in the form mmdd<yy>yy or mm-dd-yy, where the x in the format name is a character that represents the special character which separates the month, day, and year. The special character can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
“MMSSw.d Format” on page 196
Writes time values as the number of minutes and seconds since midnight.
“MMYYw. Format” on page 197
Writes date values in the form mmM<yy>yy, where M is the separator and the year appears as either 2 or 4 digits.
“MMYYxw. Format” on page 198
Writes date values in the form mm<yy>yy or mm-yy, where the x in the format name is a character that represents the special character that separates the month and the year, which can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
“MONNAMEw. Format” on page 200
Writes date values as the name of the month.
“MONTHw. Format” on page 201
Writes date values as the month of the year.
“MONYYw. Format” on page 202
Writes date values as the month and the year in the form mmmyy or mmmyyyy.
“PDJULGw. Format” on page 207
Writes packed Julian date values in the hexadecimal format yyyydddF for IBM.
“PDJULIw. Format” on page 208
Writes packed Julian date values in the hexadecimal format ccyydddF for IBM.
“QTRw. Format” on page 217
Writes date values as the quarter of the year.
“QTRRw. Format” on page 218
Writes date values as the quarter of the year in Roman numerals.
“TIMEw.d Format” on page 240
Writes time values as hours, minutes, and seconds in the form hh:mm:ss.ss.
“TIMEAMPMw.d Format” on page 241
Writes time values as hours, minutes, and seconds in the form hh:mm:ss.ss with AM or PM.
“TODw.d Format” on page 243
Writes SAS time values and the time portion of SAS datetime values in the form hh:mm:ss.ss.
“WEEKDATEw. Format” on page 248
Writes date values as the day of the week and the date in the form day-of-week, month-name dd, yy (or yyyy).
“WEEKDATXw. Format” on page 249
Writes date values as the day of the week and date in the form day-of-week, dd month-name yy (or yyyy).
“WEEKDAYw. Format” on page 250
Writes date values as the day of the week.
“WEEKUw. Format” on page 251
Writes a week number in decimal format by using the U algorithm.
“WEEKVw. Format” on page 253
Writes a week number in decimal format by using the V algorithm.
“WEEKWw. Format” on page 255
Writes a week number in decimal format by using the W algorithm.
“WORDDATEw. Format” on page 257
Writes date values as the name of the month, the day, and the year in the form month-name dd, yyyy.
“WORDDATXw. Format” on page 258
Writes date values as the day, the name of the month, and the year in the form dd month-name yyyy.
“YEARw. Format” on page 261
Writes date values as the year.
“YYMMw. Format” on page 262
Writes date values in the form yyMmm, where M is a character separator to indicate that the month number follows the M and the year appears as either 2 or 4 digits.
“YYMMxw. Format” on page 263
Writes date values in the form yymm or yy-mm, where the x in the format name is a character that represents the special character that separates the year and the month, which can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
“YYMMDDw. Format” on page 265
Writes date values in the form yymmdd or <yy>yy-mm-dd, where a dash is the separator and the year appears as either 2 or 4 digits.
“YYMMDDxw. Format” on page 266
Writes date values in the form yymmdd or <yy>yy-mm-dd, where the x in the format name is a character that represents the special character which separates the year, month, and day. The special character can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
“YYMONw. Format” on page 268
Writes date values in the form yymmm or yyyymmm.
“YYQw. Format” on page 269
Writes date values in the form yyQq, where Q is the separator, the year appears as either 2 or 4 digits, and q is the quarter of the year.
“YYQxw. Format” on page 270
Writes date values in the form yyq or yy-q, where the x in the format name is a character that represents the special character that separates the year and the quarter of the year, which can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
“YYQRw. Format” on page 272
Writes date values in the form yyQqr, where Q is the separator, the year appears as either 2 or 4 digits, and qr is the quarter of the year expressed in Roman numerals.
“YYQRxw. Format” on page 273
Writes date values in the form yyqr or yy-qr, where the x in the format name is a character that represents the special character that separates the year and the quarter of the year, which can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits and qr is the quarter of the year expressed in Roman numerals.
ISO 8601
“$N8601Bw.d Format” on page 115
Writes ISO 8601 duration, datetime, and interval forms using the basic notations PnYnMnDTnHnMnS and yyyymmddThhmmss.
“$N8601BAw.d Format” on page 116
Writes ISO 8601 duration, datetime, and interval forms using the basic notations PyyyymmddThhmmss and yyyymmddThhmmss.
“$N8601Ew.d Format” on page 117
Writes ISO 8601 duration, datetime, and interval forms using the extended notations PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss.
“$N8601EAw.d Format” on page 119
Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss.
“$N8601EHw.d Format” on page 120
Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss, using a hyphen ( - ) for omitted components.
“$N8601EXw.d Format” on page 121
Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss, using an x for each digit of an omitted component.
“$N8601Hw.d Format” on page 123
Writes ISO 8601 duration, datetime, and interval forms PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss, dropping omitted components in duration values and using a hyphen ( - ) for omitted components in datetime values.
“$N8601Xw.d Format” on page 124
Writes ISO 8601 duration, datetime, and interval forms PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss, dropping omitted components in duration values and using an x for each digit of an omitted component in datetime values.
“B8601DAw. Format” on page 137
Writes date values using the ISO 8601 basic notation yyyymmdd.
“B8601DNw. Format” on page 138
Writes the date from a datetime value using the ISO 8601 basic notation yyyymmdd.
“B8601DTw.d Format” on page 139
Writes datetime values in the ISO 8601 basic notation yyyymmddThhmmssffffff.
“B8601DZw. Format” on page 140
Writes datetime values in the Coordinated Universal Time (UTC) time scale using ISO 8601 datetime and time zone basic notation yyyymmddThhmmss+|-hhmm.
“B8601LZw. Format” on page 141
Writes time values as local time by appending a time zone offset difference between the local time and UTC, using the ISO 8601 basic time notation hhmmss+|-hhmm.
“B8601TMw.d Format” on page 143
Writes time values using the ISO 8601 basic notation hhmmssffff.
“B8601TZw. Format” on page 144
Adjusts time values to the Coordinated Universal Time (UTC) and writes them using the ISO 8601 basic time notation hhmmss+|-hhmm.
“E8601DAw. Format” on page 168
Writes date values using the ISO 8601 extended notation yyyy-mm-dd.
“E8601DNw. Format” on page 170
Writes the date from a SAS datetime value using the ISO 8601 extended notation yyyy-mm-dd.
“E8601DTw.d Format” on page 171
Writes datetime values in the ISO 8601 extended notation yyyy-mm-ddThh:mm:ss.ffffff.
“E8601DZw. Format” on page 172
Writes datetime values in the Coordinated Universal Time (UTC) time scale using ISO 8601 datetime and time zone extended notations yyyy-mm-ddThh:mm:ss+|-hh:mm.
“E8601LZw. Format” on page 173
Writes time values as local time, appending the Coordinated Universal Time (UTC) offset for the local SAS session, using the ISO 8601 extended time notation hh:mm:ss+|-hh:mm.
“E8601TMw.d Format” on page 175
Writes time values using the ISO 8601 extended notation hh:mm:ss.ffffff.
“E8601TZw.d Format” on page 176
Adjusts time values to the Coordinated Universal Time (UTC) and writes them using the ISO 8601 extended notation hh:mm:ss+|-hh:mm.
Numeric
“BESTw. Format” on page 133
SAS chooses the best notation.
“BESTDw.p Format” on page 134
Prints numeric values, lining up decimal places for values of similar magnitude, and prints integers without decimals.
“BINARYw. Format” on page 136
Converts numeric values to binary representation.
“COMMAw.d Format” on page 146
Writes numeric values with a comma that separates every three digits and a period that separates the decimal fraction.
“COMMAXw.d Format” on page 147
Writes numeric values with a period that separates every three digits and a comma that separates the decimal fraction.
“Dw.p Format” on page 148
Prints numeric values, possibly with a great range of values, lining up decimal places for values of similar magnitude.
“DOLLARw.d Format” on page 158
Writes numeric values with a leading dollar sign, a comma that separates every three digits, and a period that separates the decimal fraction.
“DOLLARXw.d Format” on page 159
Writes numeric values with a leading dollar sign, a period that separates every three digits, and a comma that separates the decimal fraction.
“Ew. Format” on page 167
Writes numeric values in scientific notation.
“FLOATw.d Format” on page 178
Generates a native single-precision, floating-point value by multiplying a number by 10 raised to the dth power.
“FRACTw. Format” on page 179
Converts numeric values to fractions.
“HEXw. Format” on page 180
Converts real binary (floating-point) values to hexadecimal representation.
“IBw.d Format” on page 184
Writes native integer binary (fixed-point) values, including negative values.
“IBRw.d Format” on page 186
Writes integer binary (fixed-point) values in Intel and DEC formats.
“IEEEw.d Format” on page 187
Generates an IEEE floating-point value by multiplying a number by 10 raised to the dth power.
“NEGPARENw.d Format” on page 203
Writes negative numeric values in parentheses.
“NUMXw.d Format” on page 204
Writes numeric values with a comma in place of the decimal point.
“OCTALw. Format” on page 205
Converts numeric values to octal representation.
“PDw.d Format” on page 206
Writes data in packed decimal format.
“PERCENTw.d Format” on page 210
Writes numeric values as percentages.
“PERCENTNw.d Format” on page 211
Produces percentages, using a minus sign for negative values.
“PIBw.d Format” on page 212
Writes positive integer binary (fixed-point) values.
“PIBRw.d Format” on page 214
Writes positive integer binary (fixed-point) values in Intel and DEC formats.
“PKw.d Format” on page 215
Writes data in unsigned packed decimal format.
“PVALUEw.d Format” on page 216
Writes p-values.
“RBw.d Format” on page 219
Writes real binary data (floating-point) in real binary format.
“ROMANw. Format” on page 220
Writes numeric values as roman numerals.
“S370FFw.d Format” on page 221
Writes native standard numeric data in IBM mainframe format.
“S370FIBw.d Format” on page 222
Writes integer binary (fixed-point) values, including negative values, in IBM mainframe format.
“S370FIBUw.d Format” on page 224
Writes unsigned integer binary (fixed-point) values in IBM mainframe format.
“S370FPDw.d Format” on page 225
Writes packed decimal data in IBM mainframe format.
“S370FPDUw.d Format” on page 226
Writes unsigned packed decimal data in IBM mainframe format.
“S370FPIBw.d Format” on page 227
Writes positive integer binary (fixed-point) values in IBM mainframe format.
“S370FRBw.d Format” on page 229
Writes real binary (floating-point) data in IBM mainframe format.
“S370FZDw.d Format” on page 230
Writes zoned decimal data in IBM mainframe format.
“S370FZDLw.d Format” on page 231
Writes zoned decimal leading-sign data in IBM mainframe format.
“S370FZDSw.d Format” on page 232
Writes zoned decimal separate leading-sign data in IBM mainframe format.
“S370FZDTw.d Format” on page 233
Writes zoned decimal separate trailing-sign data in IBM mainframe format.
“S370FZDUw.d Format” on page 234
Writes unsigned zoned decimal data in IBM mainframe format.
“SSNw. Format” on page 239
Writes Social Security numbers.
“VAXRBw.d Format” on page 244
Writes real binary (floating-point) data in VMS format.
“VMSZNw.d Format” on page 245
Generates VMS and MicroFocus COBOL zoned numeric data.
“w.d Format” on page 247
Writes standard numeric data one digit per byte.
“WORDFw. Format” on page 259
Writes numeric values as words with fractions that are shown numerically.
“WORDSw. Format” on page 260
Writes numeric values as words.
“Zw.d Format” on page 275
Writes standard numeric data with leading 0s.
“ZDw.d Format” on page 276
Writes numeric data in zoned decimal format.
Dictionary
$ASCIIw. Format Converts native format character data to ASCII representation.
Category: Character
Alignment: left
Syntax $ASCIIw.
Syntax Description w
specifies the width of the output field. Default: 1 Range: 1–32767
Details If ASCII is the native format, no conversion occurs.
Comparisons
- On EBCDIC systems, $ASCIIw. converts EBCDIC character data to ASCII.
- On all other systems, $ASCIIw. behaves like the $CHARw. format.
Examples put x $ascii3.;
Value of x     Results*
abc            616263
ABC            414243
();            28293B
* The results are hexadecimal representations of ASCII codes for characters. Each two hexadecimal characters correspond to one byte of binary data, and each byte corresponds to one character.
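The hexadecimal listing in the table can be reproduced with a short DATA step. This is a minimal sketch, not part of the original entry: the variable names x and y are illustrative, and the $HEXw. format is used only to make the converted bytes visible.

```sas
data _null_;
   length x y $3;
   x = 'abc';
   y = put(x, $ascii3.);   /* y holds the ASCII-encoded bytes of x */
   put y $hex6.;           /* two hexadecimal digits per byte */
run;
```

For the value abc, the log line should match the first row of the table (616263), whether the session runs on an ASCII or an EBCDIC system.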
$BASE64Xw. Format Converts character data into ASCII text by using Base 64 encoding. Category:
Character
Alignment:
left
Syntax $BASE64Xw.
Syntax Description
w
specifies the width of the output field. Default: 1 Range: 1-32767
Details Base 64 is an industry encoding method whose encoded characters are determined by using a positional scheme that uses only ASCII characters. Several Base 64 encoding schemes have been defined by the industry for specific uses, such as e-mail or content masking. SAS maps positions 0 - 61 to the characters A - Z, a - z, and 0 - 9. Position 62 maps to the character +, and position 63 maps to the character /. The following are some uses of Base 64 encoding:
- embed binary data in an XML file
- encode passwords
- encode URLs
The ’=’ character in the encoded results indicates that the results have been padded with zero bits. In order for the encoded characters to be decoded, the ’=’ must be included in the value to be decoded.
Examples put x $base64x64.;
Value of x                        Results
"FCA01A7993BC"                    RkNBMDFBNzk5M0JD
"MyPassword"                      TXlQYXNzd29yZA==
"www.mydomain.com/myhiddenURL"    d3d3Lm15ZG9tYWluLmNvbS9teWhpZGRlblVSTA==
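A round trip shows why the '=' padding has to be kept. The sketch below is illustrative only: the variable names are assumptions, and it relies on the $BASE64Xw. informat listed under See Also to decode the value.

```sas
data _null_;
   length x $10 encoded $64 decoded $10;
   x = 'MyPassword';                            /* exactly 10 characters, no trailing blanks */
   encoded = put(x, $base64x64.);               /* Base 64 text, including any '=' padding */
   decoded = input(trim(encoded), $base64x64.); /* decode with the matching informat */
   put encoded=;
   put decoded=;
run;
```

The length of x is set to the exact length of the value because trailing blanks in a longer variable would be encoded along with the data.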
See Also Informat: “$BASE64Xw. Informat” on page 1239 The XMLDOUBLE option of the LIBNAME Statement for the XML engine, in SAS XML LIBNAME Engine: User’s Guide
$BINARYw. Format Converts character data to binary representation.
Category: Character
Alignment: left
Syntax $BINARYw.
Syntax Description w
specifies the width of the output field. Default: The default width is calculated based on the length of the variable to be printed. Range: 1–32767
Comparisons The $BINARYw. format converts character values to binary representation. The BINARYw. format converts numeric values to binary representation.
Examples put @1 name $binary16.;
Value of name   Results (ASCII)        Results (EBCDIC)
                ----+----1----+----2   ----+----1----+----2
AB              0100000101000010       1100000111000010
$CHARw. Format Writes standard character data.
Category: Character
Alignment: left
Syntax $CHARw.
Syntax Description w
specifies the width of the output field. Default: 8 if the length of variable is undefined; otherwise, the length of the variable Range: 1–32767
Comparisons
- The $CHARw. format is identical to the $w. format.
- The $CHARw. and $w. formats do not trim leading blanks. To trim leading blanks, use the LEFT function to left align character data, or use the PUT statement with the colon (:) format modifier and the format of your choice to produce list output.
- Use the following table to compare the SAS format $CHAR8. with notation in other programming languages:
Language   Notation
SAS        $CHAR8.
C          char [8]
COBOL      PIC x(8)
FORTRAN    A8
PL/I       A(8)
Examples put @7 name $char4.;
Value of name   Results
                ----+----1
XYZ                   XYZ
$EBCDICw. Format Converts native format character data to EBCDIC representation. Category:
Character
Alignment:
left
Syntax $EBCDICw.
Syntax Description
w
specifies the width of the output field. Default: 1 Range: 1–32767
Details If EBCDIC is the native format, no conversion occurs.
Comparisons
- On ASCII systems, $EBCDICw. converts ASCII character data to EBCDIC.
- On all other systems, $EBCDICw. behaves like the $CHARw. format.
Examples
put name $ebcdic3.;
Value of name   Results*
qrs             9899A2
QRS             D8D9E2
+;>             4E5E6E
* The results are shown as hexadecimal representations of EBCDIC codes for characters. Each two hexadecimal characters correspond to one byte of binary data, and each byte corresponds to one character.
$HEXw. Format Converts character data to hexadecimal representation.
Category: Character
Alignment: left
See: $HEXw. Format in the documentation for your operating environment.
Syntax $HEXw.
Syntax Description w
specifies the width of the output field. Default: The default width is calculated based on the length of the variable to be printed. Range: 1–32767 Tip: To ensure that SAS writes the full hexadecimal equivalent of your data, make w twice the length of the variable or field that you want to represent. Tip: If w is greater than twice the length of the variable that you want to represent, $HEXw. pads it with blanks.
Details The $HEXw. format converts each character into two hexadecimal characters. Each blank counts as one character, including trailing blanks.
Comparisons The HEXw. format converts real binary numbers to their hexadecimal equivalent.
Examples put @5 name $hex4.;
Value of name   Results (EBCDIC)   Results (ASCII)
                ----+----1         ----+----1
AB                  C1C2               4142
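The effect of the width on trailing blanks can be seen by writing the same variable with two widths. This is a sketch under assumed names: the variable name and its length of 4 are illustrative choices, not part of the original example.

```sas
data _null_;
   length name $4;
   name = 'AB';        /* stored as 'AB' followed by two trailing blanks */
   put name $hex4.;    /* four hex digits: only the first two bytes are shown */
   put name $hex8.;    /* w is twice the variable length: all four bytes are shown */
run;
```

On an ASCII system the second line would be expected to end in 2020, because each trailing blank is written as the two hexadecimal characters of the blank's code.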
$MSGCASEw. Format Writes character data in uppercase when the MSGCASE system option is in effect.
Category: Character
Alignment: left
Syntax $MSGCASEw.
Syntax Description w
specifies the width of the output field. Default: 1 if the length of the variable is undefined; otherwise, the length of the variable Range: 1–32767
Details When the MSGCASE= system option is in effect, all notes, warnings, and error messages that SAS generates appear in uppercase. Otherwise, all notes, warnings, and error messages appear in mixed case. You specify the MSGCASE= system option in the configuration file or during the SAS invocation. Operating Environment Information: For more information about the MSGCASE= system option, see the SAS documentation for your operating environment.
Examples
put name $msgcase.;
Value of name   Results
sas             SAS
$N8601Bw.d Format Writes ISO 8601 duration, datetime, and interval forms using the basic notations PnYnMnDTnHnMnS and yyyymmddThhmmss. Category:
Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element: 5.4.4 Complete representation
Syntax $N8601Bw.d
Syntax Description
w
specifies the width of the output field. Default: 50 Range: 1 - 200
Requirement: The minimum length for a duration value or a datetime value is 16. The minimum length for an interval value is 16.
d
specifies the number of digits to the right of the lowest order component. This argument is optional. Default: 0 Range: 0 - 3
Details The $N8601B format writes ISO 8601 duration, datetime, and interval values as character data for the following basic notations:
PnYnMnDTnHnMnS
yyyymmddThhmmss
PnYnMnDTnHnMnS/yyyymmddThhmmss
yyyymmddThhmmss/PnYnMnDTnHnMnS
The lowest order component can contain fractions, as in these examples:
p2y3.5m
p00020304T05.335
Examples put nb $n8601b.;
Value of nb                         Results
0002405050112FFC                    P2Y4M5DT5H1M12S
2008915155300FFD                    20080915T155300
2008915000000FFD2010915000000FFD    20080915T000000/20100915T000000
0033104030255FFC2008915155300FFD    P33Y1M4DT3H2M55S/20080915T155300
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
$N8601BAw.d Format Writes ISO 8601 duration, datetime, and interval forms using the basic notations PyyyymmddThhmmss and yyyymmddThhmmss. Category:
Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element: 5.5.4.2 Alternative format
Syntax $N8601BAw.d
Syntax Description
w
specifies the width of the output field. Default: 50 Range: 1 - 200
Requirement: The minimum length for a duration value or a datetime value is 16. The minimum length for an interval value is 16.
d
specifies the number of digits to the right of the lowest order component. This argument is optional. Default: 0 Range: 0 - 3
Details The $N8601BA format writes ISO 8601 duration, datetime, and interval values as character data for the following basic notations:
PyyyymmddThhmmss
yyyymmddThhmmss
PyyyymmddThhmmss/yyyymmddThhmmss
yyyymmddThhmmss/PyyyymmddThhmmss
The lowest order component can contain fractions, as in these examples:
p00023.5
00020304T05.335
Examples put @1 nba $N8601ba.;
Value of nba                        Results
00024050501127D0                    P00020405T050112.5
2008915155300FFD                    20080915T155300
00023040506075282008915155300FFD    P00020304T050607.33/20080915T155300
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
$N8601Ew.d Format Writes ISO 8601 duration, datetime, and interval forms using the extended notations PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss. Category:
Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element: 5.4.4 Complete representation
Syntax $N8601Ew.d
Syntax Description
w
specifies the width of the output field. Default: 50 Range: 1 - 200
Requirement: The minimum length for a duration value or a datetime value is 16. The minimum length for an interval value is 16.
d
specifies the number of digits to the right of the lowest order component. This argument is optional. Default: 0 Range: 0 - 3
Details The $N8601E format writes ISO 8601 duration, datetime, and interval values as character data for the following extended notations:
PnYnMnDTnHnMnS
yyyy-mm-ddThh:mm:ss
PnYnMnDTnHnMnS/yyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss/PnYnMnDTnHnMnS
The lowest order component can contain fractions, as in these examples:
p2y3.5m
p0002-03-04T05.335
Examples put @1 ne $n8601e.;
Value of ne                         Results
00024050501127D0                    P2Y4M5DT5H1M12.5S
2008915155300FFD                    2008-09-15T15:53:00
2008915000000FFD2010915000000FFD    2008-09-15T00:00:00/2010-09-15T00:00:00
0033104030255FFC2008915155300FFD    P33Y1M4DT3H2M55S/2008-09-15T15:53:00
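The stored values in these tables are character values in the SAS internal ISO 8601 encoding. The following sketch assumes that the companion $N8601B informat, which is not described in this section, is used to build such a value from an ISO 8601 string; the variable name dur and its length of 16 (the minimum stated for a duration) are illustrative.

```sas
data _null_;
   length dur $16;
   /* assumption: the $N8601B informat converts the ISO 8601 text to the stored form */
   dur = input('P2Y4M5DT5H1M12S', $n8601b.);
   put dur $n8601b.;   /* basic notation */
   put dur $n8601e.;   /* extended notation */
run;
```

The same stored value can then be written with any of the $N8601 formats, which differ only in the notation that they produce.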
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
$N8601EAw.d Format Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss. Category:
Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element: 5.4.4 Complete representation
Syntax $N8601EAw.d
Syntax Description
w
specifies the width of the output field. Default: 50 Range: 1 - 200
Requirement: The minimum length for a duration value or a datetime value is 16. The minimum length for an interval value is 16.
d
specifies the number of digits to the right of the lowest order component. This argument is optional. Default: 0 Range: 0 - 3
Details The $N8601EA format writes ISO 8601 duration, datetime, and interval values as character data for the following extended notations:
Pyyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss
Pyyyy-mm-ddThh:mm:ss/yyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss/Pyyyy-mm-ddThh:mm:ss
The lowest order component can contain fractions, as in these examples:
p00023.5
0002-03-04T05.335
Examples put @1 nea $N8601ea.;
Value of nea                        Results
00024050501127D0                    P0002-04-05T05:01:12.500
2008915155300FFD                    2008-09-15T15:53:00
00023040506075282008915155300FFD    P0002-03-04T05:06:07.330/2008-09-15T15:53:00
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
$N8601EHw.d Format Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss, using a hyphen ( - ) for omitted components. Category:
Date and Time ISO 8601
Time Zone Format:
No
ISO 8601 Element: 5.4.4 Complete representation
Syntax $N8601EHw.d
Syntax Description
w
specifies the width of the output field. Default: 50 Range: 1 - 200
Requirement: The minimum length for a duration value or a datetime value is 16. The minimum length for an interval value is 16.
d
specifies the number of digits to the right of the lowest order component. This argument is optional. Default: 0 Range: 0 - 3
Details The $N8601EH format writes ISO 8601 duration, datetime, and interval values as character data, using a hyphen ( - ) to represent omitted components, for the following extended notations:
Pyyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss
Pyyyy-mm-ddThh:mm:ss/yyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss/Pyyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss/yyyy-mm-ddThh:mm:ss
Omitted datetime components are always displayed; they are never truncated.
Examples put a $n8601eh.;
Value of a                          Results
00023FFFFFFFFFFC2008FFF15FFFFFFD    P0002-03---T-:-:-/2008------T15:-:-
2008FFF15FFFFFFdFFFF3FF1553FFFFC    2008------T15:-:-/P---03---T15:53:-
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
$N8601EXw.d Format Writes ISO 8601 duration, datetime, and interval forms using the extended notations Pyyyy-mm-ddThh:mm:ss and yyyy-mm-ddThh:mm:ss, using an x for each digit of an omitted component. Category:
Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element: 5.5.3, 5.5.4.1, 5.5.4.2
Syntax $N8601EXw.d
Syntax Description w
specifies the width of the output field. Default: 50 Range: 1 - 200
Requirement: The minimum length for a duration value or a datetime value is 16. The minimum length for an interval value is 16.
d
specifies the number of digits to the right of the lowest order component. This argument is optional. Default: 0 Range: 0 - 3
Details The $N8601EX format writes ISO 8601 duration, datetime, and interval values as character data, using an x to represent each digit of an omitted component, for the following extended notations:
Pyyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss
Pyyyy-mm-ddThh:mm:ss/yyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss/Pyyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss/yyyy-mm-ddThh:mm:ss
Omitted datetime components are always displayed; they are never truncated.
Examples put nex $n8601ex.;
Value of nex                        Results
00023FFFFFFFFFFC2008FFF15FFFFFFD    P0002-03-xxTxx:xx:xx/2008-xx-xxT15:xx:xx
2008FFF15FFFFFFdFFFF3FF1553FFFFC    2008-xx-xxT15:xx:xx/Pxxxx-03-xxT15:53:xx
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
$N8601Hw.d Format Writes ISO 8601 duration, datetime, and interval forms PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss, dropping omitted components in duration values and using a hyphen ( - ) for omitted components in datetime values. Category:
Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element: 5.5.3, 5.5.4.1, 5.5.4.2
Syntax $N8601Hw.d
Syntax Description w
specifies the width of the output field. Default: 50 Range: 1 - 200
Requirement: The minimum length for a duration value or a datetime value is 16. The minimum length for an interval value is 16.
d
specifies the number of digits to the right of the lowest order component. This argument is optional. Default: 0 Range: 0 - 3
Details The $N8601H format writes ISO 8601 durations, intervals, and datetimes in the following forms, omitting components in the PnYnMnDTnHnMnS form and using a hyphen ( - ) to represent omitted components in the datetime form:
PnYnMnDTnHnMnS
yyyy-mm-ddThh:mm:ss
PnYnMnDTnHnMnS/yyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss/PnYnMnDTnHnMnS
yyyy-mm-ddThh:mm:ss/yyyy-mm-ddThh:mm:ss
Omitted datetime components are always displayed; they are never truncated.
Examples put nh $n8601h.;
Value of nh                         Results
0002304FFFFFFFFC2008FFF15FFFFFFD    P2Y3M4D/2008------T15:-:-
FFFF102FFFFFFFFD2008FFF15FFFFFFD    ---01-02T-:-:-0/2008------T15:-:-
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
$N8601Xw.d Format Writes ISO 8601 duration, datetime, and interval forms PnYnMnDTnHnMnS and yyyy-mm-ddThh:mm:ss, dropping omitted components in duration values and using an x for each digit of an omitted component in datetime values. Category:
Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element: 5.5.3, 5.5.4.1, 5.5.4.2
Syntax $N8601Xw.d
Syntax Description
w
specifies the width of the output field. Default: 50 Range: 1 - 200
Requirement: The minimum length for a duration value or a datetime value is 16. The minimum length for an interval value is 16.
d
specifies the number of digits to the right of the lowest order component. This argument is optional. Default: 0 Range: 0 - 3
Details The $N8601X format writes ISO 8601 durations, intervals, and datetimes in the following forms, omitting components in the PnYnMnDTnHnMnS form and using an x to represent each digit of an omitted component in the datetime form:
PnYnMnDTnHnMnS
yyyy-mm-ddThh:mm:ss
PnYnMnDTnHnMnS/yyyy-mm-ddThh:mm:ss
yyyy-mm-ddThh:mm:ss/PnYnMnDTnHnMnS
yyyy-mm-ddThh:mm:ss/yyyy-mm-ddThh:mm:ss
Omitted datetime components are always displayed; they are never truncated.
Examples put nx $n8601x.;
Value of nx                         Results
0002304FFFFFFFFC2008FFF15FFFFFFD    P2Y3M4D/2008-xx-xxT15:xx:xx
FFFF102FFFFFFFFD2008FFF15FFFFFFd    xxxx-01-02Txx:xx:xx/2008-xx-xxT15:xx:xx
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
$OCTALw. Format Converts character data to octal representation.
Category: Character
Alignment: left
Syntax $OCTALw.
Syntax Description w
specifies the width of the output field. Default: The default width is calculated based on the length of the variable to be printed. Range: 1–32767 Tip: Because each character value generates three octal characters, increase the value of w by three times the length of the character value.
Comparisons The $OCTALw. format converts character values to the octal representation of their character codes. The OCTALw. format converts numeric values to octal representation.
Example The following example shows ASCII output when you use the $OCTALw. format.
data _null_;
   infile datalines truncover;
   input item $5.;
   put item $octal15.;
   datalines;
art
rice
bank
;
run;
SAS writes the following results to the log.
141162164040040
162151143145040
142141156153040
$QUOTEw. Format Writes data values that are enclosed in double quotation marks.
Category: Character
Alignment: left
Syntax $QUOTEw.
Syntax Description w
specifies the width of the output field. Default: 2 if the length of the variable is undefined; otherwise, the length of the variable + 2 Range: 2–32767
Tip: Make w wide enough to include the left and right quotation marks.
Details The following list describes the output that SAS produces when you use the $QUOTEw. format. For examples of these items, see “Examples” on page 127.
- If your data value is not enclosed in quotation marks, SAS encloses the output in double quotation marks.
- If your data value is not enclosed in quotation marks, but the value contains a single quotation mark, SAS
  - encloses the data value in double quotation marks
  - does not change the single quotation mark.
- If your data value begins and ends with single quotation marks, and the value contains double quotation marks, SAS
  - encloses the data value in double quotation marks
  - duplicates the double quotation marks that are found in the data value
  - does not change the single quotation marks.
- If your data value begins and ends with single quotation marks, and the value contains two single contiguous quotation marks, SAS
  - encloses the value in double quotation marks
  - does not change the single quotation marks.
- If your data value begins and ends with single quotation marks, and contains both double quotation marks and single, contiguous quotation marks, SAS
  - encloses the value in double quotation marks
  - duplicates the double quotation marks that are found in the data value
  - does not change the single quotation marks.
- If the length of the target field is not large enough to contain the string and its quotation marks, SAS returns all blanks.
Examples put name $quote20.;
Value of name          Results
                       ----+----1
SAS                    "SAS"
SAS’s                  "SAS’s"
’ad"verb"’             "’ad""verb""’"
’ad’’verb’             "’ad’’verb’"
’"ad"’’"verb"’         "’""ad""’’""verb""’"
deoxyribonucleotide
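The rows of the table can be reproduced with a complete DATA step. This is a minimal sketch: the variable length of 20 and the choice of data lines are illustrative, and straight quotation marks are used in the data.

```sas
data _null_;
   length name $20;
   input name $char20.;
   put name $quote20.;    /* the field must hold the value plus two quotation marks */
   datalines;
SAS
SAS's
deoxyribonucleotide
;
run;
```

The last value is 19 characters long, so with its two quotation marks it does not fit in a 20-column field. According to the last item in the Details list, SAS writes all blanks for that record.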
$REVERJw. Format Writes character data in reverse order and preserves blanks.
Category: Character
Alignment: right
Syntax $REVERJw.
Syntax Description w
specifies the width of the output field. Default: 1 if w is not specified Range: 1–32767
Comparisons The $REVERJw. format is similar to the $REVERSw. format except that $REVERSw. left aligns the result by trimming all leading blanks.
Examples put @1 name $reverj7.;
Name*      Results
           ----+----1
ABCD###       DCBA
###ABCD    DCBA
* The character # represents a blank space.
$REVERSw. Format Writes character data in reverse order and left aligns.
Category: Character
Alignment: left
Syntax $REVERSw.
Syntax Description w
specifies the width of the output field. Default: 1 if w is not specified Range: 1–32767
Comparisons The $REVERSw. format is similar to the $REVERJw. format except that $REVERJw. does not left align the result.
Examples put @1 name $revers7.;
Name*      Results
           ----+----1
ABCD###    DCBA
###ABCD    DCBA
* The character # represents a blank space.
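Writing the same value with both formats makes the difference between $REVERJw. and $REVERSw. easier to see. In this sketch the square brackets are an illustrative device for making blanks visible in the log; they are not part of either format.

```sas
data _null_;
   length name $7;
   name = 'ABCD';                /* 'ABCD' followed by three trailing blanks */
   put '[' name $reverj7. ']';   /* reversed, blanks preserved */
   put '[' name $revers7. ']';   /* reversed, then left aligned */
run;
```

Consistent with the two tables above, the blanks would be expected to lead the reversed text in the first line and to trail it in the second.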
$UPCASEw. Format Converts character data to uppercase.
Category: Character
Alignment: left
Syntax $UPCASEw.
Syntax Description w
specifies the width of the output field. Default: 8 if the length of the variable is undefined; otherwise, the length of the variable Range: 1–32767
Details Special characters, such as hyphens and other symbols, are not altered.
Examples
put @1 name $upcase9.;
Value of name   Results
                ----+----1
coxe-ryan       COXE-RYAN
$VARYINGw. Format Writes character data of varying length. Valid:
in DATA step
Category:
Character
Alignment:
left
Syntax $VARYINGw. length-variable
Syntax Description
w
specifies the maximum width of the output field for any output line or output file record. Default: 8 if the length of the variable is undefined; otherwise, the length of the variable Range: 1–32767
length-variable
specifies a numeric variable that contains the length of the current value of the character variable. SAS obtains the value of the length-variable by reading it directly from a field that is described in an INPUT statement, reading the value of a variable in an existing SAS data set, or calculating its value.
Requirement: You must specify length-variable immediately after $VARYINGw. in a SAS statement.
Restriction: Length-variable cannot be an array reference.
Tip: If the value of length-variable is 0, negative, or missing, SAS writes nothing to the output field. If the value of length-variable is greater than 0 but less than w, SAS writes the number of characters that are specified by length-variable. If length-variable is greater than or equal to w, SAS writes w columns.
Details Use $VARYINGw. when the length of a character value differs from record to record. After writing a data value with $VARYINGw., the pointer’s position is the first column after the value.
Examples Example 1: Obtaining a Variable Length Directly
An existing data set variable contains the length of a variable. The data values and the results follow the explanation of this SAS statement: put @10 name $varying12. varlen;
NAME is a character variable of length 12 that contains values that vary from 1 to 12 characters in length. VARLEN is a numeric variable in the same data set that contains the actual length of NAME for the current observation.
Value of name*      Results
                    ----+----1----+----2----+
New York 8                   New York
Toronto 7                    Toronto
Buenos Aires 12              Buenos Aires
Tokyo 5                      Tokyo
* The value of NAME appears before the value of VARLEN.
Example 2: Obtaining a Variable Length Indirectly
Use the LENGTH function to determine the length of a variable. The data values and the results follow the explanation of these SAS statements:
varlen=length(name);
put @10 name $varying12. varlen;
The assignment statement determines the length of the varying-length variable. The variable VARLEN contains this length and becomes the length-variable argument to the $VARYING12. format.
Values*             Results
                    ----+----1----+----2----+
New York                     New York
Toronto                      Toronto
Buenos Aires                 Buenos Aires
Tokyo                        Tokyo
* The value of NAME appears before the value of VARLEN.
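The statements in Example 2 can be placed in a complete DATA step as follows. This is a sketch: the in-stream data and the trailing '|' marker are illustrative additions. The marker shows where the pointer is positioned after each value is written.

```sas
data _null_;
   length name $12;
   input name $12.;
   varlen = length(name);                  /* length without trailing blanks */
   put @10 name $varying12. varlen '|';    /* '|' lands in the first column after the value */
   datalines;
New York
Toronto
Buenos Aires
Tokyo
;
run;
```

Because the pointer stops in the first column after the value, the marker would be expected to follow each city name directly instead of appearing after a fixed 12-column field.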
$w. Format Writes standard character data.
Category: Character
Alignment: left
Alias: $Fw.
Syntax $w.
Syntax Description w
specifies the width of the output field. You can specify a number or a column range. Default: 1 if the length of the variable is undefined; otherwise, the length of the variable Range: 1–32767
Comparisons The $w. format and the $CHARw. format are identical, and they do not trim leading blanks. To trim leading blanks, use the LEFT function to left align character data, or use list output with the colon (:) format modifier and the format of your choice.
Examples
put @10 name $5.;
put name $ 10-15;
Value of name*   Results
                 ----+----1----+----2
#Cary                      Cary
Tokyo                     Tokyo
* The character # represents a blank space.
BESTw. Format SAS chooses the best notation.
Category: Numeric
Alignment: right
See: BESTw. Format in the documentation for your operating environment.
Syntax BESTw.
Syntax Description w
specifies the width of the output field. Default: 12 Tip: If you print numbers between 0 and .01 exclusively, then use a field width of at least 7 to avoid excessive rounding. If you print numbers between 0 and -.01 exclusively, then use a field width of at least 8. Range: 1–32
Details When a format is not specified for writing a numeric value, SAS uses the BESTw. format as the default format. The BESTw. format writes numbers as follows:
- Values are written with the maximum precision, as determined by the width.
- Integers are written without decimals.
- Numbers with decimals are written with as many digits to the left and right of the decimal point as needed or as allowed by the width.
- Values that can be written within the given width are written without trailing zeros.
- Values that cannot be written within the given width are written with the maximum allowable number of decimal places as determined by the width.
- Extreme values might be written in scientific notation.
SAS stores the complete value regardless of the format that is used.
Comparisons
- The BESTw. format writes as many significant digits as possible in the output field, but if the numbers vary in magnitude, the decimal points do not line up. Integers print without a decimal.
- The Dw.p format writes numbers with the desired precision and more alignment than the BESTw. format.
- The BESTDw.p format is a combination of the BESTw. format and the Dw.p format in that it formats all numeric data, and it does a better job of aligning decimals than the BESTw. format.
- The w.d format aligns decimal points, if possible, but does not necessarily show the same precision for all numbers.
Examples The following statements produce these results.
SAS Statements             Results
                           ----+----1----+----2
x=1257000; put x best6.;   1.26E6
x=1257000; put x best3.;   1E6
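The two statements in the table can be run together in one DATA step. The third PUT statement is an illustrative addition that uses the default width of 12, which leaves room for all of the digits of this value.

```sas
data _null_;
   x = 1257000;
   put x best6.;    /* 1.26E6 in the table above */
   put x best3.;    /* 1E6 in the table above */
   put x best12.;   /* default width: room for all seven digits */
run;
```

The value that is stored in x is the same in all three cases. Only the written representation changes with the width.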
See Also Format: “BESTDw.p Format” on page 134
BESTDw.p Format Prints numeric values, lining up decimal places for values of similar magnitude, and prints integers without decimals.
Category: Numeric
Alignment: right
Syntax BESTDw.p
Syntax Description w
optionally specifies the width of the output field. Default: 12 Range: 1–32
p
specifies the precision. This argument is optional. Default: 3 Range: 0 to w–1
Requirement: must be less than w
Tip: If p is omitted or is specified as 0, then p is set to 3.
Details The BESTDw.p format writes numbers so that the decimal point aligns in groups of values with similar magnitude. Integers are printed without a decimal point. Larger values of p print the data values with more precision and potentially more shifts in the decimal point alignment. Smaller values of p print the data values with less precision and a greater chance of decimal point alignment. The format chooses the number of decimal places to print for ranges of values, even when the underlying values can be represented with fewer decimal places.
Comparisons
- The BESTw. format writes as many significant digits as possible in the output field, but if the numbers vary in magnitude, the decimal points do not line up. Integers print without a decimal.
- The Dw.p format writes numbers with the desired precision and more alignment than the BESTw. format.
- The BESTDw.p format is a combination of the BESTw. format and the Dw.p format in that it formats all numeric data, and it does a better job of aligning decimals than the BESTw. format.
- The w.d format aligns decimal points, if possible, but it does not necessarily show the same precision for all numbers.
Examples put x bestd14.;
Data Line      Results
               ----+----1----+
12345                    12345
123.45             123.4500000
1.2345               1.2345000
.12345               0.1234500
1.23456789          1.23456789
See Also Formats: “BESTw. Format” on page 133 “Dw.p Format” on page 148
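The PUT statement from the BESTDw.p example can be run against the listed data lines in a complete DATA step. This is a minimal sketch that reads the values from in-stream data; the variable name x follows the example.

```sas
data _null_;
   input x;
   put x bestd14.;   /* p is omitted, so the precision is set to 3 */
   datalines;
12345
123.45
1.2345
.12345
1.23456789
;
run;
```

Specifying a different precision, for example bestd14.5, is one way to explore the trade-off between precision and decimal point alignment that the Details section describes.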
BINARYw. Format Converts numeric values to binary representation.
Category: Numeric
Alignment: left
Syntax BINARYw.
Syntax Description w
specifies the width of the output field. Default: 8 Range: 1–64
Comparisons BINARYw. converts numeric values to binary representation. The $BINARYw. format converts character values to binary representation.
Examples put @1 x binary8.;
Value of x   Results
             ----+----1
123.45       01111011
123          01111011
-123         10000101
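The three rows of the table can be produced in one pass with an iterative DO loop. This is a sketch; the loop is an illustrative way to supply the values.

```sas
data _null_;
   do x = 123.45, 123, -123;
      put @1 x binary8.;   /* eight binary digits per value */
   end;
run;
```

As the table shows, 123.45 and 123 produce the same eight digits, so the fractional part does not appear in the output.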
B8601DAw. Format Writes date values using the ISO 8601 basic notation yyyymmdd.
Category: Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element: 5.2.1.1 Complete representation
Syntax B8601DAw.
Syntax Description
w
specifies the width of the output field. Default: 10 Requirement: The width of the output field must be 10.
Details The B8601DA format writes the ISO 8601 basic date notation yyyymmdd:
yyyy is a four-digit year, such as 2008
mm is a two-digit month (zero padded) between 01 and 12
dd is a two-digit day of the month (zero padded) between 01 and 31
Examples put bda b8601da.;
Value of bda   Results
17790          20080915
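The value 17790 is the SAS date value for September 15, 2008. The following sketch uses a date literal to create it, which is an illustrative choice; any numeric variable that holds a SAS date value can be written the same way.

```sas
data _null_;
   bda = '15SEP2008'd;   /* SAS date value 17790 */
   put bda;              /* the unformatted date value */
   put bda b8601da.;     /* ISO 8601 basic notation yyyymmdd */
run;
```

Because B8601DA is a numeric format, it is written without a leading dollar sign.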
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
B8601DNw. Format Writes the date from a datetime value using the ISO 8601 basic notation yyyymmdd. Category:
Date and Time ISO 8601
Alignment:
left
Time Zone Format:
No
ISO 8601 Element:
5.2.1.1 Complete representation
Syntax B8601DNw.
Syntax Description
w
specifies the width of the output field. Default: 10 Requirement: The width of the output field must be 10.
Details The B8601DN format writes the date from a datetime value using the ISO 8601 basic date notation yyyymmdd:
yyyy is a four-digit year, such as 2008
mm is a two-digit month (zero padded) between 01 and 12
dd is a two-digit day of the month (zero padded) between 01 and 31
Examples
put bdn b8601dn.;
Value of bdn    Results
1537113180      20080915
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
B8601DTw.d Format
Writes datetime values in the ISO 8601 basic notation yyyymmddThhmmssffffff.
Category: Date and Time, ISO 8601
Alignment: left
Time Zone Format: No
ISO 8601 Element: 5.4.1 Complete representation
Syntax
B8601DTw.d
Syntax Description
w
specifies the width of the output field.
Default: 19
Range: 19 - 26
d
specifies the number of digits to the right of the seconds value that represent a fraction of a second. This argument is optional.
Default: 0
Range: 0 - 6
Details The B8601DT format writes ISO 8601 basic datetime notation yyyymmddThhmmssffffff: yyyy
is a four-digit year, such as 2008
mm
is a two-digit month (zero padded) between 01 and 12
dd
is a two-digit day of the month (zero padded) between 01 and 31
hh
is a two-digit hour (zero padded), between 00 - 23
mm
is a two-digit minute (zero padded), between 00 - 59
ss
is a two-digit second (zero padded), between 00 - 59
.ffffff
are optional fractional seconds, with a precision of up to six digits, where each digit is between 0 - 9.
Examples
put bdt b8601dt.;
Value of bdt    Results
                ----+----1----+
1537113180      20080915T155300
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
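The B8601DN and B8601DT examples both start from the same datetime value, so they can be sketched in one DATA step. The datetime literal and the variable name dt are assumptions for illustration; the last PUT statement uses the largest w and d that the ranges above allow and is shown without an expected result.

```sas
data _null_;
   /* Datetime literal for 15:53:00 on September 15, 2008 (1537113180) */
   dt = '15SEP2008:15:53:00'dt;
   /* Date part only, basic notation: 20080915 */
   put dt b8601dn.;
   /* Full datetime, basic notation: 20080915T155300 */
   put dt b8601dt.;
   /* Widest field with six fractional-second digits */
   put dt b8601dt26.6;
run;
```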
B8601DZw. Format
Writes datetime values in the Coordinated Universal Time (UTC) time scale using ISO 8601 datetime and time zone basic notation yyyymmddThhmmss+|-hhmm.
Category: Date and Time, ISO 8601
Alignment: left
Time Zone Format: Yes
ISO 8601 Element: 5.4.1 Complete representation
Syntax
B8601DZw.
Syntax Description
w
specifies the width of the output field.
Default: 26
Range: 20 - 35
Details
UTC values specify a time and a time zone based on the zero meridian in Greenwich, England. The B8601DZ format writes SAS datetime values for the zero meridian date and time using one of the following ISO 8601 basic datetime notations:
yyyymmddThhmmss+|–hhmm
is the form used when w is large enough to support this time zone notation.
yyyymmddThhmmssZ
is the form used when w is not large enough to support the +|-hhmm time zone notation.
where
yyyy
is a four-digit year, such as 2008
mm
is a two-digit month (zero padded) between 01 and 12
dd
is a two-digit day of the month (zero padded) between 01 and 31
hh
is a two-digit hour (zero padded), between 00 - 23
mm
is a two-digit minute (zero padded), between 00 - 59
ss
is a two-digit second (zero padded), between 00 - 59
Z
indicates that the time is for zero meridian (Greenwich, England) or UTC time
+|-hhmm
is an hour and minute signed offset from zero meridian time. Note that the offset must be +|-hhmm (that is, + or - and four characters). Use + for time zones east of the zero meridian and use - for time zones west of the zero meridian. For example, +0200 indicates a two-hour time difference to the east of the zero meridian, and -0600 indicates a six-hour time difference to the west of the zero meridian.
Restriction: The shorter form +|-hh is not supported.
Examples
SAS Statement          Value of bdz    Results
put bdz b8601dz20.;    1537113180      20080915T155300Z
put bdz b8601dz26.;    1537113180      20080915T155300+0000
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
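A minimal sketch of how the width selects between the two B8601DZ notations, using the value from the example table. The datetime literal is an assumption used to produce 1537113180.

```sas
data _null_;
   /* Datetime literal that equals 1537113180 */
   bdz = '15SEP2008:15:53:00'dt;
   /* Width 20 is too narrow for +|-hhmm, so Z is written: 20080915T155300Z */
   put bdz b8601dz20.;
   /* Width 26 leaves room for the signed offset: 20080915T155300+0000 */
   put bdz b8601dz26.;
run;
```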
B8601LZw. Format
Writes time values as local time by appending a time zone offset difference between the local time and UTC, using the ISO 8601 basic time notation hhmmss+|-hhmm.
Category: Date and Time, ISO 8601
Alignment: left
Time Zone Format: Yes. The format appends the UTC offset to the value as determined by the local SAS session.
ISO 8601 Element: 5.3.3, 5.3.4.2
Syntax
B8601LZw.
Syntax Description
w
specifies the width of the output field.
Default: 14
Range: 9 - 20
Details
The B8601LZ format writes time values without making any adjustments and appends the UTC time zone offset for the local SAS session, using the following ISO 8601 basic notation: hhmmss+|–hhmm
where
hh
is a two-digit hour (zero padded), between 00 - 23
mm
is a two-digit minute (zero padded), between 00 - 59
ss
is a two-digit second (zero padded), between 00 - 59
+|-hhmm
is an hour and minute signed offset from zero meridian time. Note that the offset must be +|-hhmm (that is, + or - and four characters). Use + for time zones east of the zero meridian and use - for time zones west of the zero meridian. For example, +0200 indicates a two-hour time difference to the east of the zero meridian, and -0600 indicates a six-hour time difference to the west of the zero meridian.
Restriction: The shorter form +|-hh is not supported.
When SAS reads a UTC time by using the B8601TZ informat, and the adjusted time is greater than 24 hours or less than 00 hours, SAS adjusts the value so that the time is between 000000 and 240000. If the B8601LZ format attempts to format a time outside of this time range, the time is formatted with stars to indicate that the value is out of range.
Examples
The following PUT statement writes the time for the Eastern Standard time zone:
put blz b8601lz.;
Value of blz    Results
46380           125300-0500
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
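A minimal sketch of the B8601LZ example. The time literal is an assumption used to produce 46380; the offset that SAS appends depends on the local session, so the result in the example table (125300-0500) applies only to a session running in the Eastern Standard time zone.

```sas
data _null_;
   /* Time literal for 12:53:00 (46380 seconds after midnight) */
   blz = '12:53:00't;
   /* The time is written unchanged, followed by the session's UTC offset */
   put blz b8601lz.;
run;
```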
B8601TMw.d Format
Writes time values using the ISO 8601 basic notation hhmmssffffff.
Category: Date and Time, ISO 8601
Alignment: left
Time Zone Format: No
ISO 8601 Element: 5.3.1.1 Complete representation
Syntax
B8601TMw.d
Syntax Description
w
specifies the width of the output field.
Default: 8
Range: 8 - 15
d
specifies the number of digits to the right of the seconds value that represent a fraction of a second. This argument is optional.
Default: 0
Range: 0 - 6
Details
The B8601TM format writes SAS time values using the following ISO 8601 basic time notation hhmmssffffff:
hh
is a two-digit hour (zero padded), between 00 - 23.
mm
is a two-digit minute (zero padded), between 00 - 59.
ss
is a two-digit second (zero padded), between 00 - 59.
ffffff
are optional fractional seconds, with a precision of up to six digits, where each digit is between 0 - 9.
Examples
put btm b8601tm.;
Value of btm    Results
57180           155300
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
B8601TZw. Format
Adjusts time values to the Coordinated Universal Time (UTC) and writes them using the ISO 8601 basic time notation hhmmss+|-hhmm.
Category: Date and Time, ISO 8601
Alignment: left
Time Zone Format: Yes
ISO 8601 Element: 5.3.3, 5.3.4
Syntax
B8601TZw.
Syntax Description
w
specifies the width of the output field.
Default: 14
Range: 9–20
Details
UTC time values specify a time and a time zone based on the zero meridian in Greenwich, England. The B8601TZ format adjusts the time value to be the time at the zero meridian and writes it in one of the following ISO 8601 basic time notations:
hhmmss+|–hhmm
is the form used when w is large enough to support this time notation.
hhmmssZ
is the form used when w is not large enough to support the +|-hhmm time zone notation.
where
hh
is a two-digit hour (zero padded), between 00 and 23.
mm
is a two-digit minute (zero padded), between 00 and 59.
ss
is a two-digit second (zero padded), between 00 and 59.
Z
indicates that the time is for zero meridian (Greenwich, England) or UTC time.
+|–hhmm
is an hour and minute signed offset from zero meridian time. Note that the offset must be +|–hhmm (that is, + or – and four characters). Use + for time zones east of the zero meridian and use – for time zones west of the zero meridian. For example, +0200 indicates a two-hour time difference to the east of the zero meridian, and –0600 indicates a six-hour time difference to the west of the zero meridian.
Restriction: The shorter form +|–hh is not supported.
When SAS reads a UTC time by using the B8601TZ informat, and the adjusted time is greater than 24 hours or less than 00 hours, SAS adjusts the value so that the time is between 000000 and 240000. If the B8601TZ format attempts to format a time outside of this time range, the time is formatted with stars to indicate that the value is out of range.
Comparisons
For time values between 000000 and 240000, the B8601TZ format adjusts the time value to be the time at the zero meridian and writes it in the international standard basic time notation. The B8601LZ format makes no adjustment to the time and writes time values in the international standard basic time notation, using a UTC time zone offset for the local SAS session.
Examples
put btz b8601tz.;
Value of btz    Results
73441           202401+0000
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
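The B8601TM and B8601TZ examples can be sketched together in one DATA step. The time literal is an assumption used to produce 57180; the value 73441 is taken directly from the B8601TZ example table.

```sas
data _null_;
   /* Time literal for 15:53:00 (57180 seconds after midnight) */
   btm = '15:53:00't;
   /* Basic time notation without a time zone: 155300 */
   put btm b8601tm.;
   /* 73441 seconds after midnight is 20:24:01 */
   btz = 73441;
   /* The example table shows 202401+0000 for this value */
   put btz b8601tz.;
run;
```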
COMMAw.d Format
Writes numeric values with a comma that separates every three digits and a period that separates the decimal fraction.
Category: Numeric
Alignment: right
Syntax
COMMAw.d
Syntax Description
w
specifies the width of the output field.
Default: 6
Range: 1–32
Tip: Make w wide enough to write the numeric values, the commas, and the optional decimal point.
d
specifies the number of digits to the right of the decimal point in the numeric value. This argument is optional.
Range: 0–31
Requirement: must be less than w
Details The COMMAw.d format writes numeric values with a comma that separates every three digits and a period that separates the decimal fraction.
Comparisons
• The COMMAw.d format is similar to the COMMAXw.d format, but the COMMAXw.d format reverses the roles of the decimal point and the comma. This convention is common in European countries.
• The COMMAw.d format is similar to the DOLLARw.d format except that the COMMAw.d format does not print a leading dollar sign.
Examples
put @10 sales comma10.2;
Value of sales    Results
                  ----+----1----+----2
23451.23          23,451.23
123451.234        123,451.23
See Also Formats: “COMMAXw.d Format” on page 147 “DOLLARw.d Format” on page 158
COMMAXw.d Format
Writes numeric values with a period that separates every three digits and a comma that separates the decimal fraction.
Category: Numeric
Alignment: right
Syntax
COMMAXw.d
Syntax Description
w
specifies the width of the output field. This argument is optional.
Default: 6
Range: 1–32
Tip: Make w wide enough to write the numeric values, the commas, and the optional decimal point.
d
specifies the number of digits to the right of the decimal point in the numeric value.
Range: 0–31
Requirement: must be less than w
Details The COMMAXw.d format writes numeric values with a period that separates every three digits and with a comma that separates the decimal fraction.
Comparisons The COMMAw.d format is similar to the COMMAXw.d format, but the COMMAXw.d format reverses the roles of the decimal point and the comma. This convention is common in European countries.
Examples
put @10 sales commax10.2;
Value of sales    Results
                  ----+----1----+----2
23451.23          23.451,23
123451.234        123.451,23
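As a minimal sketch, the COMMAw.d and COMMAXw.d examples can be compared side by side by writing each value with both formats. The variable name sales comes from the examples; the DO loop is an illustrative assumption.

```sas
data _null_;
   /* The two values used in the COMMAw.d and COMMAXw.d example tables */
   do sales = 23451.23, 123451.234;
      /* Comma as thousands separator, period as decimal separator */
      put @10 sales comma10.2;
      /* Period as thousands separator, comma as decimal separator */
      put @10 sales commax10.2;
   end;
run;
```

Per the example tables, the first value should appear as 23,451.23 and then 23.451,23, starting in column 10 of the log.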
Dw.p Format
Prints numeric values, possibly with a great range of values, lining up decimal places for values of similar magnitude.
Category: Numeric
Alignment: right
Syntax
Dw.p
Syntax Description
w
specifies the width of the output field. This argument is optional.
Default: 12
Range: 1–32
p
specifies the precision. This argument is optional.
Default: 3
Range: 0–9
Requirement: p must be less than w
Tip: If p is omitted or is specified as 0, then p is set to 3.
Tip: If zero is the desired precision, use the w.d format in place of the Dw.p format.
Details The Dw.p format writes numbers so that the decimal point aligns in groups of values with similar magnitude. Larger values of p print the data values with more precision and potentially more shifts in the decimal point alignment. Smaller values of p print the data values with less precision and a greater chance of decimal point alignment.
Comparisons
• The BESTw. format writes as many significant digits as possible in the output field, but if the numbers vary in magnitude, the decimal points do not line up.
• Dw.p writes numbers with the desired precision and more alignment than the BESTw. format.
• The BESTDw.p format is a combination of the BESTw. format and the Dw.p format in that it formats all numeric data, and it does a better job of aligning decimals than the BESTw. format.
• The w.d format aligns decimal points, if possible, but it does not necessarily show the same precision for all numbers.
Examples
put @1 x d10.4;
Value of x    Results
              ----+----1----+----2
12345         12345.0
1234.5        1234.5
123.45        123.45000
12.345        12.34500
1.2345        1.23450
.12345        0.12345
See Also Format: “BESTDw.p Format” on page 134
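To see the trade-offs described in the Comparisons, a minimal sketch can write each example value with Dw.p, BESTw., and w.d on the same log line. The column positions (@15 and @30) and the widths chosen for BESTw. and w.d are assumptions for illustration only.

```sas
data _null_;
   /* The six values from the Dw.p example table */
   do x = 12345, 1234.5, 123.45, 12.345, 1.2345, .12345;
      /* Column 1: Dw.p; column 15: BESTw.; column 30: w.d */
      put @1 x d10.4 @15 x best10. @30 x 10.4;
   end;
run;
```

The first column should match the Results column of the example table; the other two columns are there only to compare alignment and precision.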
DATEw. Format
Writes date values in the form ddmmmyy, ddmmmyyyy, or dd-mmm-yyyy.
Category: Date and Time
Alignment: right
Syntax
DATEw.
Syntax Description
w
specifies the width of the output field.
Default: 7
Range: 5–11
Tip: Use a width of 9 to print a 4-digit year without a separator between the day, month, and year. Use a width of 11 to print a 4-digit year using a hyphen as a separator between the day, month, and year.
Details
The DATEw. format writes SAS date values in the form ddmmmyy, ddmmmyyyy, or dd-mmm-yyyy, where
dd
is an integer that represents the day of the month.
mmm
is the first three letters of the month name.
yy or yyyy
is a two-digit or four-digit integer that represents the year.
Examples
The example table uses the input value of 17607, which is the SAS date value that corresponds to March 16, 2008.
SAS Statement       Results
                    ----+----1----+
put day date5.;     16MAR
put day date6.;     16MAR
put day date7.;     16MAR08
put day date8.;     16MAR08
put day date9.;     16MAR2008
put day date11.;    16-MAR-2008
See Also Function: “DATE Function” on page 613 Informat: “DATEw. Informat” on page 1279
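A minimal sketch of the DATEw. example table as a DATA step. The date literal is an assumption used to produce the input value 17607; the variable name day comes from the example.

```sas
data _null_;
   /* SAS date literal for March 16, 2008 (the date value 17607) */
   day = '16MAR2008'd;
   put day date5.;    /* day and month only: 16MAR */
   put day date7.;    /* two-digit year: 16MAR08 */
   put day date9.;    /* four-digit year: 16MAR2008 */
   put day date11.;   /* hyphen separators: 16-MAR-2008 */
run;
```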
DATEAMPMw.d Format
Writes datetime values in the form ddmmmyy:hh:mm:ss.ss with AM or PM.
Category: Date and Time
Alignment: right
Syntax
DATEAMPMw.d
Syntax Description
w
specifies the width of the output field.
Default: 19
Range: 7–40
Tip: SAS requires a minimum w value of 13 to write AM or PM. For widths between 10 and 12, SAS writes a 24-hour clock time.
d
specifies the number of digits to the right of the decimal point in the seconds value. This argument is optional.
Requirement: must be less than w
Range: 0–39
Note: If w–d < 17, SAS truncates the decimal values.
Details
The DATEAMPMw.d format writes SAS datetime values in the form ddmmmyy:hh:mm:ss.ss, where
dd
is an integer that represents the day of the month.
mmm
is the first three letters of the month name.
yy
is a two-digit integer that represents the year.
hh
is an integer that represents the hour.
mm
is an integer that represents the minutes.
ss.ss
is the number of seconds to two decimal places.
Comparisons The DATEAMPMw.d format is similar to the DATETIMEw.d format except that DATEAMPMw.d prints AM or PM at the end of the time.
Examples The example table uses the input value of 1347455694, which is the SAS datetime value that corresponds to 11:01:34 a.m. on April 20, 2003.
SAS Statement               Results
                            ----+----1----+----2----+
put event dateampm.;        20APR03:11:01:34 AM
put event dateampm7.;       20APR03
put event dateampm10.;      20APR:11
put event dateampm13.;      20APR03:11 AM
put event dateampm22.2;     20APR03:11:01:34.00 AM
See Also Format: “DATETIMEw.d Format” on page 152
DATETIMEw.d Format
Writes datetime values in the form ddmmmyy:hh:mm:ss.ss.
Category: Date and Time
Alignment: right
Syntax
DATETIMEw.d
Syntax Description
w
specifies the width of the output field.
Default: 16
Range: 7–40
Tip: SAS requires a minimum w value of 16 to write a SAS datetime value with the date, hour, and seconds. Add an additional two places to w and a value to d to return values with optional decimal fractions of seconds.
d
specifies the number of digits to the right of the decimal point in the seconds value. This argument is optional.
Requirement: must be less than w
Range: 0–39
Note: If w–d < 17, SAS truncates the decimal values.
Details
The DATETIMEw.d format writes SAS datetime values in the form ddmmmyy:hh:mm:ss.ss, where
dd
is an integer that represents the day of the month.
mmm
is the first three letters of the month name.
yy
is a two-digit integer that represents the year.
hh
is an integer that represents the hour in 24-hour clock time.
mm
is an integer that represents the minutes.
ss.ss
is the number of seconds to two decimal places.
Examples
The example table uses the input value of 1447213759, which is the SAS datetime value that corresponds to 3:49:19 a.m. on November 10, 2005.
SAS Statement               Results
                            ----+----1----+----2
put event datetime.;        10NOV05:03:49:19
put event datetime7.;       10NOV05
put event datetime12.;      10NOV05:03
put event datetime18.;      10NOV05:03:49:19
put event datetime18.1;     10NOV05:03:49:19.0
put event datetime19.;      10NOV2005:03:49:19
put event datetime20.1;     10NOV2005:03:49:19.0
put event datetime21.2;     10NOV2005:03:49:19.00
See Also
Formats: “DATEw. Format” on page 149 “TIMEw.d Format” on page 240
Function: “DATETIME Function” on page 615
Informats: “DATEw. Informat” on page 1279 “DATETIMEw. Informat” on page 1280 “TIMEw. Informat” on page 1346
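A minimal sketch that writes one datetime value with both DATETIMEw.d and DATEAMPMw.d, so that the 24-hour and 12-hour forms can be compared. The datetime literal is an assumption used to produce the DATETIMEw.d example value 1447213759; only the DATETIMEw.d results are taken from the example table above.

```sas
data _null_;
   /* Datetime literal for 03:49:19 on November 10, 2005 (1447213759) */
   event = '10NOV2005:03:49:19'dt;
   /* Default width of 16: 10NOV05:03:49:19 */
   put event datetime.;
   /* Four-digit year plus one decimal place: 10NOV2005:03:49:19.0 */
   put event datetime20.1;
   /* Same instant on a 12-hour clock, followed by AM or PM */
   put event dateampm.;
   /* Width 13 is the minimum that still shows AM or PM */
   put event dateampm13.;
run;
```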
DAYw. Format
Writes date values as the day of the month.
Category: Date and Time
Alignment: right
Syntax
DAYw.
Syntax Description
w
specifies the width of the output field.
Default: 2
Range: 2–32
Examples
The example table uses the input value of 16601, which is the SAS date value that corresponds to June 14, 2005.
SAS Statement       Results
                    ----+----1
put date day2.;     14
DDMMYYw. Format
Writes date values in the form ddmmyy or dd/mm/yy, where a forward slash is the separator and the year appears as either 2 or 4 digits.
Category: Date and Time
Alignment: right
Syntax
DDMMYYw.
Syntax Description
w
specifies the width of the output field.
Default: 8
Range: 2–10
Interaction: When w has a value of from 2 to 5, the date appears with as much of the day and the month as possible. When w is 7, the date appears as a two-digit year without slashes.
Details
The DDMMYYw. format writes SAS date values in the form ddmmyy or dd/mm/yy, where
dd
is an integer that represents the day of the month.
/
is the separator.
mm
is an integer that represents the month.
yy
is a two-digit or four-digit integer that represents the year.
Examples
The following examples use the input value of 16794, which is the SAS date value that corresponds to December 24, 2005.
SAS Statement          Results
                       ----+----1----+
put date ddmmyy5.;     24/12
put date ddmmyy6.;     241205
put date ddmmyy7.;     241205
put date ddmmyy8.;     24/12/05
put date ddmmyy10.;    24/12/2005
See Also Formats: “DATEw. Format” on page 149 “DDMMYYxw. Format” on page 156 “MMDDYYw. Format” on page 192 “YYMMDDw. Format” on page 265 Function: “MDY Function” on page 894 Informats: “DATEw. Informat” on page 1279 “DDMMYYw. Informat” on page 1282 “MMDDYYw. Informat” on page 1304 “YYMMDDw. Informat” on page 1361
DDMMYYxw. Format
Writes date values in the form ddmmyy or dd-mm-yy, where the x in the format name is a character that represents the special character that separates the day, month, and year, which can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
Category: Date and Time
Alignment: right
Syntax
DDMMYYxw.
Syntax Description
x
identifies a separator or specifies that no separator appear between the day, the month, and the year. Valid values for x are:
B separates with a blank
C separates with a colon
D separates with a dash
N indicates no separator
P separates with a period
S separates with a slash.
w
specifies the width of the output field.
Default: 8
Range: 2–10
Interaction: When w has a value of from 2 to 5, the date appears with as much of the day and the month as possible. When w is 7, the date appears as a two-digit year without separators.
Interaction: When x has a value of N, the width range changes to 2–8.
Details
The DDMMYYxw. format writes SAS date values in the form ddmmyy or ddxmmxyy, where
dd
is an integer that represents the day of the month.
x
is a specified separator.
mm
is an integer that represents the month.
yy
is a two-digit or four-digit integer that represents the year.
Examples
The following examples use the input value of 18031, which is the SAS date value that corresponds to May 14, 2009.
SAS Statement           Results
                        ----+----1----+
put date ddmmyyc5.;     14:05
put date ddmmyyd8.;     14-05-09
put date ddmmyyp10.;    14.05.2009
put date ddmmyyn8.;     14052009
See Also Formats: “DATEw. Format” on page 149 “DDMMYYw. Format” on page 154 “MMDDYYxw. Format” on page 194 “YYMMDDxw. Format” on page 266 Functions: “DAY Function” on page 616 “MDY Function” on page 894 “MONTH Function” on page 906 “YEAR Function” on page 1191 Informat: “DDMMYYw. Informat” on page 1282
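A minimal sketch that applies DDMMYYw. and several DDMMYYxw. separators to one date value. The date literal is an assumption used to produce the DDMMYYxw. example value 18031; the DDMMYYB10. line is an extra illustration of the B separator and is shown without a result from the example table.

```sas
data _null_;
   /* SAS date literal for May 14, 2009 (the date value 18031) */
   date = '14MAY2009'd;
   put date ddmmyy10.;    /* slash separator, four-digit year */
   put date ddmmyyc5.;    /* colon, day and month only: 14:05 */
   put date ddmmyyd8.;    /* dash, two-digit year: 14-05-09 */
   put date ddmmyyp10.;   /* period, four-digit year: 14.05.2009 */
   put date ddmmyyn8.;    /* no separator: 14052009 */
   put date ddmmyyb10.;   /* blank separator */
run;
```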
DOLLARw.d Format
Writes numeric values with a leading dollar sign, a comma that separates every three digits, and a period that separates the decimal fraction.
Category: Numeric
Alignment: right
Syntax
DOLLARw.d
Syntax Description
w
specifies the width of the output field.
Default: 6
Range: 2–32
d
specifies the number of digits to the right of the decimal point in the numeric value. This argument is optional.
Range: 0–31
Requirement: must be less than w
Details The DOLLARw.d format writes numeric values with a leading dollar sign, a comma that separates every three digits, and a period that separates the decimal fraction. The hexadecimal representation of the code for the dollar sign character ($) is 5B on EBCDIC systems and 24 on ASCII systems. The monetary character that these codes represent might be different in other countries, but DOLLARw.d always produces one of these codes. If you need another monetary character, define your own format with the FORMAT procedure. See “The FORMAT Procedure” in Base SAS Procedures Guide for more details.
Comparisons
• The DOLLARw.d format is similar to the DOLLARXw.d format, but the DOLLARXw.d format reverses the roles of the decimal point and the comma. This convention is common in European countries.
• The DOLLARw.d format is the same as the COMMAw.d format except that the COMMAw.d format does not write a leading dollar sign.
Examples
put @3 netpay dollar10.2;
Value of netpay    Results
                   ----+----1----+
1254.71            $1,254.71
See Also Formats: “COMMAw.d Format” on page 146 “DOLLARXw.d Format” on page 159
DOLLARXw.d Format
Writes numeric values with a leading dollar sign, a period that separates every three digits, and a comma that separates the decimal fraction.
Category: Numeric
Alignment: right
Syntax
DOLLARXw.d
Syntax Description
w
specifies the width of the output field.
Default: 6
Range: 2–32
d
specifies the number of digits to the right of the decimal point in the numeric value. This argument is optional.
Default: 0
Range: 0–31
Requirement: must be less than w
Details The DOLLARXw.d format writes numeric values with a leading dollar sign, with a period that separates every three digits, and with a comma that separates the decimal fraction. The hexadecimal representation of the code for the dollar sign character ($) is 5B on EBCDIC systems and 24 on ASCII systems. The monetary character that these codes represent might be different in other countries, but DOLLARXw.d always produces one of these codes. If you need another monetary character, define your own format with the FORMAT procedure. See “The FORMAT Procedure” in Base SAS Procedures Guide for more details.
Comparisons
• The DOLLARXw.d format is similar to the DOLLARw.d format, but the DOLLARXw.d format reverses the roles of the decimal point and the comma. This convention is common in European countries.
• The DOLLARXw.d format is the same as the COMMAXw.d format except that the COMMAXw.d format does not write a leading dollar sign.
Examples
put @3 netpay dollarx10.2;
Value of netpay    Results
                   ----+----1----+
1254.71            $1.254,71
See Also Formats: “COMMAXw.d Format” on page 147 “DOLLARw.d Format” on page 158
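A minimal sketch that writes the same value with DOLLARw.d, DOLLARXw.d, and COMMAw.d, to show the relationship described in the Comparisons. The variable name netpay and the column pointer @3 come from the examples; adding the COMMAw.d line is an illustrative assumption.

```sas
data _null_;
   netpay = 1254.71;
   /* Leading dollar sign, comma thousands separator: $1,254.71 */
   put @3 netpay dollar10.2;
   /* Leading dollar sign, separators reversed: $1.254,71 */
   put @3 netpay dollarx10.2;
   /* Same layout as DOLLARw.d, without the dollar sign */
   put @3 netpay comma10.2;
run;
```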
DOWNAMEw. Format
Writes date values as the name of the day of the week.
Category: Date and Time
Alignment: right
Syntax
DOWNAMEw.
Syntax Description
w
specifies the width of the output field.
Default: 9
Range: 1–32
Tip: If you omit w, SAS prints the entire name of the day.
Details
If necessary, SAS truncates the name of the day to fit the format width. For example, the DOWNAME2. format prints the first two letters of the day name.
Examples
The example table uses the input value of 13589, which is the SAS date value that corresponds to March 16, 1997.
SAS Statement          Results
                       ----+----1
put date downame.;     Sunday
See Also Format: “WEEKDAYw. Format” on page 250
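A minimal sketch of the DOWNAMEw. example, with one extra statement that illustrates the truncation described in the Details. The date literal is an assumption used to produce 13589, and the width of 3 in the second PUT statement is chosen only for illustration.

```sas
data _null_;
   /* SAS date literal for March 16, 1997 (the date value 13589) */
   date = '16MAR1997'd;
   /* Default width: the full day name, Sunday */
   put date downame.;
   /* A narrower width truncates the name to fit the field */
   put date downame3.;
run;
```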
DTDATEw. Format
Expects a datetime value as input and writes date values in the form ddmmmyy or ddmmmyyyy.
Category: Date and Time
Alignment: right
Syntax
DTDATEw.
Syntax Description
w
specifies the width of the output field.
Default: 7
Range: 5–9
Tip: Use a width of 9 to print a 4-digit year.
Details
The DTDATEw. format writes SAS date values in the form ddmmmyy or ddmmmyyyy, where
dd
is an integer that represents the day of the month.
mmm
is the first three letters of the month name.
yy or yyyy
is a two-digit or four-digit integer that represents the year.
Comparisons The DTDATEw. format produces the same type of output that the DATEw. format produces. The difference is that the DTDATEw. format requires a datetime value.
Examples The example table uses a datetime value of 16APR2000:10:00:00 as input, and prints both a two-digit and a four-digit year for the DTDATEw. format.
SAS Statement               Results
                            ----+----+
put trip_date=dtdate.;      16APR00
put trip_date=dtdate9.;     16APR2000
See Also Formats: “DATEw. Format” on page 149
DTMONYYw. Format
Writes the date part of a datetime value as the month and year in the form mmmyy or mmmyyyy.
Category: Date and Time
Alignment: right
Syntax
DTMONYYw.
Syntax Description
w
specifies the width of the output field.
Default: 5
Range: 5–7
Details
The DTMONYYw. format writes SAS datetime values in the form mmmyy or mmmyyyy, where
mmm
is the first three letters of the month name.
yy or yyyy
is a two-digit or four-digit integer that represents the year.
Comparisons The DTMONYYw. format and the MONYYw. format are similar in that they both write date values. The difference is that DTMONYYw. expects a datetime value as input, and MONYYw. expects a SAS date value.
Examples
The example table uses as input the value 1476598132, which is the SAS datetime value that corresponds to October 16, 2006, at 06:08:52 a.m.
SAS Statement           Results
                        ----+----1
put date dtmonyy.;      OCT06
put date dtmonyy5.;     OCT06
put date dtmonyy6.;     OCT06
put date dtmonyy7.;     OCT2006
See Also Formats: “DATETIMEw.d Format” on page 152 “MONYYw. Format” on page 202
DTWKDATXw. Format
Writes the date part of a datetime value as the day of the week and the date in the form day-of-week, dd month-name yy (or yyyy).
Category: Date and Time
Alignment: right
Syntax
DTWKDATXw.
Syntax Description
w
specifies the width of the output field.
Default: 29
Range: 3–37
Details
The DTWKDATXw. format writes SAS date values in the form day-of-week, dd month-name yy (or yyyy), where
day-of-week
is either the first three letters of the day name or the entire day name.
dd
is an integer that represents the day of the month.
month-name
is either the first three letters of the month name or the entire month name.
yy or yyyy
is a two-digit or four-digit integer that represents the year.
Comparisons The DTWKDATXw. format is similar to the WEEKDATXw. format in that they both write date values. The difference is that DTWKDATXw. expects a datetime value as input, and WEEKDATXw. expects a SAS date value.
Examples
The example table uses as input the value 1476598132, which is the SAS datetime value that corresponds to October 16, 2006, at 06:08:52 a.m.
SAS Statement              Results
                           ----+----1----+----2----+----3
put date dtwkdatx.;        Monday, 16 October 2006
put date dtwkdatx3.;       Mon
put date dtwkdatx8.;       Mon
put date dtwkdatx25.;      Monday, 16 Oct 2006
See Also Formats: “DATETIMEw.d Format” on page 152 “WEEKDATXw. Format” on page 249
DTYEARw. Format
Writes the date part of a datetime value as the year in the form yy or yyyy.
Category: Date and Time
Alignment: right
Syntax
DTYEARw.
Syntax Description
w
specifies the width of the output field.
Default: 4
Range: 2–4
Comparisons The DTYEARw. format is similar to the YEARw. format in that they both write date values. The difference is that DTYEARw. expects a datetime value as input, and YEARw. expects a SAS date value.
Examples
The example table uses as input the value 1476598132, which is the SAS datetime value that corresponds to October 16, 2006, at 06:08:52 a.m.
SAS Statement          Results
                       ----+----1
put date dtyear.;      2006
put date dtyear2.;     06
put date dtyear3.;     06
put date dtyear4.;     2006
See Also Formats: “DATETIMEw.d Format” on page 152 “YEARw. Format” on page 261
DTYYQCw. Format
Writes the date part of a datetime value as the year and the quarter and separates them with a colon (:).
Category: Date and Time
Alignment: right
Syntax
DTYYQCw.
Syntax Description
w
specifies the width of the output field.
Default: 4
Range: 4–6
Details
The DTYYQCw. format writes SAS datetime values in the form yy or yyyy, followed by a colon (:) and the numeric value for the quarter of the year.
Examples
The example table uses as input the value 1476598132, which is the SAS datetime value that corresponds to October 16, 2006, at 06:08:52 a.m.
SAS Statement          Results
                       ----+----1
put date dtyyqc.;      06:4
put date dtyyqc4.;     06:4
put date dtyyqc5.;     06:4
put date dtyyqc6.;     2006:4
See Also Formats: “DATETIMEw.d Format” on page 152
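The DT-prefixed formats above all expect a datetime value, while their counterparts (DATEw., MONYYw., YEARw.) expect a SAS date value. A minimal sketch of the difference follows; the datetime literal is an assumption used to produce 1476598132, and the variable names dt and d are hypothetical. The DATEPART function is used to obtain the date value for the second group of formats.

```sas
data _null_;
   /* Datetime literal for 06:08:52 on October 16, 2006 (1476598132) */
   dt = '16OCT2006:06:08:52'dt;
   /* Datetime formats read the date part directly from dt */
   put dt dtdate9.;
   put dt dtmonyy7.;     /* OCT2006 */
   put dt dtwkdatx.;     /* Monday, 16 October 2006 */
   put dt dtyear4.;      /* 2006 */
   put dt dtyyqc6.;      /* 2006:4 */
   /* Date formats need a SAS date value, so extract it first */
   d = datepart(dt);
   put d date9. +1 d monyy7. +1 d year4.;
run;
```

Applying a date format such as YEAR4. directly to dt would interpret the number of seconds as a number of days, which is why the two families are kept separate.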
Ew. Format
Writes numeric values in scientific notation.
Category: Numeric
Alignment: right
See: Ew. Format in the documentation for your operating environment.
Syntax
Ew.
Syntax Description
w
specifies the width of the output field. The output field can display up to 14 significant digits.
Default: 12
Range: 7–32
Details
When formatting values in scientific notation, the E format reserves the first column of the result for a minus sign and formats up to 14 significant digits.
Examples
put @1 x e10.;
Value of x    Results
              ----+----1----+
1257          1.257E+03
-1257         -1.257E+03
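A minimal sketch of the Ew. example as a DATA step. The variable name x comes from the example; the DO loop is an illustrative assumption.

```sas
data _null_;
   /* A positive and a negative value from the example table */
   do x = 1257, -1257;
      /* The first column of the field is reserved for the minus sign */
      put @1 x e10.;
   end;
run;
```

Because the first column is reserved for the sign, the digits of both results should line up in the log, with a blank in place of the minus sign for the positive value.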
E8601DAw. Format
Writes date values using the ISO 8601 extended notation yyyy-mm-dd.
Category: Date and Time, ISO 8601
Alignment: left
Alias: IS8601DA
Time Zone Format: No
ISO 8601 Element: 5.2.1.1 Complete representation
Syntax
E8601DAw.
Syntax Description
w
specifies the width of the output field.
Default: 10
Requirement: The width of the output field must be 10.
Details The E8601DA format writes a date in the ISO 8601 extended notation yyyy-mm-dd: yyyy
is a four-digit year, such as 2008.
mm
is a two-digit month (zero padded) between 01 and 12.
dd
is a two-digit day of the month (zero padded) between 01 and 31.
Examples
put eda e8601da.;
Value of eda    Results
17790           2008-09-15
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
E8601DNw. Format
Writes the date from a SAS datetime value using the ISO 8601 extended notation yyyy-mm-dd.
Category: Date and Time
          ISO 8601
Alignment: left
Alias: IS8601DN
Time Zone Format: No
ISO 8601 Element: 5.2.1.1 Complete representation

Syntax
E8601DNw.
Syntax Description
w
specifies the width of the output field. Default: 10 Requirement:
The width of the output field must be 10.
Details The E8601DN format writes the date portion of a SAS datetime value in the ISO 8601 extended date notation yyyy-mm-dd: yyyy
is a four-digit year, such as 2008.
mm
is a two-digit month (zero padded) between 01 and 12.
dd
is a two-digit day of the month (zero padded) between 01 and 31.
Examples
put edn e8601dn.;

Value for edn     Results
1537113180        2008-09-15
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
E8601DTw.d Format
Writes datetime values in the ISO 8601 extended notation yyyy-mm-ddThh:mm:ss.ffffff.
Category: Date and Time
          ISO 8601
Alignment: left
Alias: IS8601DT
Time Zone Format: No
ISO 8601 Element: 5.4.1 Complete representation

Syntax
E8601DTw.d
Syntax Description
w
specifies the width of the output field. Default: 19 Range: 19 - 26
d
specifies the number of digits to the right of the decimal point in the seconds value. This argument is optional. Default: 0 Range: 0 - 6
Details The E8601DT format writes datetime values using the ISO 8601 extended datetime notation yyyy-mm-ddThh:mm:ss.ffffff: yyyy
is a four-digit year, such as 2008.
mm
is a two-digit month (zero padded) between 01 and 12.
dd
is a two-digit day of the month (zero padded) between 01 and 31.
hh
is a two-digit hour (zero padded), between 00 - 23.
mm
is a two-digit minute (zero padded), between 00 - 59.
ss
is a two-digit second (zero padded), between 00 - 59.
.ffffff
are optional fractional seconds, with a precision of up to six digits, where each digit is between 0 - 9.
Examples
put edt e8601dt.;

Value of edt     Results
1537113180       2008-09-15T15:53:00
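As a cross-check of this example, the following Python sketch (not SAS code) rebuilds the notation from the raw value. It assumes that a SAS datetime value counts seconds from midnight on January 1, 1960; the function name and the handling of d are illustrative only.

```python
from datetime import datetime, timedelta

# Assumption: SAS datetime value 0 corresponds to 1960-01-01T00:00:00.
SAS_EPOCH = datetime(1960, 1, 1)

def sas_datetime_to_iso(seconds: float, d: int = 0) -> str:
    """Format a SAS datetime value as yyyy-mm-ddThh:mm:ss with d fractional digits."""
    stamp = SAS_EPOCH + timedelta(seconds=round(seconds, d))
    text = stamp.strftime("%Y-%m-%dT%H:%M:%S")
    if d > 0:
        # %f always yields six digits; keep only the first d of them.
        text += "." + stamp.strftime("%f")[:d]
    return text

print(sas_datetime_to_iso(1537113180))        # 2008-09-15T15:53:00
print(sas_datetime_to_iso(1537113180.25, 2))  # 2008-09-15T15:53:00.25
print(sas_datetime_to_iso(1537113180)[:10])   # 2008-09-15 (date part only)
```

The last line shows the date portion alone, which is the part that the E8601DN format writes for the same value.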
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
E8601DZw. Format
Writes datetime values in the Coordinated Universal Time (UTC) time scale using ISO 8601 datetime and time zone extended notations yyyy-mm-ddThh:mm:ss+|-hh:mm.
Category: Date and Time
          ISO 8601
Alignment: left
Alias: IS8601DZ
Time Zone Format: Yes
ISO 8601 Element: 5.4.1 Complete representation

Syntax
E8601DZw.
Syntax Description w
specifies the width of the output field. Default: 26 Range: 20 - 35
Details UTC values specify a time and a time zone based on the zero meridian in Greenwich, England. The E8601DZ format writes SAS datetime values using one of the following ISO 8601 extended datetime notations:
yyyy-mm-ddThh:mm:ss+|-hh:mm
is the form used when w is large enough to support this time zone notation.
yyyy-mm-ddThh:mm:ssZ
is the form used when w is not large enough to support the +|-hh:mm time zone notation.
where
yyyy
is a four-digit year, such as 2008
mm
is a two-digit month (zero padded) between 01 and 12
dd
is a two-digit day of the month (zero padded) between 01 and 31
hh
is a two-digit hour (zero padded), between 00 - 24
mm
is a two-digit minute (zero padded), between 00 - 59
ss
is a two-digit second (zero padded), between 00 - 59
Z
indicates that the time is for zero meridian (Greenwich, England) or UTC time.
+|-hh:mm
is an hour and minute signed offset from zero meridian time. Note that the offset must be +|-hh:mm (that is, + or - and five characters). Use + for time zones east of the zero meridian and use - for time zones west of the zero meridian. For example, +02:00 indicates a two-hour time difference to the east of the zero meridian, and -06:00 indicates a six-hour time difference to the west of the zero meridian. Restriction: The shorter form +|-hh is not supported.
Examples
put edz e8601dz.;

Value of edz     Results
1537113180       2008-09-15T15:53:00+00:00
1537102380       2008-09-15T12:53:00+00:00
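The two notations described for this format can be illustrated with a short Python sketch (not SAS code). It assumes the SAS datetime epoch of January 1, 1960 and treats the stored value as already being on the UTC time scale; the use_z switch is a hypothetical stand-in for a width that is too small for the +|-hh:mm form.

```python
from datetime import datetime, timedelta, timezone

# Assumption: SAS datetime value 0 is 1960-01-01T00:00:00, taken here as UTC.
SAS_EPOCH_UTC = datetime(1960, 1, 1, tzinfo=timezone.utc)

def sas_datetime_to_utc_iso(seconds: int, use_z: bool = False) -> str:
    """Write a datetime with either a +00:00 offset or the Z indicator."""
    stamp = SAS_EPOCH_UTC + timedelta(seconds=seconds)
    text = stamp.isoformat()  # ends with "+00:00" for a UTC-aware value
    return text[:-6] + "Z" if use_z else text

print(sas_datetime_to_utc_iso(1537113180))              # 2008-09-15T15:53:00+00:00
print(sas_datetime_to_utc_iso(1537102380))              # 2008-09-15T12:53:00+00:00
print(sas_datetime_to_utc_iso(1537113180, use_z=True))  # 2008-09-15T15:53:00Z
```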
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
E8601LZw. Format
Writes time values as local time, appending the Coordinated Universal Time (UTC) offset for the local SAS session, using the ISO 8601 extended time notation hh:mm:ss+|-hh:mm.
Category: Date and Time
          ISO 8601
Alignment: left
Alias: IS8601LZ
Time Zone Format: Yes. The format appends the UTC offset to the value as determined by the local SAS session.
ISO 8601 Element: 5.3.1.1 Complete representation
Syntax E8601LZw.
Syntax Description
w
specifies the width of the output field. Default: 14 Range:
9 - 20
Details The E8601LZ format writes time values without making any adjustments and appends the UTC time zone offset for the local SAS session, using one of the following ISO 8601 extended time notations:
hh:mm:ss+|-hh:mm
is the form used when w is large enough to support this time notation.
hh:mm:ssZ
is the form used when w is not large enough to support the +|-hh:mm time zone notation.
where hh
is a two-digit hour (zero padded), between 00 - 23.
mm
is a two-digit minute (zero padded), between 00 - 59.
ss
is a two-digit second (zero padded), between 00 - 59.
Z
indicates zero meridian (Greenwich, England) or UTC time.
+|-hh:mm
is an hour and minute signed offset from zero meridian time. Note that the offset must be +|-hh:mm (that is, + or - and five characters). Use + for time zones east of the zero meridian and use - for time zones west of the zero meridian. For example, +02:00 indicates a two-hour time difference to the east of the zero meridian, and -06:00 indicates a six-hour time difference to the west of the zero meridian. Restriction: The shorter form +|-hh is not supported.
SAS writes the time value using the form hh:mm:ss and appends either the time zone indicator +|-hh:mm, based on the time zone offset from the zero meridian for the local SAS session, or Z. The Z time zone indicator is used for format lengths that are less than 14.
If the same time is written using both zone indicators, the two results indicate different times based on UTC. For example, if the local SAS session uses Eastern Daylight Time in the US (four hours behind UTC), and the time value is 45824, SAS would write 12:43:44-04:00 or 12:43:44Z. The time 12:43:44-04:00 is the time 16:43:44+00:00 at the zero meridian. The Z indicates that the time is the time at the zero meridian, or 12:43:44+00:00.
When SAS reads a UTC time by using the E8601TZ informat, and the adjusted time is greater than 24 hours or less than 00 hours, SAS adjusts the value so that the time is between 00:00:00 and 24:00:00. If the E8601TZ format attempts to format a time outside of this time range, the time is formatted with stars to indicate that the value is out of range.
Examples
The following PUT statement writes the time for the Eastern Standard time zone.
put elz e8601lz.;

Value of elz     Results
46380            12:53:00-05:00
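The behavior described above (time written unchanged, offset appended, Z used for widths below 14) can be sketched in Python; this is not SAS code. The function name and the offset_minutes parameter are hypothetical, and the session offset is passed in explicitly rather than read from the environment.

```python
def time_with_offset(seconds: int, offset_minutes: int, w: int = 14) -> str:
    """Write a time of day unchanged and append a signed hh:mm UTC offset.

    For w < 14 the Z indicator is used instead, following the rule in Details.
    """
    hh, rest = divmod(seconds, 3600)
    mm, ss = divmod(rest, 60)
    clock = f"{hh:02d}:{mm:02d}:{ss:02d}"
    if w < 14:
        return clock + "Z"
    sign = "+" if offset_minutes >= 0 else "-"
    off_h, off_m = divmod(abs(offset_minutes), 60)
    return f"{clock}{sign}{off_h:02d}:{off_m:02d}"

print(time_with_offset(46380, -5 * 60))     # 12:53:00-05:00
print(time_with_offset(46380, -5 * 60, 9))  # 12:53:00Z
```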
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
E8601TMw.d Format
Writes time values using the ISO 8601 extended notation hh:mm:ss.ffffff.
Category: Date and Time
          ISO 8601
Alignment: left
Alias: IS8601TM
Time Zone Format: No
ISO 8601 Element: 5.3.1.1 Complete representation and 5.3.1.3 Representation of decimal fractions

Syntax
E8601TMw.d
Syntax Description
w
specifies the width of the output field. Default: 8 Range: 8 - 15
d
specifies the number of digits to the right of the decimal point in the seconds value. This argument is optional. Default: 0 Range: 0 - 6
Details The E8601TM format writes SAS time values using the following ISO 8601 extended time notation: hh:mm:ss.ffffff
hh
is a two-digit hour (zero padded), between 00 - 23.
mm
is a two-digit minute (zero padded), between 00 - 59.
ss
is a two-digit second (zero padded), between 00 - 59.
.ffffff
are optional fractional seconds, with a precision of up to six digits, where each digit is between 0 - 9.
Examples
put etm e8601tm.;

Value of etm     Results
57180            15:53:00
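A SAS time value is a count of seconds since midnight, so the notation follows from two divisions. The Python sketch below (not SAS code) shows that arithmetic; the function name is hypothetical and the treatment of d is an approximation of the optional fractional seconds.

```python
def iso_time(seconds: float, d: int = 0) -> str:
    """Write seconds since midnight as hh:mm:ss with d fractional digits."""
    seconds = round(seconds, d)          # round first so ss never shows 60
    whole = int(seconds)
    hh, rest = divmod(whole, 3600)
    mm, ss = divmod(rest, 60)
    if d == 0:
        return f"{hh:02d}:{mm:02d}:{ss:02d}"
    frac = seconds - whole
    return f"{hh:02d}:{mm:02d}:{ss + frac:0{3 + d}.{d}f}"

print(iso_time(57180))         # 15:53:00
print(iso_time(57180.25, 2))   # 15:53:00.25
```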
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
E8601TZw.d Format
Adjusts time values to the Coordinated Universal Time (UTC) and writes them using the ISO 8601 extended notation hh:mm:ss+|-hh:mm.
Category: Date and Time
          ISO 8601
Alignment: left
Alias: IS8601TZ
Time Zone Format: Yes
ISO 8601 Element: 5.3.1.1 Complete representation

Syntax
E8601TZw.d
Syntax Description
w
specifies the width of the output field. Default: 14 Range: 9 - 20
d
specifies the number of digits to the right of the decimal point in the seconds value. This argument is optional. Default: 0 Range: 0 - 6
Details UTC time values specify a time and a time zone based on the zero meridian in Greenwich, England. The E8601TZ format writes time values in one of the following ISO 8601 extended time notations:
hh:mm:ss+|-hh:mm
is the form used when w is large enough to support this time zone notation.
hh:mm:ssZ
is the form used when w is not large enough to support the +|-hh:mm time zone notation.
where hh
is a two-digit hour (zero padded), between 00 - 23
mm
is a two-digit minute (zero padded), between 00 - 59
ss
is a two-digit second (zero padded), between 00 - 59
Z
indicates zero meridian (Greenwich, England) or UTC time
+|-hh:mm
is an hour and minute signed offset from zero meridian time. Note that the offset must be +|-hh:mm (that is, + or - and five characters). The shorter form +|-hh is not supported. Use + for time zones east of the zero meridian and use - for time zones west of the zero meridian. For example, +02:00 indicates a two-hour time difference to the east of the zero meridian, and -06:00 indicates a six-hour time difference to the west of the zero meridian.
When SAS reads a UTC time by using the B8601TZ informat, and the adjusted time is greater than 24 hours or less than 00 hours, SAS adjusts the value so that the time is between 00:00:00 and 24:00:00. If the E8601TZ format attempts to format a time outside of this time range, the time is formatted with stars to indicate that the value is out of range.
Comparisons For time values between 00:00:00 and 24:00:00, the E8601TZ format adjusts the time value to be the time at the zero meridian and writes it in the international standard extended time notation. The E8601LZ format makes no adjustment to the time and writes time values in the international standard extended time notation, using a UTC time zone offset for the local SAS session.
Examples
put etz e8601tz.;

Value of etz     Results
73441            20:24:01+00:00
62641            17:24:01+00:00
See Also “Working with Dates and Times Using the ISO 8601 Basic and Extended Notations” on page 94
FLOATw.d Format
Generates a native single-precision, floating-point value by multiplying a number by 10 raised to the dth power.
Category: Numeric
Alignment: left

Syntax
FLOATw.d
Syntax Description w
specifies the width of the output field. Requirement: width must be 4
d
specifies the power of 10 by which to multiply the value. This argument is optional. Default: 0 Range: 0-31
Details This format is useful in operating environments where a float value is not the same as a truncated double. Values that are written by FLOAT4. typically are values that are meant to be read by some other external program that runs in your operating environment and that expects these single-precision values. Note: If the value that is to be formatted is a missing value, or if it is out-of-range for a native single-precision, floating-point value, a single-precision value of zero is generated.
On IBM mainframe systems, a four-byte floating-point number is the same as a truncated eight-byte floating-point number. However, in operating environments using the IEEE floating-point standard, such as IBM PC-based operating environments and most UNIX operating environments, a four-byte floating-point number is not the same as a truncated double. Hence, the RB4. format does not produce the same results as the FLOAT4. format. Floating-point representations other than IEEE may have this same characteristic.
Comparisons
The following table compares the names of float notation in several programming languages:

Language          Float Notation
SAS               FLOAT4
FORTRAN           REAL*4
C                 float
IBM 370 ASM       E
PL/I              FLOAT BIN(21)
Examples
put x float4.;

Value of x     Results*
1              3F800000

* The result is a hexadecimal representation of a binary number that is stored in IEEE form.
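The difference between a single-precision value and a truncated double in IEEE environments can be seen with Python's struct module. This is a sketch, not SAS code; the function names are hypothetical and big-endian byte order is chosen only so that the hexadecimal digits read in the same order as the table above.

```python
import struct

def float4_hex(x: float) -> str:
    """Hex digits of x stored as a 4-byte IEEE single (big-endian byte order)."""
    return struct.pack(">f", x).hex().upper()

def truncated_double_hex(x: float) -> str:
    """Hex digits of the first 4 bytes of x stored as an 8-byte IEEE double."""
    return struct.pack(">d", x)[:4].hex().upper()

print(float4_hex(1.0))            # 3F800000
print(truncated_double_hex(1.0))  # 3FF00000
```

The two results differ, which is the reason the text gives for RB4. and FLOAT4. not producing the same output in IEEE environments.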
FRACTw. Format
Converts numeric values to fractions.
Category: Numeric
Alignment: right

Syntax
FRACTw.
Syntax Description w
specifies the width of the output field. Default: 10 Range:
4–32
Details Dividing the number 1 by 3 produces the value 0.33333333. To write this value as 1/3, use the FRACTw. format. FRACTw. writes fractions in reduced form, that is, 1/2 instead of 50/100.
Examples
put x fract8.;

Value of x       Results
                 ----+----1
0.6666666667     2/3
0.2784           174/625
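The reduction to lowest terms can be reproduced with Python's fractions module. This is an illustrative sketch, not SAS code and not a description of how FRACTw. is implemented; the max_denominator limit is a hypothetical parameter chosen for the sketch.

```python
from fractions import Fraction

def as_reduced_fraction(x: float, max_denominator: int = 1000) -> str:
    """Return the closest fraction to x in lowest terms, written as n/d."""
    # max_denominator is an assumption of this sketch, not a SAS setting.
    frac = Fraction(x).limit_denominator(max_denominator)
    return f"{frac.numerator}/{frac.denominator}"

print(as_reduced_fraction(0.6666666667))  # 2/3
print(as_reduced_fraction(0.2784))        # 174/625
print(as_reduced_fraction(50 / 100))      # 1/2
```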
HEXw. Format
Converts real binary (floating-point) values to hexadecimal representation.
Category: Numeric
Alignment: left
See: HEXw. Format in the documentation for your operating environment.

Syntax
HEXw.
Syntax Description w
specifies the width of the output field. Default: 8 Range: 1–16
Tip: If w < 16, the HEXw. format converts real binary numbers to fixed-point integers before writing them as hexadecimal characters. It also writes negative numbers in two’s complement notation, and right aligns digits. If w is 16, HEXw. displays floating-point values in their hexadecimal form.
Details In any operating environment, the least significant byte written by HEXw. is the rightmost byte. Some operating environments store integers with the least significant digit as the first byte. The HEXw. format produces consistent results in any operating environment regardless of the order of significance by byte. Note: Different operating environments store floating-point values in different ways. However, the HEX16. format writes hexadecimal representations of floating-point values with consistent results in the same way that your operating environment stores them.
Comparisons The HEXw. numeric format and the $HEXw. character format both generate the hexadecimal equivalent of values.
Examples
put @8 x hex8.;

Value of x     Results
               ----+----1----+----2
35.4                  00000023
88                    00000058
2.33                  00000002
-150                  FFFFFF6A
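The results in this table follow from converting each value to an integer and writing it in two's complement. The Python sketch below (not SAS code) reproduces them; it assumes the fractional part is simply dropped, which matches the 35.4 and 2.33 rows, and the function name is hypothetical.

```python
def hex_fixed(x: float, w: int = 8) -> str:
    """Hex digits of x as a fixed-point integer, two's complement for negatives."""
    # Assumption: the fractional part is dropped, as in the 35.4 and 2.33 examples.
    n = int(x)
    mask = (1 << (4 * w)) - 1  # keep only the low w hexadecimal digits
    return format(n & mask, f"0{w}X")

for value in (35.4, 88, 2.33, -150):
    print(value, hex_fixed(value))
```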
HHMMw.d Format
Writes time values as hours and minutes in the form hh:mm.
Category: Date and Time
Alignment: right
Syntax HHMMw.d
Syntax Description w
specifies the width of the output field. Default: 5 Range:
2–20
d
specifies the number of digits to the right of the decimal point in the minutes value. The digits to the right of the decimal point specify a fraction of a minute. This argument is optional. Default: 0 Range:
0–19
Requirement:
must be less than w
Details The HHMMw.d format writes SAS time values in the form hh:mm, where
hh
is an integer.
Note: If hh is a single digit, HHMMw.d places a leading blank before the digit. For example, the HHMMw.d format writes 9:00 instead of 09:00.
mm
is the number of minutes that range from 00 through 59.
SAS rounds hours and minutes based on the value of seconds in a SAS time value.
Comparisons The HHMMw.d format is similar to the TIMEw.d format except that the HHMMw.d format does not print seconds. The HHMMw.d format writes a leading blank for a single-hour digit. The TODw.d format writes a leading zero for a single-hour digit.
Examples
The example table uses the input value of 46796, which is the SAS time value that corresponds to 12:59:56 p.m.

SAS Statement          Results
                       ----+----1
put time hhmm.;        13:00
put time hhmm8.2;      12:59.93
In the first example, SAS rounds up the time value four seconds based on the value of seconds in the SAS time value. In the second example, adding a decimal specification of 2 to the format shows that fifty-six seconds is 93% of a minute.
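The rounding in these two examples can be traced with a few lines of arithmetic. The Python sketch below is not SAS code and ignores field width, alignment, and the leading blank; the function name is hypothetical, and Python's round may differ from SAS in exact tie cases.

```python
def hhmm(seconds: float, d: int = 0) -> str:
    """Write seconds since midnight as hh:mm, with d decimal digits on the minutes."""
    total_minutes = round(seconds / 60, d)  # seconds are rounded into the minutes
    hours = int(total_minutes // 60)
    minutes = total_minutes - 60 * hours
    width = 2 if d == 0 else 3 + d          # "00" or, for d=2, "59.93"
    return f"{hours}:{minutes:0{width}.{d}f}"

print(hhmm(46796))     # 13:00
print(hhmm(46796, 2))  # 12:59.93
```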
See Also Formats: “HOURw.d Format” on page 183 “MMSSw.d Format” on page 196 “TIMEw.d Format” on page 240 “TODw.d Format” on page 243 Functions: “HMS Function” on page 777 “HOUR Function” on page 780 “MINUTE Function” on page 898 “SECOND Function” on page 1083 “TIME Function” on page 1121 Informat: “TIMEw. Informat” on page 1346
HOURw.d Format
Writes time values as hours and decimal fractions of hours.
Category: Date and Time
Alignment: right

Syntax
HOURw.d
Syntax Description w
specifies the width of the output field. Default: 2 Range: 2–20
d
specifies the number of digits to the right of the decimal point in the hour value. Therefore, SAS prints decimal fractions of the hour. This argument is optional. Requirement: must be less than w Range: 0-19
Details SAS rounds hours based on the value of minutes in the SAS time value.
Examples
The example table uses the input value of 41400, which is the SAS time value that corresponds to 11:30 a.m.

SAS Statement          Results
                       ----+----1
put time hour4.1;      11.5
See Also Formats: “HHMMw.d Format” on page 181 “MMSSw.d Format” on page 196 “TIMEw.d Format” on page 240 “TODw.d Format” on page 243 Functions: “HMS Function” on page 777 “HOUR Function” on page 780 “MINUTE Function” on page 898 “SECOND Function” on page 1083 “TIME Function” on page 1121 Informat: “TIMEw. Informat” on page 1346
IBw.d Format
Writes native integer binary (fixed-point) values, including negative values.
Category: Numeric
Alignment: left
See: IBw.d Format in the documentation for your operating environment.

Syntax
IBw.d
Syntax Description
w
specifies the width of the output field. Default: 4 Range: 1–8
d
specifies to multiply the number by 10 raised to the dth power. This argument is optional. Default: 0 Range: 0–10
Details The IBw.d format writes integer binary (fixed-point) values, including negative values that are represented in two’s complement notation. IBw.d writes integer binary values with consistent results if the values are created in the same type of operating environment that you use to run SAS. Note: Different operating environments store integer binary values in different ways. This concept is called byte ordering. For a detailed discussion about byte ordering, see “Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms” on page 88.
Comparisons The IBw.d and PIBw.d formats are used to write native format integers. (Native format allows you to read and write values created in the same operating environment.) The IBRw.d and PIBRw.d formats are used to write little endian integers in any operating environment. To view a table that shows the type of format to use with big endian and little endian integers, see Table 3.1 on page 88. To view a table that compares integer binary notation in several programming languages, see Table 3.2 on page 89.
Examples
y=put(x,ib4.);
put y $hex8.;

Value of x     Results on Big Endian Platforms*     Results on Little Endian Platforms*
               ----+----1                           ----+----1
128            00000080                             80000000

* The result is a hexadecimal representation of a four-byte integer binary number. Each byte occupies one column of the output field.
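The two result columns differ only in byte order. The following Python sketch (not SAS code) shows the same four-byte two's complement integer written in both orders; the function name is hypothetical.

```python
def integer_binary_hex(n: int, w: int = 4, byteorder: str = "big") -> str:
    """Hex digits of n stored as a w-byte two's complement integer."""
    return n.to_bytes(w, byteorder=byteorder, signed=True).hex().upper()

print(integer_binary_hex(128, byteorder="big"))     # 00000080
print(integer_binary_hex(128, byteorder="little"))  # 80000000
print(integer_binary_hex(-150))                     # FFFFFF6A
```

The little endian result is also what the example for the IBRw.d format shows for the same input value.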
See Also Format: “IBRw.d Format” on page 186
IBRw.d Format
Writes integer binary (fixed-point) values in Intel and DEC formats.
Category: Numeric
Alignment: left

Syntax
IBRw.d
Syntax Description
w
specifies the width of the output field. Default: 4 Range: 1–8
d
specifies to multiply the number by 10 raised to the dth power. This argument is optional. Default: 0 Range: 0–10
Details The IBRw.d format writes integer binary (fixed-point) values, including negative values that are represented in two’s complement notation. IBRw.d writes integer binary values that are generated by and for Intel and DEC operating environments. Use IBRw.d to write integer binary data from Intel or DEC environments on other operating environments. The IBRw.d format in SAS code allows for a portable implementation for writing the data in any operating environment. Note: Different operating environments store integer binary values in different ways. This concept is called byte ordering. For a detailed discussion about byte ordering, see “Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms” on page 88.
Comparisons
• The IBw.d and PIBw.d formats are used to write native format integers. (Native format allows you to read and write values that are created in the same operating environment.)
• The IBRw.d and PIBRw.d formats are used to write little endian integers, regardless of the operating environment you are writing on.
• In Intel and DEC operating environments, the IBw.d and IBRw.d formats are equivalent.
To view a table that shows the type of format to use with big endian and little endian integers, see Table 3.1 on page 88. To view a table that compares integer binary notation in several programming languages, see Table 3.2 on page 89.
Examples
y=put(x,ibr4.);
put y $hex8.;

Value of x     Results*
               ----+----1
128            80000000

* The result is a hexadecimal representation of a 4-byte integer binary number. Each byte occupies one column of the output field.
See Also Format: “IBw.d Format” on page 184
IEEEw.d Format
Generates an IEEE floating-point value by multiplying a number by 10 raised to the dth power.
Category: Numeric
Alignment: left
Caution: Large floating-point values and floating-point values that require precision might not be identical to the original SAS value when they are written to an IBM mainframe by using the IEEE format and read back into SAS using the IEEE informat.

Syntax
IEEEw.d
Syntax Description
w
specifies the width of the output field. Default: 8 Range: 3–8
Tip: If w is 8, an IEEE double-precision, floating-point number is written. If w is 5, 6, or 7, an IEEE double-precision, floating-point number is written, which assumes truncation of the appropriate number of bytes. If w is 4, an IEEE single-precision floating-point number is written. If w is 3, an IEEE single-precision, floating-point number is written, which assumes truncation of one byte.
d
specifies to multiply the number by 10 raised to the dth power. This argument is optional. Default: 0 Range: 0–10
Details This format is useful in operating environments where IEEEw.d is the floating-point representation that is used. In addition, you can use the IEEEw.d format to create files that are used by programs in operating environments that use the IEEE floating-point representation. Typically, programs generate IEEE values in single-precision (4 bytes) or double-precision (8 bytes). Programs perform truncation solely to save space on output files. Machine instructions require that the floating-point number be one of the two lengths. The IEEEw.d format allows other lengths, which enables you to write data to files that contain space-saving truncated data.
Examples
test1=put(x,ieee4.);
put test1 $hex8.;
test2=put(x,ieee5.);
put test2 $hex10.;

Value of x     Results*
1              3F800000
               3FF0000000

* The result contains hexadecimal representations of binary numbers stored in IEEE form.
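The width rule in the Tip (single precision for widths 3 and 4, truncated double precision for widths 5 through 8) can be sketched in Python. This is not SAS code; the function name is hypothetical, the d argument is omitted, and big-endian byte order is used so that the hexadecimal digits match the table.

```python
import struct

def ieee_hex(x: float, w: int = 8) -> str:
    """Hex digits of x written as a w-byte IEEE value (3 <= w <= 8), big-endian."""
    if not 3 <= w <= 8:
        raise ValueError("w must be between 3 and 8")
    # Widths 3-4 start from a single-precision value, widths 5-8 from a double.
    packed = struct.pack(">f", x) if w <= 4 else struct.pack(">d", x)
    return packed[:w].hex().upper()

print(ieee_hex(1.0, 4))  # 3F800000
print(ieee_hex(1.0, 5))  # 3FF0000000
```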
JULDAYw. Format
Writes date values as the Julian day of the year.
Category: Date and Time
Alignment: right

Syntax
JULDAYw.
Syntax Description w
specifies the width of the output field. Default: 3 Range: 3–32
Details The JULDAYw. format writes SAS date values in the form ddd, where ddd is the number of the day, 1–365 (or 1–366 for leap years).
Examples
The example table uses the input values of 13515, which is the SAS date value that corresponds to January 1, 1997, and 13589, which is the SAS date value that corresponds to March 16, 1997.

SAS Statement           Results
                        ----+----1
put date julday3.;        1
put date julday3.;       75
JULIANw. Format
Writes date values as Julian dates in the form yyddd or yyyyddd.
Category: Date and Time
Alignment: left

Syntax
JULIANw.
Syntax Description w
specifies the width of the output field. Default: 5 Range: 5–7
Tip: If w is 5, the JULIANw. format writes the date with a two-digit year. If w is 7, the JULIANw. format writes the date with a four-digit year.
Details The JULIANw. format writes SAS date values in the form yyddd or yyyyddd, where yy or yyyy is a two-digit or four-digit integer that represents the year. ddd is the number of the day, 1–365 (or 1–366 for leap years), in that year.
Examples
The example table uses the input value of 16794, which is the SAS date value that corresponds to December 24, 2005 (the 358th day of the year).

SAS Statement           Results
                        ----+----1
put date julian5.;      05358
put date julian7.;      2005358
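The day-of-year arithmetic behind the JULIANw. and JULDAYw. examples can be verified with the Python sketch below (not SAS code). It assumes that SAS date value 0 is January 1, 1960; the function names are hypothetical, and a width of 6 is treated like 5, which the text does not specify.

```python
from datetime import date, timedelta

# Assumption: SAS date value 0 corresponds to January 1, 1960.
SAS_EPOCH = date(1960, 1, 1)

def julian(days: int, w: int = 5) -> str:
    """Write a SAS date value as yyddd (w < 7) or yyyyddd (w = 7)."""
    day = SAS_EPOCH + timedelta(days=days)
    return day.strftime("%Y%j") if w >= 7 else day.strftime("%y%j")

def julday(days: int) -> int:
    """Return the day of the year (1-366) for a SAS date value."""
    return (SAS_EPOCH + timedelta(days=days)).timetuple().tm_yday

print(julian(16794, 5))  # 05358
print(julian(16794, 7))  # 2005358
print(julday(13515))     # 1
print(julday(13589))     # 75
```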
See Also Functions: “DATEJUL Function” on page 614 “JULDATE Function” on page 840 Informat: “JULIANw. Informat” on page 1301
MDYAMPMw.d Format
Writes datetime values in the form mm/dd/yy hh:mm AM|PM. The year can be either two or four digits.
Category: Date and Time
Alignment: right
Default Time Period: AM

Syntax
MDYAMPMw.
Syntax Description
w
specifies the width of the output field. Default: 19 Range: 8–40
Details The MDYAMPMw.d format writes SAS datetime values in the following form:
mm/dd/yy hh:mm <AM | PM>
The following list explains the datetime variables:
mm
is an integer from 1 through 12 that represents the month.
dd
is an integer from 1 through 31 that represents the day of the month.
yy or yyyy
specifies a two-digit or four-digit integer that represents the year.
hh
is the number of hours that range from 0 through 23.
mm
is the number of minutes that range from 00 through 59.
AM | PM
specifies either the time period 00:01–12:00 noon (AM) or the time period 12:01–12:00 midnight (PM). The default is AM.
date and time separator characters
is one of several special characters, such as the slash (/), colon (:), or a blank character that SAS uses to separate date and time components.
Comparison The MDYAMPMw. format writes datetime values with separators in the form mm/dd/yy hh:mm AM | PM, and requires a space between the date and the time. The DATETIMEw.d format writes datetime values with separators in the form ddmmmyy:hh:mm:ss.ss.
Examples
This example uses the input value of 1537113180, which is the SAS datetime value that corresponds to 3:53:00 PM on September 15, 2008.

SAS Statement           Results
put dt mdyampm25.;      9/15/2008 3:53 PM
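The result in this example can be rebuilt from the raw datetime value. The Python sketch below is not SAS code; it assumes the SAS datetime epoch of January 1, 1960, always writes a four-digit year, ignores field width, and uses the common 12-hour clock convention for AM and PM, which is an assumption of the sketch.

```python
from datetime import datetime, timedelta

# Assumption: SAS datetime value 0 corresponds to 1960-01-01T00:00:00.
SAS_EPOCH = datetime(1960, 1, 1)

def mdyampm(seconds: int) -> str:
    """Write a SAS datetime value as m/d/yyyy h:mm AM|PM."""
    stamp = SAS_EPOCH + timedelta(seconds=seconds)
    hour12 = stamp.hour % 12 or 12           # 0 and 12 both display as 12
    period = "AM" if stamp.hour < 12 else "PM"
    return f"{stamp.month}/{stamp.day}/{stamp.year} {hour12}:{stamp.minute:02d} {period}"

print(mdyampm(1537113180))  # 9/15/2008 3:53 PM
```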
See Also Format: “DATETIMEw.d Format” on page 152 Informat: “MDYAMPMw.d Informat” on page 1302
MMDDYYw. Format
Writes date values in the form mmddyy or mm/dd/yy, where a forward slash is the separator and the year appears as either 2 or 4 digits.
Category: Date and Time
Alignment: right

Syntax
MMDDYYw.
Syntax Description
w
specifies the width of the output field. Default: 8 Range: 2–10
Interaction: When w has a value of from 2 to 5, the date appears with as much of the month and the day as possible. When w is 7, the date appears as a two-digit year without slashes.
Details The MMDDYYw. format writes SAS date values in the form mmddyy or mm/dd/yy, where
mm
is an integer that represents the month.
/
is the separator.
dd
is an integer that represents the day of the month.
yy
is a two-digit or four-digit integer that represents the year.
Examples
The following examples use the input value of 16734, which is the SAS date value that corresponds to October 25, 2005.

SAS Statement           Results
                        ----+----1----+
put day mmddyy2.;       10
put day mmddyy3.;       10
put day mmddyy4.;       1025
put day mmddyy5.;       10/25
put day mmddyy6.;       102505
put day mmddyy7.;       102505
put day mmddyy8.;       10/25/05
put day mmddyy10.;      10/25/2005
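The way the result changes with w can be summarized as a small decision table. The Python sketch below is not SAS code; it is inferred from the rows above, ignores alignment within the field, and treats a width of 9 like 8, which the table does not show. The function name is hypothetical, and SAS date value 0 is assumed to be January 1, 1960.

```python
from datetime import date, timedelta

# Assumption: SAS date value 0 corresponds to January 1, 1960.
SAS_EPOCH = date(1960, 1, 1)

def mmddyy(days: int, w: int = 8) -> str:
    """Write a SAS date value the way the example table shows for width w."""
    day = SAS_EPOCH + timedelta(days=days)
    mm, dd = f"{day.month:02d}", f"{day.day:02d}"
    yy, yyyy = f"{day.year % 100:02d}", f"{day.year:04d}"
    if w >= 10:
        return f"{mm}/{dd}/{yyyy}"
    if w >= 8:                    # w = 9 is assumed to behave like w = 8
        return f"{mm}/{dd}/{yy}"
    if w >= 6:
        return mm + dd + yy       # no room for slashes
    if w == 5:
        return f"{mm}/{dd}"
    if w == 4:
        return mm + dd
    return mm                     # w = 2 or 3: month only

for width in (2, 3, 4, 5, 6, 7, 8, 10):
    print(width, mmddyy(16734, width))
```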
See Also Formats: “DATEw. Format” on page 149 “DDMMYYw. Format” on page 154 “MMDDYYxw. Format” on page 194 “YYMMDDw. Format” on page 265 Functions: “DAY Function” on page 616 “MDY Function” on page 894 “MONTH Function” on page 906 “YEAR Function” on page 1191 Informats: “DATEw. Informat” on page 1279 “DDMMYYw. Informat” on page 1282 “YYMMDDw. Informat” on page 1361
MMDDYYxw. Format
Writes date values in the form mmddyy or mm-dd-yy, where the x in the format name is a character that represents the special character which separates the month, day, and year. The special character can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
Category: Date and Time
Alignment: right

Syntax
MMDDYYxw.
Syntax Description
x
identifies a separator or specifies that no separator appear between the month, the day, and the year. Valid values for x are:
B    separates with a blank
C    separates with a colon
D    separates with a dash
N    indicates no separator
P    separates with a period
S    separates with a slash.
w
specifies the width of the output field. Default: 8 Range: 2–10
Interaction: When w has a value of from 2 to 5, the date appears with as much of the month and the day as possible. When w is 7, the date appears as a two-digit year without separators.
Interaction: When x has a value of N, the width range changes to 2–8.
Details The MMDDYYxw. format writes SAS date values in the form mmddyy or mmxddxyy, where
mm
is an integer that represents the month.
x
is a specified separator.
dd
is an integer that represents the day of the month.
yy
is a two-digit or four-digit integer that represents the year.
Examples
The following examples use the input value of 18031, which is the SAS date value that corresponds to May 14, 2009.

SAS Statement            Results
                         ----+----1----+
put day mmddyyc5.;       05:14
put day mmddyyd8.;       05-14-09
put day mmddyyp10.;      05.14.2009
put day mmddyyn8.;       05142009
See Also Formats: “DATEw. Format” on page 149 “DDMMYYxw. Format” on page 156 “MMDDYYw. Format” on page 192 “YYMMDDxw. Format” on page 266 Functions: “DAY Function” on page 616 “MDY Function” on page 894 “MONTH Function” on page 906 “YEAR Function” on page 1191 Informat: “MMDDYYw. Informat” on page 1304
MMSSw.d Format
Writes time values as the number of minutes and seconds since midnight.
Category: Date and Time
Alignment: right

Syntax
MMSSw.d
Syntax Description w
specifies the width of the output field. Default: 5 Range: 2–20
Tip: Set w to a minimum of 5 to write a value that represents minutes and seconds.
d
specifies the number of digits to the right of the decimal point in the seconds value. Therefore, the SAS time value includes fractional seconds. This argument is optional. Range: 0–19
Restriction: must be less than w
Examples
The example table uses the input value of 4530.

SAS Statement          Results
                       ----+----1
put time mmss.;        75:30
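The result is the input value split into whole minutes and remaining seconds, with no conversion to hours. A minimal Python sketch of that division (not SAS code; the function name is hypothetical and the d argument is omitted):

```python
def mmss(seconds: int) -> str:
    """Write seconds since midnight as total minutes and seconds (mm:ss)."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"

print(mmss(4530))  # 75:30
```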
See Also Formats: “HHMMw.d Format” on page 181 “TIMEw.d Format” on page 240 Functions: “HMS Function” on page 777 “MINUTE Function” on page 898 “SECOND Function” on page 1083 Informat: “TIMEw. Informat” on page 1346
MMYYw. Format
Writes date values in the form mmMyy, where M is the separator and the year appears as either 2 or 4 digits.
Category: Date and Time
Alignment: right

Syntax
MMYYw.
Syntax Description
w
specifies the width of the output field. Default: 7 Range: 5–32
Interaction: When w has a value of 5 or 6, the date appears with only the last two digits of the year. When w is 7 or more, the date appears with a four-digit year.
Details The MMYYw. format writes SAS date values in the form mmMyy, where
mm
is an integer that represents the month.
M
is the character separator.
yy
is a two-digit or four-digit integer that represents the year.
Examples
The following examples use the input value of 16734, which is the SAS date value that corresponds to October 25, 2005.

SAS Statement          Results
                       ----+----1----+
put date mmyy5.;       10M05
put date mmyy6.;       10M05
put date mmyy.;        10M2005
put date mmyy7.;       10M2005
put date mmyy10.;      10M2005
See Also Format: “MMYYxw. Format” on page 198 “YYMMw. Format” on page 262
MMYYxw. Format
Writes date values in the form mmyy or mm-yy, where the x in the format name is a character that represents the special character that separates the month and the year, which can be a hyphen (-), period (.), blank character, slash (/), colon (:), or no separator; the year can be either 2 or 4 digits.
Category: Date and Time
Alignment: right

Syntax
MMYYxw.
Syntax Description
x
identifies a separator or specifies that no separator appear between the month and the year. Valid values for x are
C separates with a colon
D separates with a dash
N indicates no separator
P separates with a period
S separates with a forward slash.
w
specifies the width of the output field. Default: 7 Range: 5–32
Interaction: When x is set to N, no separator is specified. The width range is then 4–32, and the default changes to 6.
Interaction: When x has a value of C, D, P, or S and w has a value of 5 or 6, the date appears with only the last two digits of the year. When w is 7 or more, the date appears with a four-digit year.
Interaction: When x has a value of N and w has a value of 4 or 5, the date appears with only the last two digits of the year. When x has a value of N and w is 6 or more, the date appears with a four-digit year.
Details The MMYYxw. format writes SAS date values in the form mmyy or mmxyy, where mm is an integer that represents the month. x is a specified separator. yy is a two-digit or four-digit integer that represents the year.
Examples The following examples use the input value of 18031, which is the SAS date value that corresponds to May 14, 2009. SAS Statement
Results ----+----1----+
put date mmyyc5.;
05:09
put date mmyyd.;
05-2009
put date mmyyn4.;
0509
put date mmyyp8.;
05.2009
put date mmyys10.;
05/2009
See Also Format: “MMYYw. Format” on page 197 “YYMMxw. Format” on page 263
MONNAMEw. Format Writes date values as the name of the month. Category: Date and Time Alignment: right
Syntax MONNAMEw.
Syntax Description
w
specifies the width of the output field. Default: 9 Range: 1–32 Tip: Use MONNAME3. to print the first three letters of the month name.
Details If necessary, SAS truncates the name of the month to fit the format width.
Examples The example table uses the input value of 16500, which is the SAS date value that corresponds to March 5, 2005. SAS Statement
Results ----+----1
put date monname1.;
M
put date monname3.;
Mar
put date monname5.;
March
See Also Format: “MONTHw. Format” on page 201
MONTHw. Format Writes date values as the month of the year. Category: Date and Time Alignment:
right
Syntax MONTHw.
Syntax Description w
specifies the width of the output field. Default: 2 Range: 1–32 Tip:
Use MONTH1. to obtain a hexadecimal value.
Details The MONTHw. format writes the month (1 through 12) of the year from a SAS date value. If the month is a single digit, the MONTHw. format places a leading blank before the digit. For example, the MONTHw. format writes 4 instead of 04.
Examples The example table uses the input value of 18031, which is the SAS date value that corresponds to May 14, 2009. SAS Statement
Results ----+----1
put date month.;
5
See Also Format: “MONNAMEw. Format” on page 200
MONYYw. Format Writes date values as the month and the year in the form mmmyy or mmmyyyy. Category: Date and Time Alignment: right
Syntax MONYYw.
Syntax Description w
specifies the width of the output field. Default: 5 Range: 5–7
Details The MONYYw. format writes SAS date values in the form mmmyy or mmmyyyy, where mmm is the first three letters of the month name. yy or yyyy is a two-digit or four-digit integer that represents the year.
Comparisons The MONYYw. format and the DTMONYYw. format are similar in that they both write date values. The difference is that MONYYw. expects a SAS date value as input, and DTMONYYw. expects a datetime value.
Examples The example table uses the input value of 16794, which is the SAS date value that corresponds to December 24, 2005. SAS Statement
Results ----+----1
put date monyy5.;
DEC05
put date monyy7.;
DEC2005
See Also Formats: “DTMONYYw. Format” on page 163 “DDMMYYw. Format” on page 154 “MMDDYYw. Format” on page 192 “YYMMDDw. Format” on page 265 Functions: “MONTH Function” on page 906 “YEAR Function” on page 1191 Informat: “MONYYw. Informat” on page 1305
NEGPARENw.d Format Writes negative numeric values in parentheses. Category: Numeric Alignment:
right
Syntax NEGPARENw.d
Syntax Description w
specifies the width of the output field. Default: 6 Range: 1–32 d
specifies the number of digits to the right of the decimal point in the numeric value. This argument is optional. Default: 0 Range: 0–31
Details The NEGPARENw.d format attempts to right-align output values. If the input value is negative, NEGPARENw.d displays the output by enclosing the value in parentheses, if the field that you specify is wide enough. Otherwise, it uses a minus sign to represent the negative value. If the input value is non-negative, NEGPARENw.d displays the value with a leading and trailing blank to ensure proper column alignment. It reserves the last column for a close parenthesis even when the value is positive.
Comparisons The NEGPARENw.d format is similar to the COMMAw.d format in that it separates every three digits of the value with a comma.
Examples put @1 sales negparen8.; Value of sales
Results ----+----1----+
100
100
1000
1,000
-200
(200)
-2000
(2,000)
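The alignment rules in the Details section can be illustrated with a short Python sketch. This is not SAS code; the function name negparen is hypothetical, and the fallback to a minus sign is modeled only for the case where the parenthesized value does not fit in w columns.

```python
def negparen(value: float, w: int = 6, d: int = 0) -> str:
    """Emulate NEGPARENw.d for values that fit in the field."""
    body = f"{abs(value):,.{d}f}"  # comma every three digits, d decimals
    if value < 0:
        text = f"({body})"
        if len(text) > w:
            # Field too narrow for parentheses: use a minus sign instead.
            text = f"-{body}"
    else:
        # Reserve the last column for a close parenthesis on positive values.
        text = f"{body} "
    return text.rjust(w)

for sales in (100, 1000, -200, -2000):
    print(repr(negparen(sales, 8)))
```

The trailing blank on non-negative values is what keeps the digits of positive and negative numbers in the same columns.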
NUMXw.d Format Writes numeric values with a comma in place of the decimal point. Category: Numeric Alignment: right
Syntax NUMXw.d
Syntax Description w
specifies the width of the output field. Default: 12 Range: 1–32 d
specifies the number of digits to the right of the decimal point (comma) in the numeric value. This argument is optional. Default: 0 Range: 0–31
Details The NUMXw.d format writes numeric values with a comma in place of the decimal point.
Comparisons The NUMXw.d format is similar to the w.d format except that NUMXw.d writes numeric values with a comma in place of the decimal point.
Examples put x numx10.2; Value of x
Results ----+----1----+
896.48
896,48
64.89
64,89
3064.10
3064,10
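As a rough illustration of the comparison with the w.d format, the following Python sketch (not SAS code; the name numx is hypothetical) formats the value with d decimals and then swaps the decimal point for a comma. Values that do not fit in w columns are not modeled.

```python
def numx(value: float, w: int = 12, d: int = 0) -> str:
    """Emulate NUMXw.d: like w.d, but with a comma as the decimal mark."""
    text = f"{value:.{d}f}".replace(".", ",")
    return text.rjust(w)  # right-aligned in a field of width w

for x in (896.48, 64.89, 3064.10):
    print(repr(numx(x, 10, 2)))
```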
See Also Format: “w.d Format” on page 247 Informat: “NUMXw.d Informat” on page 1308
OCTALw. Format Converts numeric values to octal representation. Category: Numeric Alignment:
left
Syntax OCTALw.
Syntax Description w
specifies the width of the output field. Default: 3 Range: 1–24
Details If necessary, the OCTALw. format converts numeric values to integers before displaying them in octal representation.
Comparisons OCTALw. converts numeric values to octal representation. The $OCTALw. format converts character values to octal representation.
Examples put x octal6.; Value of x
Results ----+----1
3592
007010
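The example above can be reproduced with a small Python sketch. It is not SAS code; the helper name octal is hypothetical, the conversion to an integer is assumed to truncate, and negative values or values too wide for the field are not modeled.

```python
def octal(value: float, w: int = 3) -> str:
    """Emulate OCTALw. for non-negative values that fit in w digits."""
    digits = format(int(value), "o")  # integer part written in base 8
    return digits.zfill(w)            # pad with leading zeros to width w

print(octal(3592, 6))  # 3592 = 7*512 + 1*8, so the digits are 7010
```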
PDw.d Format Writes data in packed decimal format. Category: Numeric Alignment: left See: PDw.d Format in the documentation for your operating environment.
Syntax PDw.d
Syntax Description w
specifies the width of the output field. The w value specifies the number of bytes, not the number of digits. (In packed decimal data, each byte contains two digits.) Default: 1 Range: 1–16
d
specifies to multiply the number by 10^d. This argument is optional. Default: 0 Range: 0–31
Details Different operating environments store packed decimal values in different ways. However, the PDw.d format writes packed decimal values with consistent results if the values are created in the same kind of operating environment that you use to run SAS. The PDw.d format writes missing numerical data as –0. When the PDw.d informat reads a –0, it stores it as 0.
Comparisons The following table compares packed decimal notation in several programming languages:
Language             Notation
SAS                  PD4.
COBOL                COMP-3 PIC S9(7)
IBM 370 assembler    PL4
PL/I                 FIXED DEC
Examples
y=put(x,pd4.); put y $hex8.; Value of x
Results* ----+----1
128
00000128
* The result is a hexadecimal representation of a binary number written in packed decimal format. Each byte occupies one column of the output field.
PDJULGw. Format Writes packed Julian date values in the hexadecimal format yyyydddF for IBM. Category: Date and Time
Syntax PDJULGw.
Syntax Description
w
specifies the width of the output field. Default: 4 Range: 3-16
Details The PDJULGw. format writes SAS date values in the form yyyydddF, where
yyyy is the two-byte representation of the four-digit Gregorian year.
ddd is the one-and-a-half byte representation of the three-digit integer that corresponds to the Julian day of the year, 1–365 (or 1–366 for leap years).
F is the half byte that contains all binary 1s, which assigns the value as positive.
Note: SAS interprets a two-digit year as belonging to the 100-year span that is defined by the YEARCUTOFF= system option.
Examples SAS Statement
Results ----+----1
date = ’17mar2005’d; juldate = put(date,pdjulg4.); put juldate $hex8.;
2005076F
See Also Formats: “PDJULIw. Format” on page 208 “JULIANw. Format” on page 189 “JULDAYw. Format” on page 188 Functions: “JULDATE Function” on page 840 “DATEJUL Function” on page 614 Informats: “PDJULIw. Informat” on page 1313 “PDJULGw. Informat” on page 1311 “JULIANw. Informat” on page 1301 System Option: “YEARCUTOFF= System Option” on page 1998
PDJULIw. Format Writes packed Julian date values in the hexadecimal format ccyydddF for IBM. Category:
Date and Time
Syntax PDJULIw.
Syntax Description w
specifies the width of the output field. Default: 4 Range: 3-16
Details The PDJULIw. format writes SAS date values in the form ccyydddF, where
cc is the one-byte representation of a two-digit integer that represents the century.
yy is the one-byte representation of a two-digit integer that represents the year. The PDJULIw. format makes an adjustment for the century byte by subtracting 1900 from the 4-digit Gregorian year to produce the correct packed decimal ccyy representation. A year value of 1998 is stored in ccyy as 0098, and a year value of 2011 is stored as 0111.
ddd is the one-and-a-half byte representation of the three-digit integer that corresponds to the Julian day of the year, 1–365 (or 1–366 for leap years).
F is the half byte that contains all binary 1s, which assigns the value as positive.
Note: SAS interprets a two-digit year as belonging to the 100-year span that is defined by the YEARCUTOFF= system option.
Examples SAS Statement
Results ----+----1
date = ’17mar2005’d; juldate = put(date,pdjuli4.); put juldate $hex8.;
0105076F
date = ’31dec2003’d; juldate = put(date,pdjuli4.); put juldate $hex8.;
0103365F
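The difference between the yyyydddF layout of PDJULGw. and the ccyydddF layout of PDJULIw. can be sketched in Python. This is not SAS code; the function names are hypothetical, and the sketch returns the hexadecimal digits that $HEX8. would display for a width of 4 rather than the packed bytes themselves.

```python
from datetime import date

def pdjulg_hex(d: date) -> str:
    """yyyydddF: four-digit year, day of year, positive sign nibble F."""
    return f"{d.year:04d}{d.timetuple().tm_yday:03d}F"

def pdjuli_hex(d: date) -> str:
    """ccyydddF: the year minus 1900 fills the ccyy digits."""
    return f"{d.year - 1900:04d}{d.timetuple().tm_yday:03d}F"

print(pdjulg_hex(date(2005, 3, 17)))   # 2005076F
print(pdjuli_hex(date(2005, 3, 17)))   # 0105076F
print(pdjuli_hex(date(2003, 12, 31)))  # 0103365F
```

Calling bytes.fromhex on either result would give the four packed bytes, because each byte holds two hexadecimal digits.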
See Also Formats: “PDJULGw. Format” on page 207 “JULIANw. Format” on page 189 “JULDAYw. Format” on page 188 Functions: “DATEJUL Function” on page 614 “JULDATE Function” on page 840 Informats: “PDJULGw. Informat” on page 1311 “PDJULIw. Informat” on page 1313 “JULIANw. Informat” on page 1301 System Option: “YEARCUTOFF= System Option” on page 1998
PERCENTw.d Format Writes numeric values as percentages. Category: Numeric Alignment: right
Syntax PERCENTw.d
Syntax Description w
specifies the width of the output field. Default: 6 Range: 4–32 Tip: The width of the output field must account for the percent sign (%) and parentheses for negative numbers, whether the number is negative or positive. d
specifies the number of digits to the right of the decimal point in the numeric value. This argument is optional. Range: 0–31 Requirement: must be less than w
Details The PERCENTw.d format multiplies values by 100, formats them the same as the BESTw.d format, and adds a percent sign (%) to the end of the formatted value, while it encloses negative values in parentheses.
Examples put @10 gain percent10.; Value of x
Results ----+----1----+----2
0.1
10%
1.2
120%
-0.05
( 5%)
See Also Format: “PERCENTNw.d Format” on page 211
PERCENTNw.d Format Produces percentages, using a minus sign for negative values. Category: Numeric Alignment: right
Syntax PERCENTNw.d
Syntax Description w
specifies the width of the output field. Default: 6 Range: 4–32 Tip: The width of the output field must account for the minus sign ( – ), the percent sign ( % ), and a trailing blank, whether the number is negative or positive. d
specifies the number of digits to the right of the decimal point in the numeric value. This argument is optional. Range: 0–31 Requirement: must be less than w
Details The PERCENTNw.d format multiplies values by 100, formats them the same as the BESTw.d format, adds a minus sign to the beginning of negative values, and adds a percent sign (%) to the end of the formatted value.
Comparisons The PERCENTNw.d format produces percents by using a minus sign instead of parentheses for negative values. The PERCENTw.d format produces percents by using parentheses for negative values.
Examples put x percentn10.; Value of x
Results
-0.1
-10%
.2
20%
.8
80%
-0.05
-5%
-6.3
-630%
See Also Format: “PERCENTw.d Format” on page 210
PIBw.d Format Writes positive integer binary (fixed-point) values. Category: Numeric Alignment: left See: PIBw.d Format in the documentation for your operating environment.
Syntax PIBw.d
Syntax Description w
specifies the width of the output field. Default: 1 Range: 1–8
d
specifies to multiply the number by 10^d. This argument is optional. Default: 0 Range: 0–31
Details All values are treated as positive. PIBw.d writes positive integer binary values with consistent results if the values are created in the same type of operating environment that you use to run SAS. Note: Different operating environments store integer binary values in different ways. This concept is called byte ordering. For a detailed discussion about byte ordering, see “Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms” on page 88.
Comparisons
• Positive integer binary values are the same as integer binary values except that the sign bit is part of the value, which is always a positive integer. The PIBw.d format treats all values as positive and includes the sign bit as part of the value.
• The PIBw.d format with a width of 1 results in a value that corresponds to the binary equivalent of the contents of a byte. A value that corresponds to the binary equivalent of the contents of a byte is useful if your data contain values between hexadecimal 80 and hexadecimal FF, where the high-order bit can be misinterpreted as a negative sign.
• The PIBw.d format is the same as the IBw.d format except that PIBw.d treats all values as positive values.
• The IBw.d and PIBw.d formats are used to write native format integers. (Native format allows you to read and write values that are created in the same operating environment.) The IBRw.d and PIBRw.d formats are used to write little endian integers in any operating environment.
To view a table that shows the type of format to use with big endian and little endian integers, see Table 3.1 on page 88. To view a table that compares integer binary notation in several programming languages, see Table 3.2 on page 89.
Examples y=put(x,pib1.); put y $hex2.; Value of x
Results ----+----1
12
0C
* The result is a hexadecimal representation of a one-byte binary number written in positive integer binary format, which occupies one column of the output field.
See Also Format: “PIBRw.d Format” on page 214
PIBRw.d Format Writes positive integer binary (fixed-point) values in Intel and DEC formats. Category:
Numeric
Syntax PIBRw.d
Syntax Description
w
specifies the width of the output field. Default: 1 Range: 1–8
d
specifies to multiply the number by 10^d. This argument is optional. Default: 0 Range: 0–10
Details All values are treated as positive. PIBRw.d writes positive integer binary values that have been generated by and for Intel and DEC operating environments. Use PIBRw.d to write positive integer binary data from Intel or DEC environments on other operating environments. The PIBRw.d format in SAS code allows for a portable implementation for writing the data in any operating environment. Note: Different operating environments store positive integer binary values in different ways. This concept is called byte ordering. For a detailed discussion about byte ordering, see “Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms” on page 88.
Comparisons
• Positive integer binary values are the same as integer binary values except that the sign bit is part of the value, which is always a positive integer. The PIBRw.d format treats all values as positive and includes the sign bit as part of the value.
• The PIBRw.d format with a width of 1 results in a value that corresponds to the binary equivalent of the contents of a byte. A value that corresponds to the binary equivalent of the contents of a byte is useful if your data contain values between hexadecimal 80 and hexadecimal FF, where the high-order bit can be misinterpreted as a negative sign.
• On Intel and DEC operating environments, the PIBw.d and PIBRw.d formats are equivalent.
• The IBw.d and PIBw.d formats are used to write native format integers. (Native format allows you to read and write values that are created in the same operating environment.) The IBRw.d and PIBRw.d formats are used to write little endian integers in any operating environment.
To view a table that shows the type of format to use with big endian and little endian integers, see Table 3.1 on page 88. To view a table that compares integer binary notation in several programming languages, see Table 3.2 on page 89.
Examples y=put(x,pibr2.); put y $hex4.; Value of x
Results ----+----1
128
8000
* The result is a hexadecimal representation of a two-byte binary number written in positive integer binary format, which occupies one column of the output field.
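The byte-ordering distinction between PIBw.d (native order) and PIBRw.d (always little endian) can be sketched in Python. This is not SAS code; the function names are hypothetical, and the d argument is not modeled.

```python
import sys

def pib(value: int, w: int = 1, byteorder: str = sys.byteorder) -> bytes:
    """Unsigned integer in w bytes, in the native byte order by default."""
    return value.to_bytes(w, byteorder, signed=False)

def pibr(value: int, w: int = 1) -> bytes:
    """Unsigned integer in w bytes, always little endian (Intel and DEC order)."""
    return value.to_bytes(w, "little", signed=False)

print(pib(12, 1).hex().upper())           # 0C (byte order is irrelevant for one byte)
print(pibr(128, 2).hex().upper())         # 8000
print(pib(128, 2, "big").hex().upper())   # 0080, the big endian layout for comparison
```

On a little endian machine the first and second functions return identical bytes, which matches the statement that the two formats are equivalent on Intel and DEC operating environments.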
See Also Informat: “PIBw.d Informat” on page 1316
PKw.d Format Writes data in unsigned packed decimal format. Category: Numeric Alignment:
left
Syntax PKw.d
Syntax Description w
specifies the width of the output field. Default: 1 Range: 1–16
d
specifies to multiply the number by 10^d. This argument is optional. Default: 0 Range: 0–10 Requirement: must be less than w
Details Each byte of unsigned packed decimal data contains two digits.
Comparisons The PKw.d format is similar to the PDw.d format except that PKw.d does not write the sign in the low-order byte.
Examples y=put(x,pk4.); put y $hex8.; Value of x
Results* ----+----1
128
00000128
* The result is a hexadecimal representation of a four-byte number written in packed decimal format. Each byte occupies one column of the output field.
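Because each byte of unsigned packed decimal data holds two decimal digits and no sign is written, the example can be reproduced with a few lines of Python. This is a sketch, not SAS code; the name pk is hypothetical and the d argument is not modeled.

```python
def pk(value: int, w: int = 1) -> bytes:
    """Pack a non-negative integer into w bytes, two decimal digits per byte."""
    if value < 0:
        raise ValueError("unsigned packed decimal cannot hold a negative value")
    digits = str(value).zfill(2 * w)  # left-pad with zeros to 2*w digits
    if len(digits) > 2 * w:
        raise ValueError("value does not fit in w bytes")
    # Each pair of decimal digits becomes one byte, e.g. "28" -> 0x28.
    return bytes.fromhex(digits)

print(pk(128, 4).hex())  # 00000128
```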
PVALUEw.d Format Writes p-values. Category: Numeric Alignment: right
Syntax PVALUEw.d
Syntax Description w
specifies the width of the output field. Default: 6 Range: 3–32 d
specifies the number of digits to the right of the decimal point in the numeric value. This argument is optional. Default: the minimum of 4 and w–2
Range: 1–30 Restriction: must be less than w
Comparisons The PVALUEw.d format follows the rules for the w.d format, except that if the value x is such that 0
[...]
a > 0, b > 0. It should be noted that
$$\beta(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a + b)}$$
where $\Gamma(\cdot)$ is the gamma function. If the expression cannot be computed, BETA returns a missing value.
Examples SAS Statements
Results
x=beta(5,3);
0.9523809524e-2
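The relationship between the beta function and the gamma function can be checked numerically. The Python sketch below is not SAS code; the function name beta is chosen only to mirror the SAS name, and NaN is used as a stand-in for a SAS missing value.

```python
import math

def beta(a: float, b: float) -> float:
    """Beta function computed as gamma(a) * gamma(b) / gamma(a + b)."""
    if a <= 0 or b <= 0:
        return math.nan  # stand-in for a missing value when a or b is not positive
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

# gamma(5) = 24, gamma(3) = 2, gamma(8) = 5040, so beta(5, 3) = 48/5040
print(beta(5, 3))
```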
See Also Function: “LOGBETA Function” on page 878
BETAINV Function Returns a quantile from the beta distribution. Category: Quantile
Syntax BETAINV (p,a,b)
Arguments
p
is a numeric probability. Range: 0 ≤ p ≤ 1 a
is a numeric shape parameter. Range: a > 0 b
is a numeric shape parameter. Range: b > 0
Details The BETAINV function returns the pth quantile from the beta distribution with shape parameters a and b. The probability that an observation from a beta distribution is less than or equal to the returned quantile is p. Note:
BETAINV is the inverse of the PROBBETA function.
Examples SAS Statements
Results
x=betainv(0.001,2,4);
0.0101017879
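The statement that the probability of an observation at or below the returned quantile is p can be checked with a small Python sketch. It is not SAS code and does not claim to be the algorithm that SAS uses: it assumes integer shape parameters, for which the beta distribution function reduces to a binomial sum, and it finds the quantile by bisection. Both function names are hypothetical.

```python
from math import comb

def beta_cdf_int(x: float, a: int, b: int) -> float:
    """Beta(a, b) distribution function for integer a and b (binomial-sum identity)."""
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))

def betainv_int(p: float, a: int, b: int) -> float:
    """Quantile by bisection: the x in [0, 1] whose distribution function equals p."""
    lo, hi = 0.0, 1.0
    for _ in range(200):  # far more halvings than double precision needs
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

q = betainv_int(0.001, 2, 4)
print(q, beta_cdf_int(q, 2, 4))
```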
See Also Functions: “QUANTILE Function” on page 1028
BLACKCLPRC Function Calculates call prices for European options on futures, based on the Black model. Category:
Financial
Syntax BLACKCLPRC(E, t, F, r, sigma)
Arguments
E
is a non-missing, positive value that specifies exercise price. Requirement:
Specify E and F in the same units.
t
is a non-missing value that specifies time to maturity. F
is a non-missing, positive value that specifies future price. Requirement:
Specify F and E in the same units.
r
is a non-missing, positive fraction that specifies the risk-free interest rate between the present time and t. Requirement:
Specify a value for r for the same time period as the unit of t.
sigma
is a non-missing, positive fraction that specifies the volatility (the square root of the variance of r). Requirement:
Specify a value for sigma for the same time period as the unit of t.
Details The BLACKCLPRC function calculates call prices for European options on futures, based on the Black model. The function is based on the following relationship:
$$\mathrm{CALL} = e^{-rt}\left(F\,N(d_1) - E\,N(d_2)\right)$$
where
F
specifies future price.
N
specifies the cumulative normal density function.
E
specifies the exercise price of the option.
r
specifies the risk-free interest rate for period t.
t
specifies the time to expiration.
$$d_1 = \frac{\ln\left(\frac{F}{E}\right) + \frac{\sigma^2}{2}\,t}{\sigma\sqrt{t}}, \qquad d_2 = d_1 - \sigma\sqrt{t}$$
where
σ
specifies the volatility of the underlying asset.
σ²
specifies the variance of the rate of return.
For the special case of t=0, the following equation is true:
$$\mathrm{CALL} = \max\left((F - E),\, 0\right)$$
For information about the basics of pricing, see “Using Pricing Functions” on page 303.
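The Black model relationship in the Details section can be evaluated directly. The Python sketch below is not SAS code; the function names are hypothetical, the cumulative normal function is built from math.erf, and negative values of t are rejected rather than modeled.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_call(E: float, t: float, F: float, r: float, sigma: float) -> float:
    """Call price for a European option on futures under the Black model."""
    if t < 0:
        raise ValueError("negative time to maturity is not modeled in this sketch")
    if t == 0:
        return max(F - E, 0.0)  # special case stated for t=0
    d1 = (log(F / E) + 0.5 * sigma**2 * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return exp(-r * t) * (F * norm_cdf(d1) - E * norm_cdf(d2))

print(black_call(1000, 0.5, 950, 4, 2))
print(black_call(850, 2.5, 125, 3, 1))
```

The argument order (E, t, F, r, sigma) follows the SAS syntax so that the calls can be compared with the examples in this entry.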
Comparisons The BLACKCLPRC function calculates call prices for European options on futures, based on the Black model. The BLACKPTPRC function calculates put prices for European options on futures, based on the Black model. These functions return a scalar value.
Examples SAS Statements
Results ----+----1----+-----2--
a=blackclprc(1000, .5, 950, 4, 2); put a;
65.335687119
b=blackclprc(850, 2.5, 125, 3, 1); put b;
0.012649067
c=blackclprc(7500, .9, 950, 3, 2); put c;
17.880939441
d=blackclprc(5000, -.5, 237, 3, 2); put d;
0
See Also Function: “BLACKPTPRC Function” on page 408
BLACKPTPRC Function Calculates put prices for European options on futures, based on the Black model. Category:
Financial
Syntax BLACKPTPRC(E, t, F, r, sigma)
Arguments
E
is a non-missing, positive value that specifies exercise price. Requirement:
Specify E and F in the same units.
t
is a non-missing value that specifies time to maturity. F
is a non-missing, positive value that specifies future price. Requirement:
Specify F and E in the same units.
r
is a non-missing, positive fraction that specifies the risk-free interest rate between the present time and t. Requirement:
Specify a value for r for the same time period as the unit of t.
sigma
is a non-missing, positive fraction that specifies the volatility (the square root of the variance of r). Requirement:
Specify a value for sigma for the same time period as the unit of t.
Details The BLACKPTPRC function calculates put prices for European options on futures, based on the Black model. The function is based on the following relationship:
$$\mathrm{PUT} = \mathrm{CALL} + e^{-rt}\,(E - F)$$
where
E
specifies the exercise price of the option.
r
specifies the risk-free interest rate for period t.
t
specifies the time to expiration.
F
specifies future price.
$$d_1 = \frac{\ln\left(\frac{F}{E}\right) + \frac{\sigma^2}{2}\,t}{\sigma\sqrt{t}}, \qquad d_2 = d_1 - \sigma\sqrt{t}$$
where
σ
specifies the volatility of the underlying asset.
σ²
specifies the variance of the rate of return.
For the special case of t=0, the following equation is true:
$$\mathrm{PUT} = \max\left((E - F),\, 0\right)$$
For information about the basics of pricing, see “Using Pricing Functions” on page 303.
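The put price differs from the call price only by the discounted difference between the exercise price and the future price. The Python sketch below is not SAS code; the function name is hypothetical, and the call price is passed in as a number (here, the BLACKCLPRC example results) instead of being recomputed.

```python
from math import exp

def black_put_from_call(call: float, E: float, t: float, F: float, r: float) -> float:
    """Put price from a known call price: PUT = CALL + exp(-r*t) * (E - F)."""
    return call + exp(-r * t) * (E - F)

# Call prices taken from the BLACKCLPRC examples for the same arguments.
print(black_put_from_call(65.335687119, 1000, 0.5, 950, 4))
print(black_put_from_call(17.880939441, 7500, 0.9, 950, 3))
```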
Comparisons The BLACKPTPRC function calculates put prices for European options on futures, based on the Black model. The BLACKCLPRC function calculates call prices for European options on futures, based on the Black model. These functions return a scalar value.
Examples SAS Statements
Results ----+----1----+-----2--
a=blackptprc(1000, .5, 950, 4, 2); put a;
72.102451281
b=blackptprc(850, 2.5, 125, 3, 1); put b;
0.4136352354
c=blackptprc(7500, .9, 950, 3, 2); put c;
458.07704789
d=blackptprc(5000, -.5, 237, 3, 2); put d;
0
See Also Function: “BLACKCLPRC Function” on page 406
BLKSHCLPRC Function Calculates call prices for European options on stocks, based on the Black-Scholes model. Category:
Financial
Syntax BLKSHCLPRC(E, t, S, r, sigma)
Arguments E
is a non-missing, positive value that specifies the exercise price. Requirement: Specify E and S in the same units. t
is a non-missing value that specifies the time to maturity. S
is a non-missing, positive value that specifies the share price. Requirement: Specify S and E in the same units. r
is a non-missing, positive fraction that specifies the risk-free interest rate for period t. Requirement: Specify a value for r for the same time period as the unit of t. sigma
is a non-missing, positive fraction that specifies the volatility of the underlying asset. Requirement: Specify a value for sigma for the same time period as the unit of t.
Details The BLKSHCLPRC function calculates the call prices for European options on stocks, based on the Black-Scholes model. The function is based on the following relationship:
$$\mathrm{CALL} = S\,N(d_1) - E\,N(d_2)\,e^{-rt}$$
where
S
is a non-missing, positive value that specifies the share price.
N
specifies the cumulative normal density function.
E
is a non-missing, positive value that specifies the exercise price of the option.
$$d_1 = \frac{\ln\left(\frac{S}{E}\right) + \left(r + \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}}, \qquad d_2 = d_1 - \sigma\sqrt{t}$$
where
t
specifies the time to expiration.
r
specifies the risk-free interest rate for period t.
σ
specifies the volatility (the square root of the variance).
σ²
specifies the variance of the rate of return.
For the special case of t=0, the following equation is true:
$$\mathrm{CALL} = \max\left((S - E),\, 0\right)$$
For information about the basics of pricing, see “Using Pricing Functions” on page 303.
Comparisons The BLKSHCLPRC function calculates the call prices for European options on stocks, based on the Black-Scholes model. The BLKSHPTPRC function calculates the put prices for European options on stocks, based on the Black-Scholes model. These functions return a scalar value.
Examples SAS Statements
Results ----+----1----+-----2--
a=blkshclprc(1000, .5, 950, 4, 2); put a;
831.05008469
b=blkshclprc(850, 2.5, 125, 3, 1); put b;
124.53035232
c=blkshclprc(7500, .9, 950, 3, 2); put c;
719.40891129
d=blkshclprc(5000, -.5, 237, 3, 2); put d;
0
See Also Function: “BLKSHPTPRC Function” on page 412
BLKSHPTPRC Function Calculates put prices for European options on stocks, based on the Black-Scholes model. Category:
Financial
Syntax BLKSHPTPRC(E, t, S, r, sigma)
Arguments E
is a non-missing, positive value that specifies the exercise price. Requirement:
Specify E and S in the same units.
t
is a non-missing value that specifies the time to maturity. S
is a non-missing, positive value that specifies the share price. Requirement:
Specify S and E in the same units.
r
is a non-missing, positive fraction that specifies the risk-free interest rate for period t. Requirement:
Specify a value for r for the same time period as the unit of t.
sigma
is a non-missing, positive fraction that specifies the volatility of the underlying asset. Requirement:
Specify a value for sigma for the same time period as the unit of t.
Details The BLKSHPTPRC function calculates the put prices for European options on stocks, based on the Black-Scholes model. The function is based on the following relationship:
$$\mathrm{PUT} = \mathrm{CALL} - S + E\,e^{-rt}$$
where
S
is a non-missing, positive value that specifies the share price.
E
is a non-missing, positive value that specifies the exercise price of the option.
$$d_1 = \frac{\ln\left(\frac{S}{E}\right) + \left(r + \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}}, \qquad d_2 = d_1 - \sigma\sqrt{t}$$
where
t
specifies the time to expiration.
r
specifies the risk-free interest rate for period t.
σ
specifies the volatility (the square root of the variance).
σ²
specifies the variance of the rate of return.
For the special case of t=0, the following equation is true:
$$\mathrm{PUT} = \max\left((E - S),\, 0\right)$$
For information about the basics of pricing, see “Using Pricing Functions” on page 303.
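The Black-Scholes call and put relationships can be evaluated together, because the put price is defined in terms of the call price. The Python sketch below is not SAS code; the function names are hypothetical, the cumulative normal function is built from math.erf, and negative values of t are rejected rather than modeled.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def blksh_call(E: float, t: float, S: float, r: float, sigma: float) -> float:
    """Call price for a European option on a stock (Black-Scholes)."""
    if t < 0:
        raise ValueError("negative time to maturity is not modeled in this sketch")
    if t == 0:
        return max(S - E, 0.0)  # special case stated for t=0
    d1 = (log(S / E) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return S * norm_cdf(d1) - E * norm_cdf(d2) * exp(-r * t)

def blksh_put(E: float, t: float, S: float, r: float, sigma: float) -> float:
    """Put price from the call price: PUT = CALL - S + E * exp(-r*t)."""
    if t == 0:
        return max(E - S, 0.0)
    return blksh_call(E, t, S, r, sigma) - S + E * exp(-r * t)

print(blksh_call(1000, 0.5, 950, 4, 2))
print(blksh_put(1000, 0.5, 950, 4, 2))
```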
Comparisons The BLKSHPTPRC function calculates the put prices for European options on stocks, based on the Black-Scholes model. The BLKSHCLPRC function calculates the call prices for European options on stocks, based on the Black-Scholes model. These functions return a scalar value.
Examples
SAS Statements
Results ----+----1----+-----2--
a=blkshptprc(1000, .5, 950, 4, 2); put a;
16.385367922
b=blkshptprc(850, 1.2, 125, 3, 1); put b;
1.426971358
c=blkshptprc(7500, .9, 950, 3, 2); put c;
273.45025684
d=blkshptprc(5000, -.5, 237, 3, 2); put d;
0
See Also Function: “BLKSHCLPRC Function” on page 410
BLSHIFT Function Returns the bitwise logical left shift of two arguments. Category:
Bitwise Logical Operations
Syntax BLSHIFT(argument-1,argument-2)
Arguments
argument-1
specifies a numeric constant, variable, or expression. Range: between 0 and 2^32 - 1, inclusive
argument-2
specifies a numeric constant, variable, or expression. Range:
0 to 31, inclusive
Details If either argument contains a missing value, then the function returns a missing value and sets _ERROR_ equal to 1.
Examples SAS Statements
Results
x=blshift(07x,2); put x=hex.;
x=0000001C
BNOT Function Returns the bitwise logical NOT of an argument. Category:
Bitwise Logical Operations
Syntax BNOT(argument)
Arguments argument
specifies a numeric constant, variable, or expression. Range: between 0 and 2^32 - 1, inclusive
Details If the argument contains a missing value, then the function returns a missing value and sets _ERROR_ equal to 1.
Examples SAS Statements
Results
x=bnot(0F000000Fx); put x=hex.;
x=0FFFFFF0
BOR Function Returns the bitwise logical OR of two arguments. Category: Bitwise Logical Operations
Syntax BOR(argument-1,argument-2)
Arguments argument-1, argument-2
specifies a numeric constant, variable, or expression. Range: between 0 and 2^32 - 1, inclusive
Details If either argument contains a missing value, then the function returns a missing value and sets _ERROR_ equal to 1.
Examples SAS Statements
Results
x=bor(01x,0F4x); put x=hex.;
x=000000F5
BRSHIFT Function Returns the bitwise logical right shift of two arguments. Category:
Bitwise Logical Operations
Syntax BRSHIFT(argument-1, argument-2)
Arguments
argument-1
specifies a numeric constant, variable, or expression. Range: between 0 and 2^32 - 1, inclusive
argument-2
specifies a numeric constant, variable, or expression. Range:
0 to 31, inclusive
Details If either argument contains a missing value, then the function returns a missing value and sets _ERROR_ equal to 1.
Examples SAS Statements
Results
x=brshift(01Cx,2); put x=hex.;
x=00000007
BXOR Function Returns the bitwise logical EXCLUSIVE OR of two arguments. Category:
Bitwise Logical Operations
Syntax BXOR(argument-1, argument-2)
Arguments argument-1, argument-2
specifies a numeric constant, variable, or expression. Range: between 0 and 2^32 - 1, inclusive
Details If either argument contains a missing value, then the function returns a missing value and sets _ERROR_ equal to 1.
Examples SAS Statements
Results
x=bxor(03x,01x); put x=hex.;
x=00000002
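The examples for the bitwise functions in this group (BLSHIFT, BNOT, BOR, BRSHIFT, and BXOR) can be reproduced with 32-bit unsigned arithmetic. The Python sketch below is not SAS code; the function names only mirror the SAS names, and masking the results of BLSHIFT and BNOT to 32 bits is an assumption based on the documented argument range.

```python
MASK32 = 0xFFFFFFFF  # 2**32 - 1, the largest value in the documented range

def _check(x: int) -> int:
    """Reject values outside the documented range 0 to 2**32 - 1."""
    if not 0 <= x <= MASK32:
        raise ValueError("argument must be between 0 and 2**32 - 1")
    return x

def _check_shift(n: int) -> int:
    """Reject shift counts outside the documented range 0 to 31."""
    if not 0 <= n <= 31:
        raise ValueError("shift count must be between 0 and 31")
    return n

def blshift(x: int, n: int) -> int:
    return (_check(x) << _check_shift(n)) & MASK32  # assumed to stay within 32 bits

def brshift(x: int, n: int) -> int:
    return _check(x) >> _check_shift(n)

def bnot(x: int) -> int:
    return ~_check(x) & MASK32  # flip all 32 bits

def bor(x: int, y: int) -> int:
    return _check(x) | _check(y)

def bxor(x: int, y: int) -> int:
    return _check(x) ^ _check(y)

for value in (blshift(0x07, 2), bnot(0xF000000F), bor(0x01, 0xF4),
              brshift(0x1C, 2), bxor(0x03, 0x01)):
    print(f"{value:08X}")  # eight hexadecimal digits, as in the put x=hex. results
```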
BYTE Function Returns one character in the ASCII or the EBCDIC collating sequence. Category: Character Restriction: “I18N Level 0” on page 305 See: BYTE Function in the documentation for your operating environment.
Syntax BYTE (n)
Arguments n
specifies an integer that represents a specific ASCII or EBCDIC character. Range: 0–255
Details Length of Returned Variable In a DATA step, if the BYTE function returns a value to a variable that has not previously been assigned a length, then that variable is assigned a length of 1. ASCII and EBCDIC Collating Sequences For EBCDIC collating sequences, n is between 0 and 255. For ASCII collating sequences, the characters that correspond to values
between 0 and 127 represent the standard character set. Other ASCII characters that correspond to values between 128 and 255 are available on certain ASCII operating environments, but the information those characters represent varies with the operating environment.
Examples
SAS Statements          Results
                        ASCII                   EBCDIC
                        ----+----1----+----2    ----+----1----+----2
x=byte(80); put x;      P                       &
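As a further illustration, BYTE can be paired with the RANK function, which returns the position of a character in the collating sequence. The sketch below is an assumption-labeled example (the variable names n, c, and back are illustrative); the characters that it writes depend on whether the operating environment uses ASCII or EBCDIC.

```sas
data _null_;
   do n = rank('A') to rank('A') + 5;
      c    = byte(n);   /* c has no prior length, so it is assigned a length of 1 */
      back = rank(c);   /* RANK maps the character back to its position */
      put n= c= back=;
   end;
run;
```

In each iteration, back should equal n, because RANK reverses the mapping that BYTE performs.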
See Also Functions: “COLLATE Function” on page 568 “RANK Function” on page 1048
CALL ALLCOMB Routine
Generates all combinations of the values of n variables taken k at a time in a minimal change order.
Category: Combinatorial
Syntax CALL ALLCOMB(count, k, variable-1, …, variable-n);
Arguments count
specifies an integer variable that is assigned from 1 to the number of combinations in a loop. k
specifies an integer constant, variable, or expression between 1 and n, inclusive, that specifies the number of items in each combination. variable
specifies either all numeric variables, or all character variables that have the same length. The values of these variables are permuted. Requirement: Initialize these variables before calling the ALLCOMB routine.
Restriction: Specify no more than 33 items. If you need to find combinations of more than 33 items, use the CALL ALLCOMBI routine.
Tip: After calling the ALLCOMB routine, the first k variables contain the values in one combination.
Details
CALL ALLCOMB Processing
Use the CALL ALLCOMB routine in a loop where the first argument to CALL ALLCOMB accepts each integral value from 1 to the number of combinations, and where k is constant. The number of combinations can be computed by using the COMB function. On the first call, the argument types and lengths are checked for consistency. On each subsequent call, the values of two variables are interchanged. If you call the ALLCOMB routine with the first argument out of sequence, the results are not useful. In particular, if you initialize the variables and then immediately call ALLCOMB with a first argument of j, for example, then you will not get the jth combination (except when j is 1). To get the jth combination, you must call ALLCOMB j times, with the first argument taking values from 1 through j in that exact order.
Using the CALL ALLCOMB Routine with Macros
You can call the ALLCOMB routine when you use the %SYSCALL macro. In this case, the variable arguments are not required to be the same type or length. If %SYSCALL identifies an argument as numeric, then %SYSCALL reformats the returned value. If an error occurs during the execution of the CALL ALLCOMB routine, then both of the following values are set:
• &SYSERR is assigned a value that is greater than 4.
• &SYSINFO is assigned a value that is less than –100.
If there are no errors, then &SYSERR is set to zero, and &SYSINFO is set to one of the following values:
3 0 if count=1 3 j if the values of variable-j and variable-k were interchanged, where j);
Arguments count
specifies an integer variable that ranges from 1 to the number of permutations. variable
specifies either all numeric variables, or all character variables that have the same length. The values of these variables are permuted. Requirement: Initialize these variables before you call the ALLPERM routine. Restriction: Specify no more than 18 variables.
Details
CALL ALLPERM Processing
Use the CALL ALLPERM routine in a loop where the first argument to CALL ALLPERM takes each integral value from 1 to the number of permutations. On the first call, the argument types and lengths are checked for consistency. On each subsequent call, the values of two consecutive variables are interchanged.
Note: You can compute the number of permutations by using the PERM function. See “PERM Function” on page 974 for more information.
If you call the ALLPERM routine and the first argument is out of sequence, the results are not useful. In particular, if you initialize the variables and then immediately call the ALLPERM routine with a first argument of K, for example, your result will not be the Kth permutation (except when K is 1). To get the Kth permutation, you must call the ALLPERM routine K times, with the first argument taking values from 1 through K in that exact order.
ALLPERM always produces N! permutations even if some of the variables have equal values or missing values. If you want to generate only the distinct permutations when there are equal values, or if you want to omit missing values from the permutations, use the LEXPERM function instead.
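The loop pattern described above can be sketched as follows. This is a minimal illustration, not a definitive program: the array name x and its three character values are assumptions chosen for the example, and the FACT function is used here to compute N! for the loop bound.

```sas
data _null_;
   array x[3] $3 ('ant' 'bee' 'cat');   /* initialized before the first call */
   n = dim(x);
   nperm = fact(n);                      /* N! permutations are always produced */
   do count = 1 to nperm;                /* first argument runs 1, 2, ..., N! in order */
      call allperm(count, of x[*]);
      put count 5. +2 x[*];
   end;
run;
```

Because each call after the first interchanges two consecutive variables, the loop must not skip or reorder values of count.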
Using the CALL ALLPERM Routine with Macros
You can call the ALLPERM routine when you use the %SYSCALL macro. In this case, the variable arguments are not required to be the same type or length. If %SYSCALL identifies an argument as numeric, then %SYSCALL reformats the returned value. If an error occurs during the execution of the CALL ALLPERM routine, then both of the following values are set:
• &SYSERR is assigned a value that is greater than 4.
• &SYSINFO is assigned a value that is less than -100.
If there are no errors, then &SYSERR is set to zero, and &SYSINFO is set to one of the following values:
• 0 if count=1
• J if 1);
Arguments result
specifies a character variable. Restriction: The CALL CATT routine accepts only a character variable as a valid argument for result. Do not use a constant or a SAS expression because CALL CATT is unable to update these arguments. item
specifies a constant, variable, or expression, either character or numeric. If item is numeric, then its value is converted to a character string using the BESTw. format. In this case, leading blanks are removed and SAS does not write a note to the log.
Details
The CALL CATT routine returns the result in the first argument, result. The routine appends the values of the arguments that follow to result. If the length of result is not large enough to contain the entire result, SAS does the following:
• writes a warning message to the log stating that the result was truncated
• writes a note to the log that shows the location of the function call and lists the argument that caused the truncation, except in SQL or in a WHERE clause
• sets _ERROR_ to 1 in the DATA step, except in a WHERE clause
The CALL CATT routine removes leading and trailing blanks from numeric arguments after it formats the numeric value with the BESTw. format.
Comparisons The results of the CALL CATS, CALL CATT, and CALL CATX routines are usually equivalent to statements that use the concatenation operator (||) and the TRIM and LEFT functions. However, using the CALL CATS, CALL CATT, and CALL CATX routines is faster than using TRIM and LEFT. The following table shows statements that are equivalent to CALL CATS, CALL CATT, and CALL CATX. The variables X1 through X4 specify character variables, and SP specifies a separator, such as a blank or comma.
CALL Routine                  Equivalent Statement
CALL CATS(OF X1-X4);          X1=TRIM(LEFT(X1))||TRIM(LEFT(X2))||TRIM(LEFT(X3))||TRIM(LEFT(X4));
CALL CATT(OF X1-X4);          X1=TRIM(X1)||TRIM(X2)||TRIM(X3)||TRIM(X4);
CALL CATX(SP, OF X1-X4); *    X1=TRIM(LEFT(X1))||SP||TRIM(LEFT(X2))||SP||TRIM(LEFT(X3))||SP||TRIM(LEFT(X4));
* If any of the arguments are blank, the results that are produced by CALL CATX differ slightly from the results that are produced by the concatenated code. In this case, CALL CATX omits the corresponding separator. For example, CALL CATX("+",newvar,"X"," ", "Z"," "); produces X+Z.
Examples
The following example shows how the CALL CATT routine concatenates strings.
   data _null_;
      length answer $ 36;
      x='Athens is t ';
      y='he Olym ';
      z='pic site for 2004. ';
      call catt(answer,x,y,z);
      put answer;
   run;

The following line is written to the SAS log:
   ----+----1----+----2----+----3----+----4
   Athens is the Olympic site for 2004.
See Also Functions and CALL Routines: “CALL CATS Routine” on page 427 “CALL CATX Routine” on page 430 “CAT Function” on page 526 “CATQ Function” on page 528 “CATS Function” on page 532 “CATT Function” on page 534 “CATX Function” on page 537
CALL CATX Routine Removes leading and trailing blanks, inserts delimiters, and returns a concatenated character string.
Category: Character
Syntax CALL CATX(delimiter, result< , item-1 , … item-n>);
Arguments delimiter
specifies a character string that is used as a delimiter between concatenated strings. result
specifies a character variable. Restriction: The CALL CATX routine accepts only a character variable as a valid argument for result. Do not use a constant or a SAS expression because CALL CATX is unable to update these arguments. item
specifies a constant, variable, or expression, either character or numeric. If item is numeric, then its value is converted to a character string using the BESTw. format. In this case, SAS does not write a note to the log.
Details
The CALL CATX routine returns the result in the second argument, result. The routine appends the values of the arguments that follow to result. If the length of result is not large enough to contain the entire result, SAS does the following:
• writes a warning message to the log stating that the result was truncated
• writes a note to the log that shows the location of the function call and lists the argument that caused the truncation, except in SQL or in a WHERE clause
• sets _ERROR_ to 1 in the DATA step, except in a WHERE clause
The CALL CATX routine removes leading and trailing blanks from numeric arguments after formatting the numeric value with the BESTw. format.
Comparisons The results of the CALL CATS, CALL CATT, and CALL CATX routines are usually equivalent to statements that use the concatenation operator (||) and the TRIM and LEFT functions. However, using the CALL CATS, CALL CATT, and CALL CATX routines is faster than using TRIM and LEFT. The following table shows statements that are equivalent to CALL CATS, CALL CATT, and CALL CATX. The variables X1 through X4 specify character variables, and SP specifies a delimiter, such as a blank or comma.
CALL Routine                  Equivalent Statement
CALL CATS(OF X1-X4);          X1=TRIM(LEFT(X1))||TRIM(LEFT(X2))||TRIM(LEFT(X3))||TRIM(LEFT(X4));
CALL CATT(OF X1-X4);          X1=TRIM(X1)||TRIM(X2)||TRIM(X3)||TRIM(X4);
CALL CATX(SP, OF X1-X4); *    X1=TRIM(LEFT(X1))||SP||TRIM(LEFT(X2))||SP||TRIM(LEFT(X3))||SP||TRIM(LEFT(X4));
* If any of the arguments are blank, the results that are produced by CALL CATX differ slightly from the results that are produced by the concatenated code. In this case, CALL CATX omits the corresponding delimiter. For example, CALL CATX("+",newvar,"X"," ", "Z"," "); produces X+Z.
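The footnote's case can be tried directly. The following sketch wraps the footnote's call in a DATA step; the length of 10 for newvar is an assumption made only so that the result fits without truncation.

```sas
data _null_;
   length newvar $ 10;     /* result must be a character variable */
   call catx('+', newvar, 'X', ' ', 'Z', ' ');
   put newvar=;            /* per the footnote, blank items and their delimiters are omitted */
run;
```

According to the footnote, the value of newvar should be X+Z, whereas the equivalent concatenation with TRIM and LEFT would keep a delimiter for every blank argument.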
Examples
The following example shows how the CALL CATX routine concatenates strings.
   data _null_;
      length answer $ 50;
      separator='%%$%%';
      x='Athens is t ';
      y='he Olym ';
      z=' pic site for 2004. ';
      call catx(separator,answer,x,y,z);
      put answer;
   run;

The following line is written to the SAS log:
   ----+----1----+----2----+----3----+----4----+----5
   Athens is t%%$%%he Olym%%$%%pic site for 2004.
See Also Functions and CALL Routines: “CALL CATS Routine” on page 427 “CALL CATT Routine” on page 429 “CAT Function” on page 526 “CATQ Function” on page 528 “CATS Function” on page 532 “CATT Function” on page 534 “CATX Function” on page 537
CALL COMPCOST Routine Sets the costs of operations for later use by the COMPGED function
Category: Character
Restriction: Use with the COMPGED function
Interaction: When invoked by the %SYSCALL macro statement, CALL COMPCOST removes quotation marks from its arguments. For more information, see “Using CALL Routines and the %SYSCALL Macro Statement” on page 304.
Syntax CALL COMPCOST(operation-1, value-1 );
Arguments operation
is a character constant, variable, or expression that specifies an operation that is performed by the COMPGED function. value
is a numeric constant, variable, or expression that specifies the cost of the operation that is indicated by the preceding argument. Restriction: Must be an integer that ranges from -32767 to 32767, or a missing value
Details Computing the Cost of Operations
Each argument that specifies an operation must have a value that is a character string. The character string corresponds to one of the terms that is used to denote an operation that the COMPGED function performs. See “Computing the Generalized Edit Distance” on page 576 to view a table of operations that the COMPGED function uses. The character strings that specify operations can be in uppercase, lowercase, or mixed case. Blanks are ignored. Each character string must end with an equal sign (=). Valid values for operations, and the default cost of the operations are listed in the following table.
Operation         Default Cost
APPEND=           very large
BLANK=            very large
DELETE=           100
DOUBLE=           very large
FDELETE=          equal to DELETE
FINSERT=          equal to INSERT
FREPLACE=         equal to REPLACE
INSERT=           100
MATCH=            0
PUNCTUATION=      very large
REPLACE=          100
SINGLE=           very large
SWAP=             very large
TRUNCATE=         very large
If an operation does not appear in the call to the COMPCOST routine, or if the operation appears and is followed by a missing value, then that operation is assigned a default cost. A “very large” cost indicates a cost that is sufficiently large that the COMPGED function will not use the corresponding operation. After your program calls the COMPCOST routine, the costs that are specified remain in effect until your program calls the COMPCOST routine again, or until the step that contains the call to COMPCOST terminates.
Abbreviating Character Strings
You can abbreviate character strings. That is, you can use the first one or more letters of a specific operation rather than use the entire term. You must, however, use as many letters as necessary to uniquely identify the term. For example, you can specify the INSERT= operation as “in=”, and the REPLACE= operation as “r=”. To specify the DELETE= or the DOUBLE= operation, you must use the first two letters because both DELETE= and DOUBLE= begin with “d”. The character string must always end with an equal sign.
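The abbreviation rules, together with the rule that a missing value restores the default cost, can be sketched as follows. The cost values 20 and 30 and the variable names d1 and d2 are assumptions chosen for illustration.

```sas
data _null_;
   /* 'in=' abbreviates INSERT=; 'de=' needs two letters to distinguish it from DOUBLE= */
   call compcost('in=', 20, 'de=', 30);
   d1 = compged('xbaboon', 'baboon');

   /* a missing value assigns the default cost to the operation again */
   call compcost('in=', ., 'de=', .);
   d2 = compged('xbaboon', 'baboon');

   put d1= d2=;
run;
```

The costs set by each call stay in effect only until the next call to COMPCOST or until the DATA step ends.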
Examples
The following example calls the COMPCOST routine to compute the generalized edit distance for the operations that are specified.
   options pageno=1 nodate linesize=80 pagesize=60;

   data test;
      length String $8 Operation $40;
      if _n_ = 1 then call compcost('insert=',10,'DEL=',11,'r=', 12);
      input String Operation;
      GED=compged(string, 'baboon');
      datalines;
   baboon  match
   xbaboon insert
   babon   delete
   baXoon  replace
   ;

   proc print data=test label;
      label GED='Generalized Edit Distance';
      var String Operation GED;
   run;
The following output shows the results.
Output 4.12 Generalized Edit Distance Based on Operation

                     The SAS System                     1

                                   Generalized
   Obs    String     Operation    Edit Distance
    1     baboon     match               0
    2     xbaboon    insert             10
    3     babon      delete             11
    4     baXoon     replace            12
See Also Functions: “COMPGED Function” on page 575 “COMPARE Function” on page 571 “COMPLEV Function” on page 580
CALL EXECUTE Routine Resolves the argument, and issues the resolved value for execution at the next step boundary. Category: Macro
Syntax CALL EXECUTE(argument);
Arguments argument
specifies a character expression or a constant that yields a macro invocation or a SAS statement. Argument can be:
• a character string, enclosed in quotation marks.
• the name of a DATA step character variable. Do not enclose the name of the DATA step variable in quotation marks.
• a character expression that the DATA step resolves to a macro text expression or a SAS statement.
Details If argument resolves to a macro invocation, the macro executes immediately and DATA step execution pauses while the macro executes. If argument resolves to a SAS statement or if execution of the macro generates SAS statements, the statement(s) execute after the end of the DATA step that contains the CALL EXECUTE routine. CALL EXECUTE is fully documented in SAS Macro Language: Reference.
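As a minimal sketch of the second case, the following DATA step passes a quoted string that contains ordinary SAS statements. The use of the SASHELP.CLASS sample data set is an assumption made only for illustration.

```sas
data _null_;
   /* the PROC PRINT step is queued and runs after this DATA step ends */
   call execute('proc print data=sashelp.class(obs=3); run;');
run;
```

Because the argument resolves to SAS statements rather than to a macro invocation, nothing is printed until the DATA step that contains the CALL EXECUTE routine has finished.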
CALL GRAYCODE Routine
Generates all subsets of n items in a minimal change order.
Category: Combinatorial
Syntax CALL GRAYCODE(k, numeric-variable-1, ..., numeric-variable-n); CALL GRAYCODE(k, character-variable <, n <, in-out>>);
Arguments
k
specifies a numeric variable. Initialize k to either of the following values prior to executing the CALL GRAYCODE routine:
• a negative number to cause CALL GRAYCODE to initialize the subset to be empty
• the number of items in the initial set indicated by numeric-variable-1 through numeric-variable-n, or character-variable, which must be an integer value between 0 and n inclusive
The value of k is updated when CALL GRAYCODE is executed. The value that is returned is the number of items in the subset.
numeric-variable
specifies numeric variables that have values of 0 or 1 which are updated when CALL GRAYCODE is executed. A value of 1 for numeric-variable-j indicates that the jth item is in the subset. A value of 0 for numeric-variable-j indicates that the jth item is not in the subset. If you assign a negative value to k before you execute CALL GRAYCODE, then you do not need to initialize numeric-variable-1 through numeric-variable-n before executing CALL GRAYCODE unless you want to suppress the note about uninitialized variables. If you assign a value between 0 and n inclusive to k before you execute CALL GRAYCODE, then you must initialize numeric-variable-1 through numeric-variable-n to k values of 1 and n-k values of 0.
character-variable
specifies a character variable that has a length of at least n characters. The first n characters indicate which items are in the subset. By default, an "I" in the jth position indicates that the jth item is in the subset, and an "O" in the jth position indicates that the jth item is out of the subset. You can change the two characters by specifying the in-out argument. If you assign a negative value to k before you execute CALL GRAYCODE, then you do not need to initialize character-variable before executing CALL GRAYCODE unless you want to suppress the note about an uninitialized variable. If you assign a value between 0 and n inclusive to k before you execute CALL GRAYCODE, then you must initialize character-variable to k characters that indicate an item is in the subset, and n-k characters that indicate an item is out of the subset.
n
specifies a numeric constant, variable, or expression. By default, n is the length of character-variable. in-out
specifies a character constant, variable, or expression. The default value is "IO". The first character is used to indicate that an item is in the subset. The second character is used to indicate that an item is out of the subset.
Details Using CALL GRAYCODE in a DATA Step
When you execute the CALL GRAYCODE routine with a negative value of k, the subset is initialized to be empty. When you execute the CALL GRAYCODE routine with an integer value of k between 0 and n inclusive, one item is either added to the subset or removed from the subset, and the value of k is updated to equal the number of items in the subset. To generate all subsets of n items, you can initialize k to a negative value and execute CALL GRAYCODE in a loop that iterates 2**n times. If you want to start with a non-empty subset, then initialize k to be the number of items in the subset, initialize the other arguments to specify the desired initial subset, and execute CALL GRAYCODE in a loop that iterates 2**n-1 times. The sequence of subsets that are generated by CALL GRAYCODE is cyclical, so you can begin with any subset you want.
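The first pattern described above (a negative initial k and a loop of 2**n iterations) can be sketched with numeric variables as follows. The array name x and the choice of three items are assumptions made for illustration.

```sas
data _null_;
   array x[3];              /* need not be initialized because k starts negative */
   n = dim(x);
   k = -1;                  /* negative k: the first call makes the subset empty */
   nsubs = 2**n;
   do i = 1 to nsubs;
      call graycode(k, of x[*]);
      put i 5. +3 k= +2 x[*];
   end;
run;
```

Each iteration after the first adds or removes exactly one item, so consecutive log lines should differ in a single 0 or 1. The log can also contain a note about uninitialized variables, as mentioned in the description of numeric-variable.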
Using the CALL GRAYCODE Routine with Macros
You can call the GRAYCODE routine when you use the %SYSCALL macro. Differences exist when you use CALL GRAYCODE in a DATA step and when you use the routine with macros. The following list describes usage with macros:
• All arguments must be initialized to nonblank values.
• If you use the character-variable argument, then it must be initialized to a nonblank, nonnumeric character string that contains at least n characters.
• If you use the in-out argument, then it must be initialized to a string that contains two characters that are not blanks, digits, decimal points, or plus and minus signs.
If %SYSCALL identifies an argument as being the wrong type, or if %SYSCALL is unable to identify the type of argument, then &SYSERR and &SYSINFO are not set. Otherwise, if an error occurs during the execution of the CALL GRAYCODE routine, then both of the following values are set:
• &SYSERR is assigned a value that is greater than 4.
• &SYSINFO is assigned a value that is less than –100.
If there are no errors, then &SYSERR is set to zero, and &SYSINFO is set to one of the following values:
• 0 if the value of k on input is negative
• the index of the item that was added or removed from the subset if the value of k on input is a valid nonnegative integer.
Examples Example 1: Using a Character Variable and Positive Initial k with CALL GRAYCODE
The following example uses the CALL GRAYCODE routine to generate subsets in a minimal change order.
   data _null_;
      x='++++';
      n=length(x);
      k=countc(x, '+');
      put '    1' +3 k= +2 x=;
      nsubs=2**n;
      do i=2 to nsubs;
         call graycode(k, x, n, '+-');
         put i 5. +3 k= +2 x=;
      end;
   run;

SAS writes the following output to the log:
       1   k=4  x=++++
       2   k=3  x=-+++
       3   k=2  x=-+-+
       4   k=3  x=++-+
       5   k=2  x=+--+
       6   k=1  x=---+
       7   k=0  x=----
       8   k=1  x=+---
       9   k=2  x=++--
      10   k=1  x=-+--
      11   k=2  x=-++-
      12   k=3  x=+++-
      13   k=2  x=+-+-
      14   k=1  x=--+-
      15   k=2  x=--++
      16   k=3  x=+-++
Example 2: Using %SYSCALL with Numeric Variables and Negative k
The following example uses the %SYSCALL macro with numeric variables to generate subsets in a minimal change order.
   %macro test;
      %let n=3;
      %let x1=.;
      %let x2=.;
      %let x3=.;
      %let k=-1;
      %let nsubs=%eval(2**&n + 1);
      %put nsubs=&nsubs k=&k x: &x1 &x2 &x3;
      %do j=1 %to &nsubs;
         %syscall graycode(k, x1, x2, x3);
         %put &j: k=&k x: &x1 &x2 &x3 sysinfo=&sysinfo;
      %end;
   %mend;
   %test;

SAS writes the following output to the log:
   nsubs=9 k=-1 x: . . .
   1: k=0 x: 0 0 0 sysinfo=0
   2: k=1 x: 1 0 0 sysinfo=1
   3: k=2 x: 1 1 0 sysinfo=2
   4: k=1 x: 0 1 0 sysinfo=1
   5: k=2 x: 0 1 1 sysinfo=3
   6: k=3 x: 1 1 1 sysinfo=1
   7: k=2 x: 1 0 1 sysinfo=2
   8: k=1 x: 0 0 1 sysinfo=1
   9: k=0 x: 0 0 0 sysinfo=3
Example 3: Using %SYSCALL with a Character Variable and Negative k
The following example uses the %SYSCALL macro with a character variable to generate subsets in a minimal change order.
   %macro test(n);
      %*** Initialize the character variable to a
           sufficiently long nonblank, nonnumeric value. ;
      %let x=%sysfunc(repeat(_, &n-1));
      %let k=-1;
      %let nsubs=%eval(2**&n + 1);
      %put nsubs=&nsubs k=&k x="&x";
      %do j=1 %to &nsubs;
         %syscall graycode(k, x, n);
         %put &j: k=&k x="&x" sysinfo=&sysinfo;
      %end;
   %mend;
   %test(3);

SAS writes the following output to the log:
   nsubs=9 k=-1 x="___"
   1: k=0 x="OOO" sysinfo=0
   2: k=1 x="IOO" sysinfo=1
   3: k=2 x="IIO" sysinfo=2
   4: k=1 x="OIO" sysinfo=1
   5: k=2 x="OII" sysinfo=3
   6: k=3 x="III" sysinfo=1
   7: k=2 x="IOI" sysinfo=2
   8: k=1 x="OOI" sysinfo=1
   9: k=0 x="OOO" sysinfo=3
See Also Functions: “GRAYCODE Function” on page 770
CALL IS8601_CONVERT Routine Converts an ISO 8601 interval to datetime and duration values, or converts datetime and duration values to an ISO 8601 interval. Category: Date and Time
Syntax CALL IS8601_CONVERT( convert-from, convert-to, < from-variables>, ,
Arguments convert-from
specifies a keyword in single quotation marks that indicates whether the source for the conversion is an interval, a datetime and duration value, or a duration value. convert-from can have one of the following values: ’intvl’
specifies that the source value for the conversion is an interval value.
’dt/du’
specifies that the source value for the conversion is a datetime/duration value.
’du/dt’
specifies that the source value for the conversion is a duration/datetime value.
’dt/dt’
specifies that the source value for the conversion is a datetime/datetime value.
’du’
specifies that the source value for the conversion is a duration value.
convert-to
specifies a keyword in single quotation marks that indicates the results of the conversion. convert-to can have one of the following values: ’intvl’
specifies to create an interval value.
’dt/du’
specifies to create a datetime/duration interval.
’du/dt’
specifies to create a duration/datetime interval.
’dt/dt’
specifies to create a datetime/datetime interval.
’du’
specifies to create a duration.
’start’
specifies to create a value that is the beginning datetime or duration of an interval value.
’end’
specifies to create a value that is the ending datetime or duration of an interval value.
from-variable
specifies one or two variables that contain the source value. Specify one variable for an interval value and two variables, one each, for datetime and duration values. The datetime and duration values are interval components where the first value is the beginning value of the interval and the second value is the ending value of the interval.
Requirement: An interval variable must be at least a 16-byte character variable whose value is determined by reading the value using either the $N8601B informat or the $N8601E informat, or is an interval value returned from invoking the CALL IS8601_CONVERT routine.
Requirement: A datetime value must be either a SAS datetime value or an 8-byte character value that is read by the $N8601B informat or the $N8601E informat, or by invoking the CALL IS8601_CONVERT routine.
Requirement: A duration value must be a numeric value that represents the number of seconds in the duration or an 8-byte character value whose value is determined by reading the value using either the $N8601B informat or the $N8601E informat, or by invoking the CALL IS8601_CONVERT routine.
to-variable
specifies one or two variables that contain converted values. Specify one variable for an interval value and two variables, one each, for datetime and duration values.
Requirement: The interval variable must be at least a 16-byte character variable.
Tip: The datetime and duration variables can be either numeric or character. To avoid losing precision of a numeric value, the length of a numeric variable needs to be at least 8 characters. Datetime and duration character variables must be at least 16 bytes; they are padded with blank characters for values that are less than the length of the variable.
date_time_replacements
specifies date or time component values to use when a month, day, or time component is omitted from an interval, datetime, or duration value. date_time_replacements is specified as a series of numbers separated by commas to represent, in this order, the year, month, day, hour, minute, and second. Components of date_time_replacements can be omitted only in the reverse order: seconds, minutes, hours, day, and month. If no substitute values are specified, the conversion is done using default values.
Defaults: The following are default values for omitted date and time components:
   month     1
   day       1
   hour      0
   minute    0
   second    0
Requirements: A year component must be part of the datetime or duration value, and therefore is not valid in date_time_replacements. A comma is required as a placeholder for the year in date_time_replacements. For example, in the replacement value string ,9,4,,2, the first comma is a placeholder for a year value.
Examples
This DATA step uses the CALL IS8601_CONVERT routine to do the following:
• create an interval by using datetime and duration values
• create datetime and duration values from an interval that was created using the CALL IS8601_CONVERT routine
• create an interval from datetime and duration values, using replacement values for omitted date and time components in the datetime value
For easier reading, numeric variables end with an N and character variables end with a C.
   data _null_;

      /** declare variable length and type                        **/
      /** Character datetime and duration values must be at least **/
      /** 16 characters. In order not to lose precision, the      **/
      /** numeric datetime value has a length of 8.               **/

      length dtN duN 8 dtC duC $16 intervalC $32;

      /** assign a numeric datetime value and a **/
      /** character duration value.             **/

      dtN='15Sep2008:09:00:00'dt;
      duC=input('P2y3m4dT5h6m7s', $n8601b.);
      put dtN=;
      put duC=;

      /** Create an interval from a datetime and duration value  **/
      /** and format it using the ISO 8601 extended notation for **/
      /** character values.                                      **/

      call is8601_convert('dt/du', 'intvl', dtN, duC, intervalC);
      put '** Character interval created from datetime and duration values **/';
      put intervalC $n8601e.;
      put ' ';

      /** Create numeric datetime and duration values from an interval **/
      /** and format it using the ISO 8601 extended notation for       **/
      /** numeric values.                                              **/

      call is8601_convert('intvl', 'dt/du', intervalC, dtN, duN);
      put '** Character datetime and duration created from an interval **/';
      put dtN=;
      put duN=;
      put ' ';

      /** assign a new datetime value with omitted components **/

      dtC=input('2009---15T10:-:-', $n8601b.);
      put '** This datetime is a character value. **';
      put dtC $n8601h.;
      put ' ';

      /** Create an interval by reading in a datetime value      **/
      /** with omitted date and time components. Use replacement **/
      /** values for the month, minutes, and seconds.            **/

      call is8601_convert('du/dt', 'intvl', duC, dtC, intervalC,,7,,,35,45);
      put '** Interval created using a datetime with omitted values,  **';
      put '** inserting replacement values for month (7), minute (35) **';
      put '** seconds (45).                                           **';
      put intervalC $n8601e.;
      put ' ';
   run;
The following output appears in the SAS log:
   dtN=1537088400
   duC=0002304050607FFC
   ** Character interval created from datetime and duration values **/
   2008-09-15T09:00:00.000/P2Y3M4DT5H6M7S

   ** Character datetime and duration created from an interval **/
   dtN=1537088400
   duN=71211967

   ** This datetime is a character value. **
   2009---15T10:-:-

   ** Interval created using a datetime with omitted values,  **
   ** inserting replacement values for month (7), minute (35) **
   ** seconds (45).                                           **
   P2Y3M4DT5H6M7S/2009-07-15T10:35:45

   NOTE: DATA statement used (Total process time):
         real time 0.04 seconds
         cpu time 0.03 seconds
CALL LABEL Routine Assigns a variable label to a specified character variable. Category: Variable Control
Syntax CALL LABEL(variable-1,variable-2);
Arguments
variable-1
specifies any SAS variable. If variable-1 does not have a label, the variable name is assigned as the value of variable-2. variable-2
specifies any SAS character variable. Variable labels can be up to 256 characters long; therefore, the length of variable-2 should be at least 256 characters to avoid truncating variable labels.
Note: To conserve space, you should set the length of variable-2 to the length of the label for variable-1, if it is known.
Details The CALL LABEL routine assigns the label of the variable-1 variable to the character variable variable-2.
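A minimal sketch of a single call follows. The variable name height_cm and its label text are assumptions invented for this illustration.

```sas
data _null_;
   length lab $256;                      /* long enough for any variable label */
   height_cm = 180;
   label height_cm = 'Height in centimeters';
   call label(height_cm, lab);           /* copies the label of height_cm into lab */
   put lab=;
run;
```

If the LABEL statement were removed, lab would be expected to contain the variable name instead, as described for variable-1.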
Examples
This example uses the CALL LABEL routine with array references to assign the labels of all variables in the data set OLD as values of the variable LAB in data set NEW:
   data new;
      set old;
         /* lab is not in either array */
      length lab $256;
         /* all character variables in old */
      array abc{*} _character_;
         /* all numeric variables in old */
      array def{*} _numeric_;
      do i=1 to dim(abc);
            /* get label of character variable */
         call label(abc{i},lab);
            /* write label to an observation */
         output;
      end;
      do j=1 to dim(def);
            /* get label of numeric variable */
         call label(def{j},lab);
            /* write label to an observation */
         output;
      end;
      stop;
      keep lab;
   run;
See Also
Function:
“VLABEL Function” on page 1173
CALL LEXCOMB Routine
Generates all distinct combinations of the non-missing values of n variables taken k at a time in lexicographic order.
Category: Combinatorial
Interaction: When invoked by the %SYSCALL macro statement, CALL LEXCOMB removes the quotation marks from its arguments. For more information, see “Using CALL Routines and the %SYSCALL Macro Statement” on page 304.

Syntax
CALL LEXCOMB(count, k, variable-1, …, variable-n);
Arguments
count
specifies an integer value that is assigned values from 1 to the number of combinations in a loop.
k
specifies an integer constant, variable, or expression between 1 and n, inclusive, that specifies the number of items in each combination.
variable
specifies either all numeric variables, or all character variables that have the same length. The values of these variables are permuted.
Requirement: Initialize these variables before you call the LEXCOMB routine.
Tip: After calling LEXCOMB, the first k variables contain the values in one combination.
Details
The Basics
Use the CALL LEXCOMB routine in a loop where the first argument to CALL LEXCOMB takes each integral value from 1 to the number of distinct combinations of the non-missing values of the variables. In each call to LEXCOMB within this loop, k should have the same value.

Number of Combinations
When all of the variables have non-missing, unequal values, then the number of combinations is COMB(n,k). If the number of variables that have missing values is m, and all the non-missing values are unequal, then LEXCOMB produces COMB(n-m,k) combinations because the missing values are omitted from the combinations. When some of the variables have equal values, the exact number of combinations is difficult to compute. If you cannot compute the exact number of combinations, use the LEXCOMB function instead of the CALL LEXCOMB routine.

CALL LEXCOMB Processing
On the first call to the LEXCOMB routine, the following actions occur:
- The argument types and lengths are checked for consistency.
- The m missing values are assigned to the last m arguments.
- The n-m non-missing values are assigned in ascending order to the first n-m arguments following count.
On subsequent calls, up to and including the last combination, the next distinct combination of the non-missing values is generated in lexicographic order. If you call the LEXCOMB routine with the first argument out of sequence, then the results are not useful. In particular, if you initialize the variables and then immediately call the LEXCOMB routine with a first argument of j, for example, you will not get the jth combination (except when j is 1). To get the jth combination, you must call the LEXCOMB routine j times, with the first argument taking values from 1 through j in that exact order.
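The following is a minimal sketch of how the COMB(n-m,k) rule and the first-call processing fit together. The array values are illustrative assumptions, and CMISS is used here only as one convenient way to count the missing values. With one missing value among four variables and k=2, the loop runs COMB(3,2)=3 times. According to the processing rules above, the first call should sort the non-missing values and move the missing value to the last argument, so the three lines written to the log should be ant cat, ant dog, and cat dog.

```sas
data _null_;
      /* one of the four values is missing (blank) */
   array x[4] $3 ('dog' ' ' 'ant' 'cat');
   n = dim(x);
      /* m = number of missing values among the variables */
   m = cmiss(of x[*]);
   k = 2;
      /* missing values are omitted, so use n-m rather than n */
   ncomb = comb(n - m, k);
   do j = 1 to ncomb;
         /* the first argument takes the values 1, 2, ..., ncomb in order */
      call lexcomb(j, k, of x[*]);
      put j 5. +3 x1-x2;
   end;
run;
```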
Using the CALL LEXCOMB Routine with Macros
You can call the LEXCOMB routine when you use the %SYSCALL macro statement. In this case, the variable arguments are not required to be the same length, but they are required to be the same type. If %SYSCALL identifies an argument as numeric, then %SYSCALL reformats the returned value.
If an error occurs during the execution of the CALL LEXCOMB routine, then both of the following values are set:
- &SYSERR is assigned a value that is greater than 4.
- &SYSINFO is assigned a value that is less than -100.
If there are no errors, then &SYSERR is set to zero, and &SYSINFO is set to one of the following values:
- 1 if count=1 and at least one variable has a non-missing value
- 1 if the value of variable-1 changed
- j if variable-1 through variable-i did not change, but variable-j did change, where j=i+1
- -1 if all distinct combinations have already been generated
Comparisons
The CALL LEXCOMB routine generates all distinct combinations of the non-missing values of n variables taken k at a time in lexicographic order. The CALL ALLCOMB routine generates all combinations of the values of n variables taken k at a time in a minimal change order.

Examples
Example 1: Using CALL LEXCOMB in a DATA Step
The following example calls the LEXCOMB routine to generate distinct combinations in lexicographic order.

   data _null_;
      array x[5] $3 ('ant' 'bee' 'cat' 'dog' 'ewe');
      n=dim(x);
      k=3;
      ncomb=comb(n,k);
      do j=1 to ncomb;
         call lexcomb(j, k, of x[*]);
         put j 5. +3 x1-x3;
      end;
   run;
SAS writes the following output to the log:

    1   ant bee cat
    2   ant bee dog
    3   ant bee ewe
    4   ant cat dog
    5   ant cat ewe
    6   ant dog ewe
    7   bee cat dog
    8   bee cat ewe
    9   bee dog ewe
   10   cat dog ewe
Example 2: Using CALL LEXCOMB with Macros
The following is an example of the CALL LEXCOMB routine that is used with macros. The output includes values for the &SYSINFO macro variable.

   %macro test;
      %let x1=ant;
      %let x2=baboon;
      %let x3=baboon;
      %let x4=hippopotamus;
      %let x5=zebra;
      %let k=2;
      %let ncomb=%sysfunc(comb(5,&k));
      %do j=1 %to &ncomb;
         %syscall lexcomb(j, k, x1, x2, x3, x4, x5);
         %let jfmt=%qsysfunc(putn(&j, 5. ));
         %let pad=%qsysfunc(repeat(%str( ), 20-%length(&x1 &x2)));
         %put &jfmt: &x1 &x2 &pad sysinfo=&sysinfo;
         %if &sysinfo < 0 %then %let j=%eval(&ncomb+1);
      %end;
   %mend;
   %test
SAS writes the following output to the log:

   1: ant baboon                  sysinfo=1
   2: ant hippopotamus            sysinfo=2
   3: ant zebra                   sysinfo=2
   4: baboon baboon               sysinfo=1
   5: baboon hippopotamus         sysinfo=2
   6: baboon zebra                sysinfo=2
   7: hippopotamus zebra          sysinfo=1
   8: hippopotamus zebra          sysinfo=-1
See Also
Functions and CALL Routines:
“LEXCOMB Function” on page 861
“CALL ALLCOMB Routine” on page 418
CALL LEXCOMBI Routine
Generates all combinations of the indices of n objects taken k at a time in lexicographic order.
Category: Combinatorial

Syntax
CALL LEXCOMBI(n, k, index-1, …, index-k);

Arguments
n
is a numeric constant, variable, or expression that specifies the total number of objects.
k
is a numeric constant, variable, or expression that specifies the number of objects in each combination.
index
is a numeric variable that contains indices of the objects in the combination that is returned. Indices are integers between 1 and n, inclusive.
Tip: If index-1 is missing or zero, then the CALL LEXCOMBI routine initializes the indices to index-1=1 through index-k=k. Otherwise, CALL LEXCOMBI creates a new combination by removing one index from the combination and adding another index.
Details
CALL LEXCOMBI Processing
Before the first call to the LEXCOMBI routine, complete one of the following tasks:
- Set index-1 equal to zero or to a missing value.
- Initialize index-1 through index-k to distinct integers between 1 and n inclusive.
The number of combinations of n objects taken k at a time can be computed as COMB(n,k). To generate all combinations of n objects taken k at a time, call LEXCOMBI in a loop that executes COMB(n,k) times.
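The loop described above can be sketched as follows for n=4 and k=2, where COMB(4,2)=6. The array name and the use of CALL MISSING to set index-1 to a missing value are illustrative choices, not requirements of the routine. Because index-1 is missing on the first call, the tip under Arguments implies that LEXCOMBI starts from the indices 1 through k.

```sas
data _null_;
      /* i1 and i2 hold the indices of the current combination */
   array i[2];
   n = 4;
   k = dim(i);
      /* a missing index-1 asks LEXCOMBI to initialize the indices */
   call missing(of i[*]);
      /* COMB(4,2) = 6 iterations generate every combination once */
   do j = 1 to comb(n, k);
      call lexcombi(n, k, of i[*]);
      put j= i[*];
   end;
run;
```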
Using the CALL LEXCOMBI Routine with Macros
If you call the LEXCOMBI routine from the macro processor with %SYSCALL, then you must initialize all arguments to numeric values. %SYSCALL reformats the values that are returned.
If an error occurs during the execution of the CALL LEXCOMBI routine, then both of the following values are set:
- &SYSERR is assigned a value that is greater than 4.
- &SYSINFO is assigned a value that is less than -100.
If there are no errors, then &SYSERR is set to zero, and &SYSINFO is set to one of the following values:
- 1 if the value of index-1 changed
- j if index-1 through index-i did not change, but index-j did change, where j=i+1
- -1 if all distinct combinations have already been generated

Comparisons
The CALL LEXCOMBI routine generates all combinations of the indices of n objects taken k at a time in lexicographic order. The CALL ALLCOMBI routine generates all combinations of the indices of n objects taken k at a time in a minimal change order.
Examples
Example 1: Using the CALL LEXCOMBI Routine with the DATA Step
The following example uses the CALL LEXCOMBI routine to generate combinations of indices in lexicographic order.

   data _null_;
      array x[5] $3 ('ant' 'bee' 'cat' 'dog' 'ewe');
      array c[3] $3;
      array i[3];
      n=dim(x);
      k=dim(i);
      i[1]=0;
      ncomb=comb(n,k);
      do j=1 to ncomb;
         call lexcombi(n, k, of i[*]);
         do h=1 to k;
            c[h]=x[i[h]];
         end;
         put @4 j= @10 'i= ' i[*] +3 'c= ' c[*];
      end;
   run;
SAS writes the following output to the log:

   j=1   i= 1 2 3   c= ant bee cat
   j=2   i= 1 2 4   c= ant bee dog
   j=3   i= 1 2 5   c= ant bee ewe
   j=4   i= 1 3 4   c= ant cat dog
   j=5   i= 1 3 5   c= ant cat ewe
   j=6   i= 1 4 5   c= ant dog ewe
   j=7   i= 2 3 4   c= bee cat dog
   j=8   i= 2 3 5   c= bee cat ewe
   j=9   i= 2 4 5   c= bee dog ewe
   j=10  i= 3 4 5   c= cat dog ewe
Example 2: Using the CALL LEXCOMBI Routine with Macros and Displaying the Return Code
The following example uses the CALL LEXCOMBI routine with macros. The output includes values for the &SYSINFO macro variable.

   %macro test;
      %let x1=0;
      %let x2=0;
      %let x3=0;
      %let n=5;
      %let k=3;
      %let ncomb=%sysfunc(comb(&n,&k));
      %do j=1 %to &ncomb+1;
         %syscall lexcombi(n,k,x1,x2,x3);
         %let jfmt=%qsysfunc(putn(&j,5.));
         %let pad=%qsysfunc(repeat(%str( ),6-%length(&x1 &x2 &x3)));
         %put &jfmt: &x1 &x2 &x3 &pad sysinfo=&sysinfo;
      %end;
   %mend;
   %test

SAS writes the following output to the log:

    1: 1 2 3    sysinfo=1
    2: 1 2 4    sysinfo=3
    3: 1 2 5    sysinfo=3
    4: 1 3 4    sysinfo=2
    5: 1 3 5    sysinfo=3
    6: 1 4 5    sysinfo=2
    7: 2 3 4    sysinfo=1
    8: 2 3 5    sysinfo=3
    9: 2 4 5    sysinfo=2
   10: 3 4 5    sysinfo=1
   11: 3 4 5    sysinfo=-1
See Also
Functions and CALL Routines:
“CALL LEXCOMB Routine” on page 444
“CALL ALLCOMBI Routine” on page 421
CALL LEXPERK Routine
Generates all distinct permutations of the non-missing values of n variables taken k at a time in lexicographic order.
Category: Combinatorial
Interaction: When invoked by the %SYSCALL macro statement, CALL LEXPERK removes the quotation marks from its arguments. For more information, see “Using CALL Routines and the %SYSCALL Macro Statement” on page 304.

Syntax
CALL LEXPERK(count, k, variable-1, …, variable-n);

Arguments
count
specifies an integer variable that is assigned a value from 1 to the number of permutations in a loop.
k
specifies an integer constant, variable, or expression between 1 and n, inclusive, that specifies the number of items in each permutation.
variable
specifies either all numeric variables, or all character variables that have the same length. The values of these variables are permuted.
Requirement: Initialize these variables before you call the LEXPERK routine.
Tip: After calling LEXPERK, the first k variables contain the values in one permutation.

Details
The Basics
Use the CALL LEXPERK routine in a loop where the first argument to CALL LEXPERK accepts each integral value from 1 to the number of distinct permutations of k non-missing values of the variables. In each call to LEXPERK within this loop, k should have the same value.
Number of Permutations
When all of the variables have non-missing, unequal values, the number of permutations is PERM(n,k). If the number of variables that have missing values is m, and all the non-missing values are unequal, CALL LEXPERK produces PERM(n-m,k) permutations because the missing values are omitted from the permutations. When some of the variables have equal values, the exact number of permutations is difficult to compute. If you cannot compute the exact number of permutations, use the LEXPERK function instead of the CALL LEXPERK routine.

CALL LEXPERK Processing
On the first call to the LEXPERK routine, the following actions occur:
- The argument types and lengths are checked for consistency.
- The m missing values are assigned to the last m arguments.
- The n-m non-missing values are assigned in ascending order to the first n-m arguments following count.
On subsequent calls, up to and including the last permutation, the next distinct permutation of k non-missing values is generated in lexicographic order. If you call the LEXPERK routine with the first argument out of sequence, then the results are not useful. In particular, if you initialize the variables and then immediately call the LEXPERK routine with a first argument of j, for example, you will not get the jth permutation (except when j is 1). To get the jth permutation, you must call LEXPERK j times, with the first argument taking values from 1 through j in that exact order.
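As a worked check of the permutation counts, the PERM function corresponds to the standard count of ordered selections. The numeric cases below are illustrative: five distinct non-missing values taken three at a time give 60 permutations, and the same call with one missing value (m=1) would give 24.

```latex
\[
\mathrm{PERM}(n,k) = \frac{n!}{(n-k)!},
\qquad
\mathrm{PERM}(5,3) = \frac{5!}{2!} = \frac{120}{2} = 60 .
\]
\[
\mathrm{PERM}(n-m,\,k) = \frac{(n-m)!}{(n-m-k)!},
\qquad
n=5,\; m=1,\; k=3:\quad \frac{4!}{1!} = 24 .
\]
```

The first value is the upper bound to use for the loop when all five variables are distinct and non-missing.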
Using the CALL LEXPERK Routine with Macros
You can call the LEXPERK routine when you use the %SYSCALL macro statement. In this case, the variable arguments are not required to be the same length, but they are required to be the same type. If %SYSCALL identifies an argument as numeric, then %SYSCALL reformats the returned value.
If an error occurs during the execution of the CALL LEXPERK routine, then both of the following values are set:
- &SYSERR is assigned a value that is greater than 4.
- &SYSINFO is assigned a value that is less than -100.
If there are no errors, then &SYSERR is set to zero, and &SYSINFO is set to one of the following values:
- 1 if count=1 and at least one variable has a non-missing value
- 1 if count>1 and the value of variable-1 changed
- j if count>1 and variable-1 through variable-i did not change, but variable-j did change, where j=i+1
- -1 if all distinct permutations were already generated

Comparisons
The CALL LEXPERK routine generates all distinct permutations of the non-missing values of n variables taken k at a time in lexicographic order. The CALL ALLPERM routine generates all permutations of the values of several variables in a minimal change order.
Examples
Example 1: Using CALL LEXPERK in a DATA Step
The following is an example of the CALL LEXPERK routine.

   data _null_;
      array x[5] $3 ('V' 'W' 'X' 'Y' 'Z');
      n=dim(x);
      k=3;
      nperm=perm(n,k);
      do j=1 to nperm;
         call lexperk(j, k, of x[*]);
         put j 5. +3 x1-x3;
      end;
   run;
SAS writes the following output to the log:

    1   V W X
    2   V W Y
    3   V W Z
    4   V X W
    5   V X Y
    6   V X Z
    7   V Y W
    8   V Y X
    9   V Y Z
   10   V Z W
   11   V Z X
   12   V Z Y
   13   W V X
   14   W V Y
   15   W V Z
   16   W X V
   17   W X Y
   18   W X Z
   19   W Y V
   20   W Y X
   21   W Y Z
   22   W Z V
   23   W Z X
   24   W Z Y
   25   X V W
   26   X V Y
   27   X V Z
   28   X W V
   29   X W Y
   30   X W Z
   31   X Y V
   32   X Y W
   33   X Y Z
   34   X Z V
   35   X Z W
   36   X Z Y
   37   Y V W
   38   Y V X
   39   Y V Z
   40   Y W V
   41   Y W X
   42   Y W Z
   43   Y X V
   44   Y X W
   45   Y X Z
   46   Y Z V
   47   Y Z W
   48   Y Z X
   49   Z V W
   50   Z V X
   51   Z V Y
   52   Z W V
   53   Z W X
   54   Z W Y
   55   Z X V
   56   Z X W
   57   Z X Y
   58   Z Y V
   59   Z Y W
   60   Z Y X
Example 2: Using CALL LEXPERK with Macros
The following is an example of the CALL LEXPERK routine that is used with macros. The output includes values for the &SYSINFO macro variable.

   %macro test;
      %let x1=ant;
      %let x2=baboon;
      %let x3=baboon;
      %let x4=hippopotamus;
      %let x5=zebra;
      %let k=2;
      %let nperk=%sysfunc(perm(5,&k));
      %do j=1 %to &nperk;
         %syscall lexperk(j, k, x1, x2, x3, x4, x5);
         %let jfmt=%qsysfunc(putn(&j,5.));
         %let pad=%qsysfunc(repeat(%str( ),20-%length(&x1 &x2)));
         %put &jfmt: &x1 &x2 &pad sysinfo=&sysinfo;
         %if &sysinfo

For a > 1, an acceptance-rejection method by Cheng is used (Cheng, 1977; see “References” on page 1213). For a ≤ 1, an acceptance-rejection method by Fishman is used (Fishman, 1978; see “References” on page 1213). For a discussion and example of an effective use of the random number CALL routines, see “Starting, Stopping, and Restarting a Stream” on page 317.
Comparisons
The CALL RANGAM routine gives greater control of the seed and random number streams than does the RANGAM function.

Examples
This example uses the CALL RANGAM routine:

   options nodate pageno=1 linesize=80 pagesize=60;
      /* The shape parameter a=1 is an assumed value; CALL RANGAM */
      /* and RANGAM both require a shape argument after the seed. */
   data u1(keep=x);
      seed = 104;
      do i = 1 to 5;
         call rangam(seed, 1, x);
         output;
      end;
      call symputx('seed', seed);
   run;
   data u2(keep=x);
      seed = &seed;
      do i = 1 to 5;
         call rangam(seed, 1, x);
         output;
      end;
   run;
   data all;
      set u1 u2;
      z = rangam(104, 1);
   run;
   proc print label;
      label x = 'Separate Streams' z = 'Single Stream';
   run;
Output 4.17 Output from the CALL RANGAM Routine

                     The SAS System                     1

   Obs    Separate Streams    Single Stream
     1         1.44347           1.44347
     2         0.11740           0.11740
     3         0.54175           0.54175
     4         0.02280           0.02280
     5         0.16645           0.16645
     6         0.21711           0.21711
     7         0.75538           0.75538
     8         1.21760           1.21760
     9         1.72273           1.72273
    10         0.08021           0.08021

Output 4.18 The RANGAM Example

                     The SAS System                     1

    i      Seed_1        Seed_2     Seed_3      X1         X2         X3
    1    1404437564    1404437564     45      1.30569    1.30569    1.30569
    2    1326029789    1326029789     45      1.87514    1.87514    1.87514
    3    1988843719    1988843719     45      1.71597    1.71597    1.71597
    4      50049159      50049159     45      1.59304    1.59304    1.59304
    5     802575599            18     18      0.43342    0.43342    0.43342
    6     100573943     991271755     18      1.11812    1.32646    1.11812
    7    1986749826    1437043694     18      0.68415    0.88806    0.68415
    8      52428589     959908645     18      1.62296    2.46091    1.62296
    9    1216356463    1225034217     18      2.26455    4.06596    2.26455
   10     805366679     425626811     18      2.16723    6.94703    2.16723
Changing Seed_2 for the CALL RANGAM statement, when I=5, forces the stream of the variates for X2 to deviate from the stream of the variates for X1. Changing Seed_3 on the RANGAM function, however, has no effect.
See Also
Functions:
“RAND Function” on page 1034
“RANGAM Function” on page 1046
CALL RANNOR Routine
Returns a random variate from a normal distribution.
Category: Random Number

Syntax
CALL RANNOR(seed,x);

Arguments
seed
is the seed value. A new value for seed is returned each time CALL RANNOR is executed.
Range: seed < 2**31 - 1
Note: If seed ≤ 0, the time of day is used to initialize the seed stream.
See: “Seed Values” on page 306 and “Comparison of Seed Values in Random-Number Functions and CALL Routines” on page 310 for more information about seed values
x
is a numeric variable. A new value for the random variate x is returned each time CALL RANNOR is executed.
Details
The CALL RANNOR routine updates seed and returns a variate x that is generated from a normal distribution, with mean 0 and variance 1.
By adjusting the seeds, you can force streams of variates to agree or disagree for some or all of the observations in the same, or in subsequent, DATA steps.
The CALL RANNOR routine uses the Box-Muller transformation of RANUNI uniform variates.
For a discussion and example of an effective use of the random number CALL routines, see “Starting, Stopping, and Restarting a Stream” on page 317.
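For reference, the textbook form of the Box-Muller transformation is sketched below. The symbols u_1, u_2, z_1, and z_2 are notation introduced here, and the source does not say which member of the pair CALL RANNOR returns or how it consumes the uniform stream, so this is only the general technique, not a description of the SAS implementation.

```latex
\[
u_1,\, u_2 \sim \mathrm{Uniform}(0,1) \ \text{independent},
\]
\[
z_1 = \sqrt{-2 \ln u_1}\,\cos(2\pi u_2),
\qquad
z_2 = \sqrt{-2 \ln u_1}\,\sin(2\pi u_2),
\]
\[
z_1,\, z_2 \sim N(0,1) \ \text{independent}.
\]
```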
Comparisons
The CALL RANNOR routine gives greater control of the seed and random number streams than does the RANNOR function.

Examples
This example uses the CALL RANNOR routine:

   options pageno=1 ls=80 ps=64 nodate;
   data u1(keep=x);
      seed = 104;
      do i = 1 to 5;
         call rannor(seed, X);
         output;
      end;
      call symputx('seed', seed);
   run;
   data u2(keep=x);
      seed = &seed;
      do i = 1 to 5;
         call rannor(seed, X);
         output;
      end;
   run;
   data all;
      set u1 u2;
      z = rannor(104);
   run;
   proc print label;
      label x = 'Separate Streams' z = 'Single Stream';
   run;
Output 4.19 Output from the CALL RANNOR Routine

                     The SAS System                     1

   Obs    Separate Streams    Single Stream
     1         1.30390           1.30390
     2         1.03049           1.03049
     3         0.19491           0.19491
     4        -0.34987          -0.34987
     5         1.64273           1.64273
     6        -1.75842          -1.75842
     7         0.75080           0.75080
     8         0.94375           0.94375
     9         0.02436           0.02436
    10        -0.97256          -0.97256
See Also
Functions:
“RAND Function” on page 1034
“RANNOR Function” on page 1049
CALL RANPERK Routine
Randomly permutes the values of the arguments, and returns a permutation of k out of n values.
Category: Combinatorial

Syntax
CALL RANPERK(seed, k, variable-1<, variable-2, ...>);

Arguments
seed
is a numeric variable that contains the random number seed. For more information about seeds, see “Seed Values” on page 306.
k
is the number of values that you want to have in the random permutation.
variable
specifies all numeric variables, or all character variables that have the same length. k values of these variables are randomly permuted.

Details
Using CALL RANPERK with Macros
You can call the RANPERK routine when you use the %SYSCALL macro statement. In this case, the variable arguments are not required to be the same type or length. If %SYSCALL identifies an argument as numeric, then %SYSCALL reformats the returned value.
If an error occurs during the execution of the CALL RANPERK routine, then both of the following values are set:
- &SYSERR is assigned a value that is greater than 4.
- &SYSINFO is assigned a value that is less than -100.
If there are no errors, then &SYSERR and &SYSINFO are set to zero.
Examples
Example 1: Using CALL RANPERK in a DATA Step
The following example shows how to generate random permutations of given values by using the CALL RANPERK routine.

   data _null_;
      array x x1-x5 (1 2 3 4 5);
      seed = 1234567890123;
      do n=1 to 10;
         call ranperk(seed, 3, of x1-x5);
         put seed= @20 ' x= ' x1-x3;
      end;
   run;
Output 4.20 Log Output from Using the CALL RANPERK Routine in a DATA Step

   seed=1332351321     x= 5 4 2
   seed=829042065      x= 4 1 3
   seed=767738639      x= 5 1 2
   seed=1280236105     x= 3 2 5
   seed=670350431      x= 4 3 5
   seed=1956939964     x= 3 1 2
   seed=353939815      x= 4 2 1
   seed=1996660805     x= 3 4 5
   seed=1835940555     x= 5 1 4
   seed=910897519      x= 5 1 2
Example 2: Using CALL RANPERK with a Macro
The following is an example of the CALL RANPERK routine that is used with macros.

   %macro test;
      %let x1=ant;
      %let x2=-.1234;
      %let x3=1e10;
      %let x4=hippopotamus;
      %let x5=zebra;
      %let k=3;
      %let seed = 12345;
      %do j=1 %to 10;
         %syscall ranperk(seed, k, x1, x2, x3, x4, x5);
         %put j=&j &x1 &x2 &x3;
      %end;
   %mend;
   %test;
Output 4.21 Output from Using the CALL RANPERK Routine with a Macro

   j=1 -0.1234 hippopotamus zebra
   j=2 hippopotamus -0.1234 10000000000
   j=3 hippopotamus ant zebra
   j=4 -0.1234 zebra ant
   j=5 -0.1234 ant hippopotamus
   j=6 10000000000 hippopotamus ant
   j=7 10000000000 hippopotamus ant
   j=8 ant 10000000000 -0.1234
   j=9 zebra -0.1234 10000000000
   j=10 zebra hippopotamus 10000000000
See Also
Functions and CALL Routines:
“RAND Function” on page 1034
“CALL ALLPERM Routine” on page 423
“CALL RANPERM Routine” on page 489

CALL RANPERM Routine
Randomly permutes the values of the arguments.
Category: Combinatorial

Syntax
CALL RANPERM(seed, variable-1<, variable-2, ...>);

Arguments
seed
is a numeric variable that contains the random number seed. For more information about seeds, see “Seed Values” on page 306.
variable
specifies all numeric variables or all character variables that have the same length. The values of these variables are randomly permuted.

Details
Using CALL RANPERM with Macros
You can call the RANPERM routine when you use the %SYSCALL macro statement. In this case, the variable arguments are not required to be the same type or length. If %SYSCALL identifies an argument as numeric, then %SYSCALL reformats the returned value.
If an error occurs during the execution of the CALL RANPERM routine, then both of the following values are set:
- &SYSERR is assigned a value that is greater than 4.
- &SYSINFO is assigned a value that is less than -100.
If there are no errors, then &SYSERR and &SYSINFO are set to zero.
Examples
Example 1: Using CALL RANPERM in a DATA Step
The following example generates random permutations of given values by using the CALL RANPERM routine.

   data _null_;
      array x x1-x4 (1 2 3 4);
      seed = 1234567890123;
      do n=1 to 10;
         call ranperm(seed, of x1-x4);
         put seed= @20 ' x= ' x1-x4;
      end;
   run;
Output 4.22 Output from Using the CALL RANPERM Routine in a DATA Step

   seed=1332351321     x= 1 3 2 4
   seed=829042065      x= 3 4 2 1
   seed=767738639      x= 4 2 3 1
   seed=1280236105     x= 1 2 4 3
   seed=670350431      x= 2 1 4 3
   seed=1956939964     x= 2 4 3 1
   seed=353939815      x= 4 1 2 3
   seed=1996660805     x= 4 3 1 2
   seed=1835940555     x= 4 3 2 1
   seed=910897519      x= 3 2 1 4
Example 2: Using CALL RANPERM with a Macro
The following is an example of the CALL RANPERM routine that is used with the %SYSCALL macro statement.

   %macro test;
      %let x1=ant;
      %let x2=-.1234;
      %let x3=1e10;
      %let x4=hippopotamus;
      %let x5=zebra;
      %let seed = 12345;
      %do j=1 %to 10;
         %syscall ranperm(seed, x1, x2, x3, x4, x5);
         %put j=&j &x1 &x2 &x3;
      %end;
   %mend;
   %test;
Output 4.23 Output from Using the CALL RANPERM Routine with a Macro

   j=1 zebra ant hippopotamus
   j=2 10000000000 ant -0.1234
   j=3 -0.1234 10000000000 ant
   j=4 hippopotamus ant zebra
   j=5 -0.1234 zebra 10000000000
   j=6 -0.1234 hippopotamus ant
   j=7 zebra ant -0.1234
   j=8 -0.1234 hippopotamus ant
   j=9 ant -0.1234 hippopotamus
   j=10 -0.1234 zebra 10000000000
See Also
Functions and CALL Routines:
“RAND Function” on page 1034
“CALL ALLPERM Routine” on page 423
“CALL RANPERK Routine” on page 487

CALL RANPOI Routine
Returns a random variate from a Poisson distribution.
Category: Random Number

Syntax
CALL RANPOI(seed,m,x);

Arguments
seed
is the seed value. A new value for seed is returned each time CALL RANPOI is executed.
Range: seed < 2**31 - 1
Note: If seed ≤ 0, the time of day is used to initialize the seed stream.
See: “Seed Values” on page 306 and “Comparison of Seed Values in Random-Number Functions and CALL Routines” on page 310 for more information about seed values
m
is a numeric mean parameter.
Range: m ≥ 0
x
is a numeric variable. A new value for the random variate x is returned each time CALL RANPOI is executed.
Details
The CALL RANPOI routine updates seed and returns a variate x that is generated from a Poisson distribution, with mean m.
By adjusting the seeds, you can force streams of variates to agree or disagree for some or all of the observations in the same, or in subsequent, DATA steps.
For m < 85, an inverse transform method applied to a RANUNI uniform variate is used (Fishman, 1976; see “References” on page 1213). For m ≥ 85, the normal approximation of a Poisson random variable is used. To expedite execution, internal variables are calculated only on initial calls (that is, with each new m).
For a discussion and example of an effective use of the random number CALL routines, see “Starting, Stopping, and Restarting a Stream” on page 317.
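The two methods named above can be written in their generic textbook form as follows. The symbol u denotes a uniform variate on (0,1) and z a standard normal variate; both are notation introduced here. The source does not describe how the normal approximation is rounded to an integer or whether a continuity correction is applied, so the second line only states the approximation itself.

```latex
\[
\text{Inverse transform } (m < 85):\qquad
x = \min\Bigl\{\, n \ge 0 \;:\; \sum_{j=0}^{n} \frac{e^{-m}\, m^{j}}{j!} \;\ge\; u \Bigr\},
\qquad u \sim \mathrm{Uniform}(0,1).
\]
\[
\text{Normal approximation } (m \ge 85):\qquad
\mathrm{Poisson}(m) \;\approx\; N(m,\, m),
\qquad\text{that is,}\quad x \approx m + \sqrt{m}\; z,\quad z \sim N(0,1).
\]
```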
Comparisons
The CALL RANPOI routine gives greater control of the seed and random number streams than does the RANPOI function.

Examples
This example uses the CALL RANPOI routine:

   options pageno=1 ls=80 ps=64 nodate;
   data u1(keep=x);
      seed = 104;
      do i = 1 to 5;
         call ranpoi(seed, 2000, x);
         output;
      end;
      call symputx('seed', seed);
   run;
   data u2(keep=x);
      seed = &seed;
      do i = 1 to 5;
         call ranpoi(seed, 2000, x);
         output;
      end;
   run;
   data all;
      set u1 u2;
      z = ranpoi(104, 2000);
   run;
   proc print label;
      label x = 'Separate Streams' z = 'Single Stream';
   run;
Output 4.24 Output from the CALL RANPOI Routine

                     The SAS System                     1

   Obs    Separate Streams    Single Stream
     1          2058               2058
     2          2046               2046
     3          2009               2009
     4          1984               1984
     5          2073               2073
     6          1921               1921
     7          2034               2034
     8          2042               2042
     9          2001               2001
    10          1957               1957
See Also
Functions:
“RAND Function” on page 1034
“RANPOI Function” on page 1050

CALL RANTBL Routine
Returns a random variate from a tabled probability distribution.
Category: Random Number

Syntax
CALL RANTBL(seed,p1,...pi,...pn,x);

Arguments
seed
is the seed value. A new value for seed is returned each time CALL RANTBL is executed.
Range: seed < 2**31 - 1
Note: If seed ≤ 0, the time of day is used to initialize the seed stream.
See: “Seed Values” on page 306 and “Comparison of Seed Values in Random-Number Functions and CALL Routines” on page 310 for more information about seed values
pi
is a numeric SAS value.
Range: 0 ≤ pi ≤ 1 for 0 < i ≤ n
x
is a numeric SAS variable. A new value for the random variate x is returned each time CALL RANTBL is executed.
Details
The CALL RANTBL routine updates seed and returns a variate x generated from the probability mass function defined by p1 through pn.
By adjusting the seeds, you can force streams of variates to agree or disagree for some or all of the observations in the same, or in subsequent, DATA steps.
An inverse transform method applied to a RANUNI uniform variate is used. The CALL RANTBL routine returns these data:

\[
x =
\begin{cases}
1 & \text{with probability } p_1 \\
2 & \text{with probability } p_2 \\
\;\vdots & \\
n & \text{with probability } p_n \\
n+1 & \text{with probability } 1 - \sum_{i=1}^{n} p_i \quad \text{if } \sum_{i=1}^{n} p_i \le 1
\end{cases}
\]
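A minimal sketch of the table above follows. The probabilities 0.2, 0.3, and 0.4, the seed, and the counter array CNT are illustrative assumptions. Because the three probabilities sum to 0.9, the value n+1=4 should be returned with probability 0.1, so the four counts should be roughly proportional to 0.2, 0.3, 0.4, and 0.1.

```sas
data _null_;
      /* cnt1-cnt4 count how often x takes the values 1 through 4 */
   array cnt[4] (0 0 0 0);
   seed = 104;
   do i = 1 to 1000;
         /* p1=0.2, p2=0.3, p3=0.4; the remaining 0.1 goes to x=4 */
      call rantbl(seed, 0.2, 0.3, 0.4, x);
      cnt[x] + 1;
   end;
   put 'Counts for x=1 through x=4: ' cnt1-cnt4;
run;
```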
If, for some index j)
Arguments
modifier
specifies a character constant, variable, or expression in which each non-blank character modifies the action of the CATQ function. Blanks are ignored. You can use the following characters as modifiers:
1 or '
uses single quotation marks when CATQ adds quotation marks to a string.
2 or "
uses double quotation marks when CATQ adds quotation marks to a string.
a or A
adds quotation marks to all of the item arguments.
b or B
adds quotation marks to item arguments that have leading or trailing blanks that are not removed by the S or T modifiers.
c or C
uses a comma as a delimiter.
d or D
indicates that you have specified the delimiter argument.
h or H
uses a horizontal tab as the delimiter.
m or M
inserts a delimiter for every item argument after the first. If you do not use the M modifier, then CATQ does not insert delimiters for item arguments that have a length of zero after processing that is specified by other modifiers. The M modifier can cause delimiters to appear at the beginning or end of the result and can cause multiple consecutive delimiters to appear in the result.
n or N
converts item arguments to name literals when the value does not conform to the usual syntactic conventions for a SAS name. A name literal is a string in quotation marks that is followed by the letter “n” without any intervening blanks. To use name literals in SAS statements, you must specify the SAS option, VALIDVARNAME=ANY.
q or Q
adds quotation marks to item arguments that already contain quotation marks.
s or S
strips leading and trailing blanks from subsequently processed arguments:
- To strip leading and trailing blanks from the delimiter argument, specify the S modifier before the D modifier.
- To strip leading and trailing blanks from the item arguments but not from the delimiter argument, specify the S modifier after the D modifier.
t or T
trims trailing blanks from subsequently processed arguments:
- To trim trailing blanks from the delimiter argument, specify the T modifier before the D modifier.
- To trim trailing blanks from the item arguments but not from the delimiter argument, specify the T modifier after the D modifier.
x or X
converts item arguments to hexadecimal literals when the value contains nonprintable characters.
Tip: If modifier is a constant, enclose it in quotation marks. You can also express modifier as a variable name or an expression.
Tip: The A, B, N, Q, S, T, and X modifiers operate internally to the CATQ function. If an item argument is a variable, then the value of that variable is not changed by CATQ unless the result is assigned to that variable.
delimiter
specifies a character constant, variable, or expression that is used as a delimiter between concatenated strings. If you specify this argument, then you must also specify the D modifier.
item
specifies a constant, variable, or expression, either character or numeric. If item is numeric, then its value is converted to a character string by using the BESTw. format. In this case, leading blanks are removed and SAS does not write a note to the log.
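The following is a minimal sketch of how the modifier, delimiter, and item arguments combine. The item values, the delimiter, and the variable name RESULT are illustrative assumptions. The first call combines the A, C, and 1 modifiers and includes a numeric item; the second supplies a delimiter argument, which requires the D modifier, and places S before D so that the delimiter is stripped as well as the items. No expected output is shown because it is not given in the source.

```sas
data _null_;
   length result $200;
      /* A: quote every item, C: comma delimiter, 1: single quotation marks */
      /* the numeric item 42 is converted to a character string by CATQ     */
   result = catq('AC1', 'ant', 'bee hive', 42);
   put result=;
      /* D: a delimiter argument follows the modifier argument              */
      /* S before D: blanks are stripped from the delimiter and the items   */
   result = catq('SD', ' | ', 'ant', '  bee  ');
   put result=;
run;
```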
Details
Length of Returned Variable
The CATQ function returns a value to a variable or if CATQ is called inside an expression, CATQ returns a value to a temporary buffer. The value that is returned has the following length:
- up to 200 characters in WHERE clauses and in PROC SQL
- up to 32767 characters in the DATA step except in WHERE clauses
- up to 65534 characters when CATQ is called from the macro processor
If the length of the variable or the buffer is not large enough to contain the result of the concatenation, then SAS does the following steps:
- changes the result to a blank value in the DATA step and in PROC SQL
- writes a warning message to the log stating that the result was either truncated or set to a blank value, depending on the calling environment
- writes a note to the log that shows the location of the function call and lists the argument that caused the truncation
- sets _ERROR_ to 1 in the DATA step
If CATQ returns a value in a temporary buffer, then the length of the buffer depends on the calling environment, and the value in the buffer can be truncated after CATQ finishes processing. In this case, SAS does not write a message about the truncation to the log.
The Basics
If you do not use the C, D, or H modifiers, then CATQ uses a blank as a delimiter.
If you specify neither a quotation mark in modifier nor the 1 or 2 modifiers, then CATQ decides independently for each item argument which type of quotation mark to use, if quotation marks are required. The following rules apply:
• CATQ uses single quotation marks for strings that contain an ampersand (&) or percent (%) sign, or that contain more double quotation marks than single quotation marks.
• CATQ uses double quotation marks for all other strings.
The CATQ function initializes the result to a length of zero and then performs the following actions for each item argument:
1. If item is not a character string, then CATQ converts item to a character string by using the BESTw. format and removes leading blanks.
2. If you used the S modifier, then CATQ removes leading blanks from the string.
3. If you used the S or T modifiers, then CATQ removes trailing blanks from the string.
4. CATQ determines whether to add quotation marks based on the following conditions:
   • If you use the X modifier and the string contains control characters, then the string is converted to a hexadecimal literal.
   • If you use the N modifier, then the string is converted to a name literal if either of the following conditions is true:
      - The first character in the string is not an underscore or an English letter.
      - The string contains any character that is not a digit, underscore, or English letter.
   • If you did not use the X or the N modifiers, then CATQ adds quotation marks to the string if any of the following conditions is true:
      - You used the A modifier.
      - You used the B modifier and the string contains leading or trailing blanks that were not removed by the S or T modifiers.
      - You used the Q modifier and the string contains quotation marks.
      - The string contains a substring that equals the delimiter with leading and trailing blanks omitted.
5. For the second and subsequent item arguments, CATQ appends the delimiter to the result if either of the following conditions is true:
   • You used the M modifier.
   • The string has a length greater than zero after it has been processed by the preceding steps.
6. CATQ appends the string to the result.
Comparisons The CATX function is similar to the CATQ function except that CATX does not add quotation marks.
Examples
The following example shows how the CATQ function concatenates strings.
   options ls=110;
   data _null_;
      result1=CATQ(' ',
                   'noblanks',
                   'one blank',
                   12345,
                   ' lots of blanks ');
      result2=CATQ('CS',
                   'Period (.) ',
                   'Ampersand (&) ',
                   'Comma (,) ',
                   'Double quotation marks (") ',
                   ' Leading Blanks');
      result3=CATQ('BCQT',
                   'Period (.) ',
                   'Ampersand (&) ',
                   'Comma (,) ',
                   'Double quotation marks (") ',
                   ' Leading Blanks');
      result4=CATQ('ADT',
                   '#=#',
                   'Period (.) ',
                   'Ampersand (&) ',
                   'Comma (,) ',
                   'Double quotation marks (") ',
                   ' Leading Blanks');
      result5=CATQ('N',
                   'ABC_123 ',
                   '123 ',
                   'ABC 123');
      put (result1-result5) (=/);
   run;
SAS writes the following output to the log.
   result1=noblanks "one blank" 12345 " lots of blanks "
   result2=Period (.),Ampersand (&),"Comma (,)",Double quotation marks ("),Leading Blanks
   result3=Period (.),Ampersand (&),"Comma (,)",'Double quotation marks (")'," Leading Blanks"
   result4="Period (.)"#=#'Ampersand (&)'#=#"Comma (,)"#=#'Double quotation marks (")'#=#" Leading Blanks"
   result5=ABC_123 "123"n "ABC 123"n
See Also
Functions and CALL Routines:
   “CALL CATS Routine” on page 427
   “CALL CATT Routine” on page 429
   “CALL CATX Routine” on page 430
   “CAT Function” on page 526
   “CATS Function” on page 532
   “CATT Function” on page 534
   “CATX Function” on page 537
CATS Function
Removes leading and trailing blanks, and returns a concatenated character string.
Category: Character
Restriction: “I18N Level 0” on page 305
Syntax
CATS(item-1 <, ..., item-n>)
Arguments item
specifies a constant, variable, or expression, either character or numeric. If item is numeric, then its value is converted to a character string by using the BESTw. format. In this case, SAS does not write a note to the log.
Details
Length of Returned Variable
In a DATA step, if the CATS function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. If the concatenation operator (||) returns a value to a variable that has not previously been assigned a length, then that variable is given a length that is the sum of the lengths of the values that are being concatenated.
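The length rule above can be checked with a short DATA step. This is a minimal sketch, not part of the original reference: the variable names are illustrative, and it assumes the VLENGTH function to report the compile-time length of each result variable.

```sas
data _null_;
   a = 'abc';                /* character variable of length 3 */
   b = 'de';                 /* character variable of length 2 */
   s1 = cats(a, b);          /* no prior length: CATS result defaults to 200 bytes */
   s2 = a || b;              /* concatenation operator: length is 3 + 2 = 5 */
   len1 = vlength(s1);       /* compile-time length of s1 */
   len2 = vlength(s2);       /* compile-time length of s2 */
   put len1= len2=;
run;
```

Under the rule stated above, len1 should be 200 and len2 should be 5. Declaring the result variable in a LENGTH statement before the assignment avoids the 200-byte default.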
Length of Returned Variable: Special Cases
The CATS function returns a value to a variable, or returns a value in a temporary buffer. The value that is returned from the CATS function has the following length:
• up to 200 characters in WHERE clauses and in PROC SQL
• up to 32767 characters in the DATA step except in WHERE clauses
• up to 65534 characters when CATS is called from the macro processor
If CATS returns a value in a temporary buffer, the length of the buffer depends on the calling environment, and the value in the buffer can be truncated after CATS finishes processing. In this case, SAS does not write a message about the truncation to the log.
If the length of the variable or the buffer is not large enough to contain the result of the concatenation, SAS does the following:
• changes the result to a blank value in the DATA step and in PROC SQL
• writes a warning message to the log stating that the result was either truncated or set to a blank value, depending on the calling environment
• writes a note to the log that shows the location of the function call and lists the argument that caused the truncation
• sets _ERROR_ to 1 in the DATA step
The CATS function removes leading and trailing blanks from numeric arguments after it formats the numeric value with the BESTw. format.
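The overflow behavior described above can be provoked deliberately. The following sketch is illustrative only; the variable name SHORT and the literal values are assumptions chosen so that the result (10 characters) cannot fit in the 5-byte target.

```sas
data _null_;
   length short $ 5;                  /* target is too small on purpose */
   short = cats('abcdef', 'ghij');    /* result would need 10 bytes */
   put short= _error_=;               /* inspect the value and the error flag */
run;
```

According to the rules above, SHORT should be blank rather than a truncated prefix, _ERROR_ should be 1, and the log should contain the warning and note about the offending argument. The exact wording of those messages is not reproduced here.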
Comparisons
The results of the CAT, CATS, CATT, and CATX functions are usually equivalent to results that are produced by certain combinations of the concatenation operator (||) and the TRIM and LEFT functions. However, the default length for the CAT, CATS, CATT, and CATX functions is different from the length that is obtained when you use the concatenation operator. For more information, see “Length of Returned Variable” on page 532.
Using the CAT, CATS, CATT, and CATX functions is faster than using TRIM and LEFT, and you can use them with the OF syntax for variable lists in calling environments that support variable lists.
The following table shows equivalents of the CAT, CATS, CATT, and CATX functions. The variables X1 through X4 specify character variables, and SP specifies a delimiter, such as a blank or comma.

   Function             Equivalent Code
   CAT(OF X1-X4)        X1||X2||X3||X4
   CATS(OF X1-X4)       TRIM(LEFT(X1))||TRIM(LEFT(X2))||TRIM(LEFT(X3))||TRIM(LEFT(X4))
   CATT(OF X1-X4)       TRIM(X1)||TRIM(X2)||TRIM(X3)||TRIM(X4)
   CATX(SP, OF X1-X4)   TRIM(LEFT(X1))||SP||TRIM(LEFT(X2))||SP||TRIM(LEFT(X3))||SP||TRIM(LEFT(X4))
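The CATS row of the table can be spot-checked as follows. This is a small sketch with made-up values in X1 through X4 (none of them blank), not an exhaustive test of the equivalence.

```sas
data _null_;
   length x1-x4 $ 6;
   x1 = ' ab';  x2 = 'cd ';  x3 = ' ef ';  x4 = 'gh';
   /* function form, using the OF syntax for a variable list */
   r1 = cats(of x1-x4);
   /* equivalent code from the table */
   r2 = trim(left(x1)) || trim(left(x2)) || trim(left(x3)) || trim(left(x4));
   same = (r1 = r2);      /* character comparison ignores trailing blanks */
   put r1= r2= same=;
run;
```

Both results should be abcdefgh, so SAME should be 1. The two variables still differ in declared length: R1 gets the 200-byte default, while R2 gets the sum of the operand lengths.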
Examples
The following example shows how the CATS function concatenates strings.
   data _null_;
      x=' The Olym';
      y='pic Arts Festi';
      z=' val includes works by D ';
      a='ale Chihuly.';
      result=cats(x,y,z,a);
      put result $char.;
   run;
The following line is written to the SAS log:
   ----+----1----+----2----+----3----+----4----+----5----+----6
   The Olympic Arts Festival includes works by Dale Chihuly.
See Also
Functions and CALL Routines:
   “CALL CATS Routine” on page 427
   “CALL CATT Routine” on page 429
   “CALL CATX Routine” on page 430
   “CAT Function” on page 526
   “CATQ Function” on page 528
   “CATT Function” on page 534
   “CATX Function” on page 537
CATT Function
Removes trailing blanks, and returns a concatenated character string.
Category: Character
Restriction: “I18N Level 0” on page 305
Syntax
CATT(item-1 <, ..., item-n>)
Arguments item
specifies a constant, variable, or expression, either character or numeric. If item is numeric, then its value is converted to a character string by using the BESTw. format. In this case, leading blanks are removed and SAS does not write a note to the log.
Details Length of Returned Variable In a DATA step, if the CATT function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. If the concatenation operator (||) returns a value to a variable that has not previously been assigned a length, then that variable is given a length that is the sum of the lengths of the values which are being concatenated.
Length of Returned Variable: Special Cases
The CATT function returns a value to a variable, or returns a value in a temporary buffer. The value that is returned from the CATT function has the following length:
• up to 200 characters in WHERE clauses and in PROC SQL
• up to 32767 characters in the DATA step except in WHERE clauses
• up to 65534 characters when CATT is called from the macro processor
If CATT returns a value in a temporary buffer, the length of the buffer depends on the calling environment, and the value in the buffer can be truncated after CATT finishes processing. In this case, SAS does not write a message about the truncation to the log.
If the length of the variable or the buffer is not large enough to contain the result of the concatenation, SAS does the following:
• changes the result to a blank value in the DATA step and in PROC SQL
• writes a warning message to the log stating that the result was either truncated or set to a blank value, depending on the calling environment
• writes a note to the log that shows the location of the function call and lists the argument that caused the truncation
• sets _ERROR_ to 1 in the DATA step
The CATT function removes leading and trailing blanks from numeric arguments after it formats the numeric value with the BESTw. format.
Comparisons
The results of the CAT, CATS, CATT, and CATX functions are usually equivalent to results that are produced by certain combinations of the concatenation operator (||) and the TRIM and LEFT functions. However, the default length for the CAT, CATS, CATT, and CATX functions is different from the length that is obtained when you use the concatenation operator. For more information, see “Length of Returned Variable” on page 535.
Using the CAT, CATS, CATT, and CATX functions is faster than using TRIM and LEFT, and you can use them with the OF syntax for variable lists in calling environments that support variable lists.
The following table shows equivalents of the CAT, CATS, CATT, and CATX functions. The variables X1 through X4 specify character variables, and SP specifies a delimiter, such as a blank or comma.

   Function             Equivalent Code
   CAT(OF X1-X4)        X1||X2||X3||X4
   CATS(OF X1-X4)       TRIM(LEFT(X1))||TRIM(LEFT(X2))||TRIM(LEFT(X3))||TRIM(LEFT(X4))
   CATT(OF X1-X4)       TRIM(X1)||TRIM(X2)||TRIM(X3)||TRIM(X4)
   CATX(SP, OF X1-X4)   TRIM(LEFT(X1))||SP||TRIM(LEFT(X2))||SP||TRIM(LEFT(X3))||SP||TRIM(LEFT(X4))
Examples
The following example shows how the CATT function concatenates strings.
   data _null_;
      x=' The Olym';
      y='pic Arts Festi';
      z=' val includes works by D ';
      a='ale Chihuly.';
      result=catt(x,y,z,a);
      put result $char.;
   run;
The following line is written to the SAS log. Because CATT removes only trailing blanks, the leading blanks in x and z are preserved:
   ----+----1----+----2----+----3----+----4----+----5----+----6----+----7
    The Olympic Arts Festi val includes works by Dale Chihuly.
See Also
Functions and CALL Routines:
   “CALL CATS Routine” on page 427
   “CALL CATT Routine” on page 429
   “CALL CATX Routine” on page 430
   “CAT Function” on page 526
   “CATQ Function” on page 528
   “CATS Function” on page 532
   “CATX Function” on page 537
CATX Function
Removes leading and trailing blanks, inserts delimiters, and returns a concatenated character string.
Category: Character
Restriction: “I18N Level 0” on page 305
Syntax
CATX(delimiter, item-1 <, ..., item-n>)
Arguments
delimiter
   specifies a character string that is used as a delimiter between concatenated items.
item
   specifies a constant, variable, or expression, either character or numeric. If item is numeric, then its value is converted to a character string by using the BESTw. format. In this case, SAS does not write a note to the log. For more information, see “The Basics” on page 537.
Details The Basics The CATX function first copies item-1 to the result, omitting leading and trailing blanks. Then for each subsequent argument item-i, i=2, …, n, if item-i contains at least one non-blank character, then CATX appends delimiter and item-i to the result, omitting leading and trailing blanks from item-i. CATX does not insert the delimiter at the beginning or end of the result. Blank items do not produce delimiters at the beginning or end of the result, nor do blank items produce multiple consecutive delimiters.
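The treatment of blank items can be illustrated with a short DATA step. This is a minimal sketch with illustrative literals; it is not taken from the original reference.

```sas
data _null_;
   r1 = catx('-', 'a', ' ', 'b');   /* blank item in the middle */
   r2 = catx('-', ' ', 'a', 'b');   /* blank item first */
   r3 = catx('-', 'a', 'b', ' ');   /* blank item last */
   put r1= r2= r3=;
run;
```

Following the rules above, all three results should be a-b: the blank item contributes neither text nor a delimiter, so no leading, trailing, or doubled delimiter appears.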
Length of Returned Variable In a DATA step, if the CATX function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. If the concatenation operator (||) returns a value to a variable that has not previously been assigned a length, then that variable is given a length that is the sum of the lengths of the values which are being concatenated.
Length of Returned Variable: Special Cases
The CATX function returns a value to a variable, or returns a value in a temporary buffer. The value that is returned from the CATX function has the following length:
• up to 200 characters in WHERE clauses and in PROC SQL
• up to 32767 characters in the DATA step except in WHERE clauses
• up to 65534 characters when CATX is called from the macro processor
If CATX returns a value in a temporary buffer, the length of the buffer depends on the calling environment, and the value in the buffer can be truncated after CATX finishes processing. In this case, SAS does not write a message about the truncation to the log.
If the length of the variable or the buffer is not large enough to contain the result of the concatenation, SAS does the following:
• changes the result to a blank value in the DATA step and in PROC SQL
• writes a warning message to the log stating that the result was either truncated or set to a blank value, depending on the calling environment
• writes a note to the log that shows the location of the function call and lists the argument that caused the truncation
• sets _ERROR_ to 1 in the DATA step
Comparisons
The results of the CAT, CATS, CATT, and CATX functions are usually equivalent to results that are produced by certain combinations of the concatenation operator (||) and the TRIM and LEFT functions. However, the default length for the CAT, CATS, CATT, and CATX functions is different from the length that is obtained when you use the concatenation operator. For more information, see “Length of Returned Variable” on page 537.
Using the CAT, CATS, CATT, and CATX functions is faster than using TRIM and LEFT, and you can use them with the OF syntax for variable lists in calling environments that support variable lists.
Note: In the case of variables that have missing values, the concatenation produces different results. See Example 2 on page 539.
The following table shows equivalents of the CAT, CATS, CATT, and CATX functions. The variables X1 through X4 specify character variables, and SP specifies a delimiter, such as a blank or comma.

   Function             Equivalent Code
   CAT(OF X1-X4)        X1||X2||X3||X4
   CATS(OF X1-X4)       TRIM(LEFT(X1))||TRIM(LEFT(X2))||TRIM(LEFT(X3))||TRIM(LEFT(X4))
   CATT(OF X1-X4)       TRIM(X1)||TRIM(X2)||TRIM(X3)||TRIM(X4)
   CATX(SP, OF X1-X4)   TRIM(LEFT(X1))||SP||TRIM(LEFT(X2))||SP||TRIM(LEFT(X3))||SP||TRIM(LEFT(X4))
Examples
Example 1: Concatenating Strings That Have No Missing Values
The following example shows how the CATX function concatenates strings that have no missing values.
   data _null_;
      separator='%%$%%';
      x='The Olympic ';
      y=' Arts Festival ';
      z=' includes works by ';
      a='Dale Chihuly.';
      result=catx(separator,x,y,z,a);
      put result $char.;
   run;
The following line is written to the SAS log:
   ----+----1----+----2----+----3----+----4----+----5----+----6----+----7
   The Olympic%%$%%Arts Festival%%$%%includes works by%%$%%Dale Chihuly.
Example 2: Concatenating Strings That Have Missing Values
The following example shows how the CATX function concatenates strings that contain missing values. The numbered range list x1-x4 is used so that the LENGTH statement can create the variables.
   options nodate nostimer ls=78 ps=60;
   data one;
      length x1-x4 $1;
      input x1-x4;
      datalines;
   A B C D
   E . F G
   H . . J
   ;
   run;

   data two;
      set one;
      SP='^';
      test1=catx(sp, of x1-x4);
      test2=trim(left(x1)) || sp || trim(left(x2)) || sp ||
            trim(left(x3)) || sp || trim(left(x4));
   run;

   proc print data=two;
   run;
SAS creates the following output:
Output 4.36 Using CATX with Missing Values

   The SAS System

   Obs   x1   x2   x3   x4   SP   test1      test2
    1    A    B    C    D    ^    A^B^C^D    A^B^C^D
    2    E         F    G    ^    E^F^G      E^ ^F^G
    3    H              J    ^    H^J        H^ ^ ^J
See Also
Functions and CALL Routines:
   “CALL CATS Routine” on page 427
   “CALL CATT Routine” on page 429
   “CALL CATX Routine” on page 430
   “CAT Function” on page 526
   “CATQ Function” on page 528
   “CATS Function” on page 532
   “CATT Function” on page 534
CDF Function
Returns a value from a cumulative probability distribution.
Category: Probability
Syntax
CDF(distribution, quantile <, parm-1, …, parm-k>)
Arguments
distribution
   is a character constant, variable, or expression that identifies the distribution. Valid distributions are as follows:

   Distribution               Argument
   Bernoulli                  BERNOULLI
   Beta                       BETA
   Binomial                   BINOMIAL
   Cauchy                     CAUCHY
   Chi-Square                 CHISQUARE
   Exponential                EXPONENTIAL
   F                          F
   Gamma                      GAMMA
   Geometric                  GEOMETRIC
   Hypergeometric             HYPERGEOMETRIC
   Laplace                    LAPLACE
   Logistic                   LOGISTIC
   Lognormal                  LOGNORMAL
   Negative binomial          NEGBINOMIAL
   Normal                     NORMAL|GAUSS
   Normal mixture             NORMALMIX
   Pareto                     PARETO
   Poisson                    POISSON
   T                          T
   Uniform                    UNIFORM
   Wald (inverse Gaussian)    WALD|IGAUSS
   Weibull                    WEIBULL

   Note: Except for T, F, and NORMALMIX, you can minimally identify any distribution by its first four characters.
quantile
   is a numeric constant, variable, or expression that specifies the value of the random variable.
parm-1, … ,parm-k
   are optional constants, variables, or expressions that specify shape, location, or scale parameters appropriate for the specific distribution.
   See: “Details” on page 541 for complete information about these parameters
Details
The CDF function computes the left cumulative distribution function from various continuous and discrete probability distributions.
Note: The QUANTILE function returns the quantile from a distribution that you specify. The QUANTILE function is the inverse of the CDF function. For more information, see “QUANTILE Function” on page 1028.
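The inverse relationship between CDF and QUANTILE can be seen in a round trip. The sketch below uses the standard normal distribution (default location 0 and scale 1) and an arbitrary value of 1.96; both choices are illustrative.

```sas
data _null_;
   p = cdf('NORMAL', 1.96);        /* probability that a standard normal value is <= 1.96 */
   x = quantile('NORMAL', p);      /* invert: recover the quantile from the probability */
   put p= x=;
run;
```

P should be approximately 0.975, and X should return to approximately 1.96, up to floating-point rounding.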
Bernoulli Distribution
CDF('BERNOULLI',x,p)
where
x is a numeric random variable.
p is a numeric probability of success. Range: 0 ≤ p ≤ 1
The CDF function for the Bernoulli distribution returns the probability that an observation from a Bernoulli distribution, with probability of success equal to p, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'BERNOULLI'}, x, p) = \begin{cases} 0 & x < 0 \\ 1 - p & 0 \le x < 1 \\ 1 & x \ge 1 \end{cases}$$

Beta Distribution
CDF('BETA',x,a,b<,l,r>)
where
x is a numeric random variable.
a is a numeric shape parameter. Range: a > 0
b is a numeric shape parameter. Range: b > 0
l is the numeric left location parameter. Default: 0
r is the right location parameter. Default: 1 Range: r > l
The CDF function for the beta distribution returns the probability that an observation from a beta distribution, with shape parameters a and b, is less than or equal to x. The following equation describes the CDF function of the beta distribution:

$$\mathrm{CDF}(\text{'BETA'}, x, a, b, l, r) = \begin{cases} 0 & x \le l \\ \dfrac{1}{\beta(a,b)} \displaystyle\int_{l}^{x} \frac{(v-l)^{a-1}(r-v)^{b-1}}{(r-l)^{a+b-1}}\, dv & l < x \le r \\ 1 & x > r \end{cases}$$

where

$$\beta(a,b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)}$$

and

$$\Gamma(a) = \int_{0}^{\infty} x^{a-1} e^{-x}\, dx$$
Binomial Distribution
CDF('BINOMIAL',m,p,n)
where
m is an integer random variable that counts the number of successes. Range: m = 0, 1, ...
p is a numeric probability of success. Range: 0 ≤ p ≤ 1
n is an integer parameter that counts the number of independent Bernoulli trials. Range: n = 0, 1, ...
The CDF function for the binomial distribution returns the probability that an observation from a binomial distribution, with parameters p and n, is less than or equal to m. The equation follows:

$$\mathrm{CDF}(\text{'BINOM'}, m, p, n) = \begin{cases} 0 & m < 0 \\ \displaystyle\sum_{j=0}^{m} \binom{n}{j} p^{j} (1-p)^{n-j} & 0 \le m \le n \\ 1 & m > n \end{cases}$$

Note: There are no location or scale parameters for the binomial distribution.
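Because the binomial CDF is a finite sum of point probabilities, it can be cross-checked against the PDF function. This is a sketch under illustrative parameter values (p = 0.3, n = 10, m = 4); the variable names are not from the original.

```sas
data _null_;
   p = 0.3;  n = 10;  m = 4;
   total = 0;
   do j = 0 to m;
      total + pdf('BINOMIAL', j, p, n);   /* accumulate P(X = j) for j = 0..m */
   end;
   direct = cdf('BINOMIAL', m, p, n);     /* P(X <= m) computed in one call */
   put total= direct=;
run;
```

The two values should agree up to floating-point rounding.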
Cauchy Distribution
CDF('CAUCHY',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0
The CDF function for the Cauchy distribution returns the probability that an observation from a Cauchy distribution, with the location parameter θ and the scale parameter λ, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'CAUCHY'}, x, \theta, \lambda) = \frac{1}{2} + \frac{1}{\pi} \tan^{-1}\!\left(\frac{x-\theta}{\lambda}\right)$$
Chi-Square Distribution
CDF('CHISQUARE',x,df<,nc>)
where
x is a numeric random variable.
df is a numeric degrees of freedom parameter. Range: df > 0
nc is an optional numeric non-centrality parameter. Range: nc ≥ 0
The CDF function for the chi-square distribution returns the probability that an observation from a chi-square distribution, with df degrees of freedom and non-centrality parameter nc, is less than or equal to x. This function accepts non-integer degrees of freedom. If nc is omitted or equal to zero, the value returned is from the central chi-square distribution. In the following equation, let $\nu$ = df and let $\lambda$ = nc. The following equation describes the CDF function of the chi-square distribution:

$$\mathrm{CDF}(\text{'CHISQ'}, x, \nu, \lambda) = \begin{cases} 0 & x < 0 \\ \displaystyle\sum_{j=0}^{\infty} e^{-\lambda/2}\, \frac{(\lambda/2)^{j}}{j!}\, P_c(x, \nu + 2j) & x \ge 0 \end{cases}$$

where $P_c(\cdot,\cdot)$ denotes the probability from the central chi-square distribution.

Exponential Distribution
CDF('EXPONENTIAL',x<,λ>)
where
x is a numeric random variable.
λ is a scale parameter. Default: 1 Range: λ > 0
The CDF function for the exponential distribution returns the probability that an observation from an exponential distribution, with the scale parameter λ, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'EXPO'}, x, \lambda) = \begin{cases} 0 & x < 0 \\ 1 - e^{-x/\lambda} & x \ge 0 \end{cases}$$
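For the exponential distribution with scale parameter λ, the CDF has the closed form 1 − exp(−x/λ) for x ≥ 0, so the function call can be compared with a direct computation. The values x = 2 and λ = 3 below are arbitrary and chosen only for illustration.

```sas
data _null_;
   x = 2;
   lambda = 3;                                /* scale parameter */
   direct = cdf('EXPONENTIAL', x, lambda);    /* library computation */
   closed = 1 - exp(-x / lambda);             /* closed form for x >= 0 */
   put direct= closed=;
run;
```

Both values should agree up to floating-point rounding.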
F Distribution
CDF('F',x,ndf,ddf<,nc>)
where
x is a numeric random variable.
ndf is a numeric numerator degrees of freedom parameter. Range: ndf > 0
ddf is a numeric denominator degrees of freedom parameter. Range: ddf > 0
nc is a numeric non-centrality parameter. Range: nc ≥ 0
The CDF function for the F distribution returns the probability that an observation from an F distribution, with ndf numerator degrees of freedom, ddf denominator degrees of freedom, and non-centrality parameter nc, is less than or equal to x. This function accepts non-integer degrees of freedom for ndf and ddf. If nc is omitted or equal to zero, the value returned is from a central F distribution. In the following equation, let $\nu_1$ = ndf, let $\nu_2$ = ddf, and let $\lambda$ = nc. The following equation describes the CDF function of the F distribution:

$$\mathrm{CDF}(\text{'F'}, x, \nu_1, \nu_2, \lambda) = \begin{cases} 0 & x < 0 \\ \displaystyle\sum_{j=0}^{\infty} e^{-\lambda/2}\, \frac{(\lambda/2)^{j}}{j!}\, P_F(x, \nu_1 + 2j, \nu_2) & x \ge 0 \end{cases}$$

Gamma Distribution
CDF('GAMMA',x,a<,λ>)
where
x is a numeric random variable.
a is a numeric shape parameter. Range: a > 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0
The CDF function for the gamma distribution returns the probability that an observation from a gamma distribution, with shape parameter a and scale parameter λ, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'GAMMA'}, x, a, \lambda) = \begin{cases} 0 & x < 0 \\ \dfrac{1}{\lambda^{a}\, \Gamma(a)} \displaystyle\int_{0}^{x} v^{a-1} e^{-v/\lambda}\, dv & x \ge 0 \end{cases}$$
Hypergeometric Distribution
CDF('HYPER',x,N,R,n<,o>)
The CDF function for the hypergeometric distribution returns the probability that an observation from an extended hypergeometric distribution, with population size N, number of items R, sample size n, and odds ratio o, is less than or equal to x. If o is omitted or equal to 1, the value returned is from the usual hypergeometric distribution. The equation follows:

$$\mathrm{CDF}(\text{'HYPER'}, x, N, R, n, o) = \begin{cases} 0 & x < \max(0, R+n-N) \\ \dfrac{\displaystyle\sum_{i=0}^{x} \binom{R}{i} \binom{N-R}{n-i}\, o^{i}}{\displaystyle\sum_{j=\max(0,R+n-N)}^{\min(R,n)} \binom{R}{j} \binom{N-R}{n-j}\, o^{j}} & \max(0, R+n-N) \le x \le \min(R,n) \\ 1 & x > \min(R,n) \end{cases}$$

Laplace Distribution
CDF('LAPLACE',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0
The CDF function for the Laplace distribution returns the probability that an observation from the Laplace distribution, with the location parameter θ and the scale parameter λ, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'LAPLACE'}, x, \theta, \lambda) = \begin{cases} \dfrac{1}{2}\, e^{(x-\theta)/\lambda} & x < \theta \\ 1 - \dfrac{1}{2}\, e^{-(x-\theta)/\lambda} & x \ge \theta \end{cases}$$
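Logistic Distribution
CDF('LOGISTIC',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0
The CDF function for the logistic distribution returns the probability that an observation from a logistic distribution, with a location parameter θ and a scale parameter λ, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'LOGISTIC'}, x, \theta, \lambda) = \frac{1}{1 + e^{-(x-\theta)/\lambda}}$$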
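The logistic CDF has the closed form 1 / (1 + exp(−(x − θ)/λ)), where θ is the location and λ the scale, so a call can be compared with a direct computation. The values below (x = 1.5, θ = 0.5, λ = 2) are arbitrary illustrations.

```sas
data _null_;
   x = 1.5;
   theta = 0.5;                                     /* location parameter */
   lambda = 2;                                      /* scale parameter */
   direct = cdf('LOGISTIC', x, theta, lambda);      /* library computation */
   closed = 1 / (1 + exp(-(x - theta) / lambda));   /* closed form */
   put direct= closed=;
run;
```

Both values should agree up to floating-point rounding.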
Lognormal Distribution
CDF('LOGNORMAL',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0
The CDF function for the lognormal distribution returns the probability that an observation from a lognormal distribution, with the location parameter θ and the scale parameter λ, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'LOGN'}, x, \theta, \lambda) = \begin{cases} 0 & x \le 0 \\ \dfrac{1}{\lambda \sqrt{2\pi}} \displaystyle\int_{-\infty}^{\log(x)} \exp\!\left(-\frac{(v-\theta)^{2}}{2\lambda^{2}}\right) dv & x > 0 \end{cases}$$
Negative Binomial Distribution
CDF('NEGBINOMIAL',m,p,n)
where
m is a nonnegative integer random variable that counts the number of failures. Range: m = 0, 1, ...
p is a numeric probability of success. Range: 0 ≤ p ≤ 1
n is a numeric value that counts the number of successes. Range: n > 0
The CDF function for the negative binomial distribution returns the probability that an observation from a negative binomial distribution, with probability of success p and number of successes n, is less than or equal to m. The equation follows:

$$\mathrm{CDF}(\text{'NEGB'}, m, p, n) = \begin{cases} 0 & m < 0 \\ p^{n} \displaystyle\sum_{j=0}^{m} \binom{n+j-1}{j} (1-p)^{j} & m \ge 0 \end{cases}$$

Normal Distribution
CDF('NORMAL',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0
The CDF function for the normal distribution returns the probability that an observation from the normal distribution, with the location parameter θ and the scale parameter λ, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'NORMAL'}, x, \theta, \lambda) = \frac{1}{\lambda \sqrt{2\pi}} \int_{-\infty}^{x} \exp\!\left(-\frac{(v-\theta)^{2}}{2\lambda^{2}}\right) dv$$
Normal Mixture Distribution
CDF('NORMALMIX',x,n,p,m,s)
where
x is a numeric random variable.
n is the integer number of mixtures. Range: n = 1, 2, ...
p is the n proportions $p_1, p_2, \ldots, p_n$, where $\sum_{i=1}^{n} p_i = 1$. Range: 0 ≤ $p_i$ ≤ 1
m is the n means $m_1, m_2, \ldots, m_n$.
s is the n standard deviations $s_1, s_2, \ldots, s_n$. Range: s > 0
The CDF function for the normal mixture distribution returns the probability that an observation from a mixture of normal distributions is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'NORMALMIX'}, x, n, p, m, s) = \sum_{i=1}^{n} p_i\, \mathrm{CDF}(\text{'NORMAL'}, x, m_i, s_i)$$

Note: There are no location or scale parameters for the normal mixture distribution.
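The mixture CDF is a weighted sum of normal CDFs, which can be verified numerically. The sketch below assumes a two-component mixture with illustrative proportions (0.4, 0.6), means (0, 3), and standard deviations (1, 2); the arguments are passed as the proportions, then the means, then the standard deviations.

```sas
data _null_;
   x = 2.3;
   /* n = 2, then p1 p2, then m1 m2, then s1 s2 */
   mix = cdf('NORMALMIX', x, 2, 0.4, 0.6, 0, 3, 1, 2);
   /* weighted sum of the component normal CDFs */
   manual = 0.4 * cdf('NORMAL', x, 0, 1) + 0.6 * cdf('NORMAL', x, 3, 2);
   put mix= manual=;
run;
```

The two values should agree up to floating-point rounding.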
Pareto Distribution
CDF('PARETO',x,a<,k>)
where
x is a numeric random variable.
a is a numeric shape parameter. Range: a > 0
k is a numeric scale parameter. Default: 1 Range: k > 0
The CDF function for the Pareto distribution returns the probability that an observation from a Pareto distribution, with the shape parameter a and the scale parameter k, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'PARETO'}, x, a, k) = \begin{cases} 0 & x < k \\ 1 - \left(\dfrac{k}{x}\right)^{a} & x \ge k \end{cases}$$

Poisson Distribution
CDF('POISSON',n,m)
where
n is an integer random variable. Range: n = 0, 1, ...
m is a numeric mean parameter. Range: m > 0
The CDF function for the Poisson distribution returns the probability that an observation from a Poisson distribution, with mean m, is less than or equal to n. The equation follows:

$$\mathrm{CDF}(\text{'POISSON'}, n, m) = \begin{cases} 0 & n < 0 \\ \displaystyle\sum_{i=0}^{n} \exp(-m)\, \frac{m^{i}}{i!} & n \ge 0 \end{cases}$$

Note: There are no location or scale parameters for the Poisson distribution.

T Distribution
CDF('T',t,df<,nc>)
where
t is a numeric random variable.
df is a numeric degrees of freedom parameter. Range: df > 0
nc is an optional numeric non-centrality parameter.
The CDF function for the T distribution returns the probability that an observation from a T distribution, with degrees of freedom df and non-centrality parameter nc, is less than or equal to t. This function accepts non-integer degrees of freedom. If nc is omitted or equal to zero, the value returned is from the central T distribution. In the following equation, let $\nu$ = df and let $\delta$ = nc. The equation follows:

$$\mathrm{CDF}(\text{'T'}, t, \nu, \delta) = \frac{1}{2^{(\nu/2 - 1)}\, \Gamma\!\left(\frac{\nu}{2}\right)} \int_{0}^{\infty} x^{\nu-1} e^{-\frac{1}{2}x^{2}}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{tx/\sqrt{\nu}} e^{-\frac{1}{2}(u-\delta)^{2}}\, du\, dx$$

Note: There are no location or scale parameters for the T distribution.
Uniform Distribution
CDF('UNIFORM',x<,l,r>)
where
x is a numeric random variable.
l is the numeric left location parameter. Default: 0
r is the numeric right location parameter. Default: 1 Range: r > l
The CDF function for the uniform distribution returns the probability that an observation from a uniform distribution, with the left location parameter l and the right location parameter r, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'UNIFORM'}, x, l, r) = \begin{cases} 0 & x < l \\ \dfrac{x-l}{r-l} & l \le x < r \\ 1 & x \ge r \end{cases}$$

Note: The default values for l and r are 0 and 1, respectively.
Wald (Inverse Gaussian) Distribution
where $\Phi(\cdot)$ denotes the probability from the standard normal distribution.
Note: There are no location or scale parameters for the Wald distribution.

Weibull Distribution
CDF('WEIBULL',x,a<,λ>)
where
x is a numeric random variable.
a is a numeric shape parameter. Range: a > 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0
The CDF function for the Weibull distribution returns the probability that an observation from a Weibull distribution, with the shape parameter a and the scale parameter λ, is less than or equal to x. The equation follows:

$$\mathrm{CDF}(\text{'WEIBULL'}, x, a, \lambda) = \begin{cases} 0 & x < 0 \\ 1 - e^{-(x/\lambda)^{a}} & x \ge 0 \end{cases}$$
CHOOSEN Function
Syntax
CHOOSEN(index-expression, selection-1 <, …, selection-n>)
Arguments
index-expression
   specifies a numeric constant, variable, or expression.
selection
   specifies a numeric constant, variable, or expression. The value of this argument is returned by the CHOOSEN function.
Details
The CHOOSEN function uses the value of index-expression to select from the arguments that follow. For example, if index-expression is 3, CHOOSEN returns the value of selection-3. If index-expression is negative, the function counts backward from the end of the list of arguments and returns that value. For example, an index-expression of -2 returns the second-to-last selection.
Comparisons The CHOOSEN function is similar to the CHOOSEC function except that CHOOSEN returns a numeric value while CHOOSEC returns a character value.
Examples
The following example shows how CHOOSEN chooses from a series of values:
   data _null_;
      ItemNumber=choosen(5,100,50,3784,498,679);
      Rank=choosen(-2,1,2,3,4,5);
      Score=choosen(3,193,627,33,290,5);
      Value=choosen(-5,-37,82985,-991,3,1014,-325,3,54,-618);
      put ItemNumber= Rank= Score= Value=;
   run;
SAS writes the following line to the log:
   ItemNumber=679 Rank=4 Score=33 Value=1014
See Also Functions: “CHOOSEC Function” on page 560
CINV Function
Returns a quantile from the chi-square distribution.
Category: Quantile
Syntax
CINV(p,df<,nc>)
Arguments
p
   is a numeric probability. Range: 0 ≤ p < 1
df
   is a numeric degrees of freedom parameter. Range: df > 0
nc
   is a numeric noncentrality parameter. Range: nc ≥ 0
Details
The CINV function returns the pth quantile from the chi-square distribution with degrees of freedom df and a noncentrality parameter nc. The probability that an observation from a chi-square distribution is less than or equal to the returned quantile is p. This function accepts a noninteger degrees of freedom parameter df.
If the optional parameter nc is not specified or has the value 0, the quantile from the central chi-square distribution is returned. The noncentrality parameter nc is defined such that if X is a normal random variable with mean $\mu$ and variance 1, $X^2$ has a noncentral chi-square distribution with df=1 and nc = $\mu^2$.
CAUTION: For large values of nc, the algorithm could fail; in that case, a missing value is returned.
Note: CINV is the inverse of the PROBCHI function.
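The inverse relationship between CINV and PROBCHI can be checked with a round trip. The sketch below uses the central chi-square case with 3 degrees of freedom; the variable names are illustrative.

```sas
data _null_;
   q = cinv(0.95, 3);      /* 95th percentile of the central chi-square, df = 3 */
   p = probchi(q, 3);      /* probability at or below that quantile */
   put q= p=;
run;
```

P should return to approximately 0.95, up to floating-point rounding.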
Examples
The first statement following shows how to find the 95th percentile from a central chi-square distribution with 3 degrees of freedom. The second statement shows how to find the 95th percentile from a noncentral chi-square distribution with 3.5 degrees of freedom and a noncentrality parameter equal to 4.5.

SAS Statements             Results
q1=cinv(.95,3);            7.8147279033
q2=cinv(.95,3.5,4.5);      17.504582117
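
Because CINV is the inverse of PROBCHI, a quantile returned by CINV can be checked by passing it back to PROBCHI. The following is a minimal sketch of that round trip; the variable names q and p are chosen here for illustration only.

```sas
data _null_;
   /* 95th percentile of a central chi-square distribution with 3 df */
   q = cinv(.95, 3);
   /* PROBCHI maps the quantile back to the cumulative probability */
   p = probchi(q, 3);
   put q= p=;
run;
```

The value of p should come back as 0.95 to within floating-point rounding.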
See Also Functions: “QUANTILE Function” on page 1028
CLOSE Function
Closes a SAS data set.
Category: SAS File I/O

Syntax
CLOSE(data-set-id)

Arguments
data-set-id
   is a numeric variable that specifies the data set identifier that the OPEN function returns.
Details
CLOSE returns zero if the operation was successful, or returns a non-zero value if it was not successful. Close all SAS data sets as soon as they are no longer needed by the application.
Note: All data sets opened within a DATA step are closed automatically at the end of the DATA step.

Examples
This example uses OPEN to open the SAS data set PAYROLL. If the data set opens successfully, indicated by a positive value for the variable PAYID, the example uses CLOSE to close the data set.

%let payid=%sysfunc(open(payroll,is));
   macro statements
%if &payid > 0 %then
   %let rc=%sysfunc(close(&payid));
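
The same open, check, and close pattern can be written in a DATA step. The sketch below is illustrative only: it assumes that the sample data set SASHELP.CLASS is available at your site, and the variable names dsid, nobs, rc, and msg are hypothetical.

```sas
data _null_;
   length msg $200;
   /* Open the data set in input mode; OPEN returns 0 on failure */
   dsid = open('sashelp.class', 'i');
   if dsid > 0 then do;
      nobs = attrn(dsid, 'nobs');     /* use the identifier while it is open */
      put nobs=;
      rc = close(dsid);               /* close as soon as it is no longer needed */
      if rc ne 0 then put 'CLOSE was not successful: ' rc=;
   end;
   else do;
      msg = sysmsg();                 /* text of the message from the failed OPEN */
      put msg=;
   end;
run;
```

Closing the data set explicitly is optional here, because data sets opened within a DATA step are closed at the end of the step, but it releases the data set as early as possible.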
See Also Function: “OPEN Function” on page 948
CMISS Function
Counts the number of missing arguments.
Category: Descriptive Statistics

Syntax
CMISS(argument-1 <, argument-2,…>)

Arguments
argument
   specifies a constant, variable, or expression. Argument can be either a character value or a numeric value.

Details
A character expression is counted as missing if it evaluates to a string that contains all blanks or has a length of zero. A numeric expression is counted as missing if it evaluates to a numeric missing value: ., ._, .A, … , .Z.
Comparisons The CMISS function does not convert any argument. The NMISS function converts all arguments to numeric values.
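
As a brief illustration of the counting rules, the following sketch mixes character and numeric arguments in one call. The variable names are made up for the example.

```sas
data _null_;
   name  = ' ';     /* blank character value: counted as missing */
   score = .;       /* numeric missing value: counted as missing */
   grade = .A;      /* special numeric missing value: counted as missing */
   /* 'abc' and 5 are not missing, so three of the five arguments are missing */
   n = cmiss(name, score, grade, 'abc', 5);
   put n=;
run;
```

The log should show n=3. Because CMISS does not convert its arguments, the character constant 'abc' is examined as a character value.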
See Also Functions: “NMISS Function” on page 917 “MISSING Function” on page 899
CNONCT Function
Returns the noncentrality parameter from a chi-square distribution.
Category: Mathematical

Syntax
CNONCT(x,df,prob)

Arguments
x
   is a numeric random variable.
   Range: x ≥ 0
df
   is a numeric degrees of freedom parameter.
   Range: df > 0
prob
   is a probability.
   Range: 0 < prob < 1
Details The CNONCT function returns the nonnegative noncentrality parameter from a noncentral chi-square distribution whose parameters are x, df, and nc. If prob is greater than the probability from the central chi-square distribution with the parameters x and df, a root to this problem does not exist. In this case a missing value is returned. A Newton-type algorithm is used to find a nonnegative root nc of the equation

$$P_c(x \mid df, nc) - prob = 0$$

where

$$P_c(x \mid df, nc) = e^{-nc/2} \sum_{j=0}^{\infty} \frac{(nc/2)^j}{j!} \, P_g\!\left(\frac{x}{2} \,\Big|\, \frac{df}{2} + j\right)$$

where $P_g(x \mid a)$ is the probability from the gamma distribution given by

$$P_g(x \mid a) = \frac{1}{\Gamma(a)} \int_0^x t^{a-1} e^{-t} \, dt$$

If the algorithm fails to converge to a fixed point, a missing value is returned.
Examples

data work;
   x=2;
   df=4;
   do nc=1 to 3 by .5;
      prob=probchi(x,df,nc);
      ncc=cnonct(x,df,prob);
      output;
   end;
run;

proc print;
run;

Output 4.38 Computations of the Noncentrality Parameters from the Chi-squared Distribution

OBS    x    df     nc      prob      ncc
  1    2     4    1.0    0.18611     1.0
  2    2     4    1.5    0.15592     1.5
  3    2     4    2.0    0.13048     2.0
  4    2     4    2.5    0.10907     2.5
  5    2     4    3.0    0.09109     3.0

COALESCE Function
Returns the first non-missing value from a list of numeric arguments.
Category: Mathematical

Syntax
COALESCE(argument-1<..., argument-n>)

Arguments
argument
   specifies a numeric constant, variable, or expression.
Details The Basics COALESCE accepts one or more numeric arguments. The COALESCE function checks the value of each argument in the order in which they are listed and returns the first non-missing value. If only one value is listed, then the COALESCE function returns the value of that argument. If all the values of all arguments are missing, then the COALESCE function returns a missing value.
Comparisons The COALESCE function searches numeric arguments, whereas the COALESCEC function searches character arguments.
Examples

SAS Statements                    Results
x = COALESCE(42, .);              42
y = COALESCE(.A, .B, .C);         .
z = COALESCE(., 7, ., ., 42);     7
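
A common use of COALESCE is to fall back through several variables in priority order. The following is a small sketch of that idea; the variable names and values are hypothetical.

```sas
data _null_;
   home_phone = .;
   work_phone = .;
   cell_phone = 5550100;
   /* Arguments are checked left to right; the first non-missing value is returned */
   contact = coalesce(home_phone, work_phone, cell_phone);
   put contact=;
run;
```

Here contact receives the value of cell_phone, because the first two arguments are missing.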
See Also Function: “COALESCEC Function” on page 567
COALESCEC Function
Returns the first non-missing value from a list of character arguments.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
COALESCEC(argument-1<..., argument-n>)

Arguments
argument
   specifies a character constant, variable, or expression.
Details Length of Returned Variable In a DATA step, if the COALESCEC function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. The Basics COALESCEC accepts one or more character arguments. The COALESCEC function checks the value of each argument in the order in which they are listed and returns the first non-missing value. If only one value is listed, then the COALESCEC function returns the value of that argument. A character value is considered missing if it has a length of zero or if all the characters are blank. If all the values of all arguments are missing, then the COALESCEC function returns a string with a length of zero.
Comparisons The COALESCEC function searches character arguments, whereas the COALESCE function searches numeric arguments.
Examples

SAS Statements                           Results
COALESCEC('', 'Hello')                   Hello
COALESCEC('', 'Goodbye', 'Hello')        Goodbye
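
The character version is often used to supply a default when a value is blank. The sketch below assumes two illustrative variables, nickname and preferred; declaring the length of the result avoids the 200-byte default described above.

```sas
data _null_;
   length nickname preferred $20;
   nickname = ' ';
   /* A blank value counts as missing, so the constant is returned instead */
   preferred = coalescec(nickname, 'Guest');
   put preferred=;
run;
```

The log should show preferred=Guest.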
See Also Function: “COALESCE Function” on page 566
COLLATE Function
Returns a character string in ASCII or EBCDIC collating sequence.
Category: Character
Restriction: “I18N Level 0” on page 305
See: COLLATE Function in the documentation for your operating environment.

Syntax
COLLATE(start-position<,end-position>) | (start-position<,,length>)

Arguments
start-position
   specifies the numeric position in the collating sequence of the first character to be returned.
   Interaction: If you specify only start-position, COLLATE returns consecutive characters from that position to the end of the collating sequence or up to 255 characters, whichever comes first.
end-position
   specifies the numeric position in the collating sequence of the last character to be returned. The maximum end-position for the EBCDIC collating sequence is 255. For ASCII collating sequences, the characters that correspond to end-position values between 0 and 127 represent the standard character set. Other ASCII characters that correspond to end-position values between 128 and 255 are available on certain ASCII operating environments, but the information that those characters represent varies with the operating environment.
   Tip: end-position must be larger than start-position.
   Tip: If you specify end-position, COLLATE returns all character values in the collating sequence between start-position and end-position, inclusive.
   Tip: If you omit end-position and use length, mark the end-position place with a comma.
length
   specifies the number of characters in the collating sequence.
   Default: 200
   Tip: If you omit end-position, use length to specify the length of the result explicitly.
Details Length of Returned Variable In a DATA step, if the COLLATE function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. The Basics If you specify both end-position and length, COLLATE ignores length. If you request a string longer than the remainder of the sequence, COLLATE returns a string through the end of the sequence.
Examples
The following SAS statements produce these results.

SAS Statements             Results
                           ----+----1----+----2--
ASCII
x=collate(48,,10);
y=collate(48,57);
put @1 x @14 y;            0123456789   0123456789

EBCDIC
x=collate(240,,10);
y=collate(240,249);
put @1 x @14 y;            0123456789   0123456789
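
The positions passed to COLLATE do not have to be hard-coded; the RANK function listed under See Also returns the position of a character in the collating sequence. The following sketch assumes an ASCII operating environment, where the uppercase letters occupy consecutive positions. On EBCDIC systems the letters are not contiguous, so the same call would return additional characters. The variable names lo, hi, and upper are illustrative.

```sas
data _null_;
   length upper $26;
   lo = rank('A');              /* position of A in the collating sequence */
   hi = rank('Z');              /* position of Z in the collating sequence */
   upper = collate(lo, hi);     /* all characters from lo through hi, inclusive */
   put lo= hi= upper=;
run;
```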
See Also Functions: “BYTE Function” on page 417 “RANK Function” on page 1048
COMB Function
Computes the number of combinations of n elements taken r at a time.
Category: Combinatorial

Syntax
COMB(n, r)

Arguments
n
   is a nonnegative integer that represents the total number of elements from which the sample is chosen.
r
   is a nonnegative integer that represents the number of chosen elements.
   Restriction: r ≤ n
Details The mathematical representation of the COMB function is given by the following equation:

$$\mathrm{COMB}(n, r) = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$

with n ≥ 0, r ≥ 0, and n ≥ r.
If the expression cannot be computed, a missing value is returned. For moderately large values, it is sometimes not possible to compute the COMB function.
Examples

SAS Statements       Results
x=comb(5,1);         5
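
To connect the function with the equation in the Details section, the following sketch computes the same quantity twice, once with COMB and once from factorials by using the FACT function. The variable names c1 and c2 are illustrative.

```sas
data _null_;
   n = 5;
   r = 2;
   c1 = comb(n, r);                             /* built-in computation */
   c2 = fact(n) / (fact(r) * fact(n - r));      /* n! / (r! (n-r)!) */
   put c1= c2=;
run;
```

Both values should be 10. For larger arguments the factorials overflow long before the quotient does, which is one reason to call COMB (or LCOMB for the logarithm) rather than expanding the formula.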
See Also Functions: “FACT Function” on page 654 “PERM Function” on page 974 “LCOMB Function” on page 853
COMPARE Function
Returns the position of the leftmost character by which two strings differ, or returns 0 if there is no difference.
Category: Character
Restriction: “I18N Level 0” on page 305
Tip: DBCS equivalent function is KCOMPARE in SAS National Language Support (NLS): Reference Guide. See also “DBCS Compatibility” on page 572.

Syntax
COMPARE(string-1, string-2<,modifiers>)

Arguments
string-1
   specifies a character constant, variable, or expression.
string-2
   specifies a character constant, variable, or expression.
modifier
   specifies a character string that can modify the action of the COMPARE function. You can use one or more of the following characters as a valid modifier:
   i or I
      ignores the case in string-1 and string-2.
   l or L
      removes leading blanks in string-1 and string-2 before comparing the values.
   n or N
      removes quotation marks from any argument that is a name literal and ignores the case of string-1 and string-2.
      Tip: A name literal is a name token that is expressed as a string within quotation marks, followed by the uppercase or lowercase letter n. Name literals enable you to use special characters (including blanks) that are not otherwise allowed in SAS data set or variable names. For COMPARE to recognize a string as a name literal, the first character must be a quotation mark.
   : (colon)
      truncates the longer of string-1 or string-2 to the length of the shorter string, or to one, whichever is greater. If you do not specify this modifier, the shorter string is padded with blanks to the same length as the longer string.
   Tip: COMPARE ignores blanks that are used as modifiers.
Details
The Basics
The order in which the modifiers appear in the COMPARE function is relevant.
- “LN” first removes leading blanks from each string, and then removes quotation marks from name literals.
- “NL” first removes quotation marks from name literals, and then removes leading blanks from each string.
In the COMPARE function, if string-1 and string-2 do not differ, COMPARE returns a value of zero. If the arguments differ, then the following apply:
- The sign of the result is negative if string-1 precedes string-2 in a sort sequence, and positive if string-1 follows string-2 in a sort sequence.
- The magnitude of the result is equal to the position of the leftmost character at which the strings differ.
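
The sign and magnitude rules can be separated in code, as in the following sketch. The variable names a, b, rc, and position are illustrative, and the message text is made up for the example.

```sas
data _null_;
   length a b $8;
   a = 'abc';
   b = 'abx';
   rc = compare(a, b);
   if rc = 0 then put 'The strings do not differ.';
   else do;
      /* The magnitude is the leftmost position at which the strings differ */
      position = abs(rc);
      /* The sign gives the order of the strings in a sort sequence */
      if rc < 0 then put a= 'precedes ' b= 'and first differs at ' position=;
      else put a= 'follows ' b= 'and first differs at ' position=;
   end;
run;
```

For these values COMPARE returns -3: the strings first differ at position 3, and 'abc' sorts before 'abx'.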
DBCS Compatibility
The DBCS equivalent function is KCOMPARE, which is documented in SAS National Language Support (NLS): Reference Guide. There are minor differences between the COMPARE and KCOMPARE functions. While both functions accept varying numbers of arguments, usage of the third argument is not compatible. The following example shows the differences in the syntax:

COMPARE(string-1, string-2<,modifiers>)
KCOMPARE(string-1 , string-2)
Examples
Example 1: Understanding the Order of Comparisons When Comparing Two Strings
The following example compares two strings by using the COMPARE function.

options pageno=1 nodate ls=80 ps=60;

data test;
   infile datalines missover;
   input string1 $char8. string2 $char8. modifiers $char8.;
   result=compare(string1, string2, modifiers);
   datalines;
1234567812345678
123     abc
abc     abx
xyz     abcdef
aBc     abc
aBc     AbC     i
   abc  abc
   abc  abc     l
 abc       abx
 abc       abx  l
ABC     'abc'n
ABC     'abc'n  n
 '$12'n $12     n
 '$12'n $12     nl
 '$12'n $12     ln
;

proc print data=test;
run;

The following output shows the results.

Output 4.39 Results of Comparing Two Strings by Using the COMPARE Function

                    The SAS System                    1

Obs    string1     string2     modifiers    result
  1    12345678    12345678                      0
  2    123         abc                          -1
  3    abc         abx                          -3
  4    xyz         abcdef                        1
  5    aBc         abc                          -2
  6    aBc         AbC         i                 0
  7       abc      abc                          -1
  8       abc      abc         l                 0
  9     abc           abx                        2
 10     abc           abx      l                -3
 11    ABC         'abc'n                        1
 12    ABC         'abc'n      n                 0
 13     '$12'n     $12         n                -1
 14     '$12'n     $12         nl                1
 15     '$12'n     $12         ln                0

Example 2: Truncating Strings Using the COMPARE Function
The following example uses the : (colon) modifier to truncate strings.

options pageno=1 nodate ls=80 pagesize=60;

data test2;
   pad1=compare('abc','abc ');
   pad2=compare('abc','abcdef ');
   truncate1=compare('abc','abcdef',':');
   truncate2=compare('abcdef','abc',':');
   blank=compare('','abc', ':');
run;

proc print data=test2 noobs;
run;

The following output shows the results.

Output 4.40 Results of Using the Truncation Modifier

                    The SAS System                    1

pad1    pad2    truncate1    truncate2    blank
   0      -4            0            0       -1

See Also Functions and CALL Routines: “COMPGED Function” on page 575 “COMPLEV Function” on page 580 “CALL COMPCOST Routine” on page 432
COMPBL Function
Removes multiple blanks from a character string.
Category: Character
Restriction: “I18N Level 0” on page 305

Syntax
COMPBL(source)

Arguments
source
   specifies a character constant, variable, or expression to compress.
Details Length of Returned Variable In a DATA step, if the COMPBL function returns a value to a variable that has not previously been assigned a length, then the length of that variable defaults to the length of the first argument. The Basics The COMPBL function removes multiple blanks in a character string by translating each occurrence of two or more consecutive blanks into a single blank.
Comparisons The COMPRESS function removes every occurrence of the specific character from a string. If you specify a blank as the character to remove from the source string, the
COMPRESS function removes all blanks from the source string, while the COMPBL function compresses multiple blanks to a single blank and has no effect on a single blank.
Examples
The following SAS statements produce these results.

SAS Statements                       Results
                                     ----+----1----+----2--
string='Hey   Diddle  Diddle';
string=compbl(string);
put string;                          Hey Diddle Diddle

string='125    E Main St';
length address $10;
address=compbl(string);
put address;                         125 E Main
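
The difference from COMPRESS described in the Comparisons section can be seen by applying both functions to the same value. This is a minimal sketch; the variable names raw, one_blank, and no_blank are illustrative.

```sas
data _null_;
   raw = '125    E   Main   St';
   one_blank = compbl(raw);      /* each run of blanks becomes a single blank */
   no_blank  = compress(raw);    /* every blank is removed */
   put one_blank= / no_blank=;
run;
```

COMPBL should yield 125 E Main St, while COMPRESS should yield 125EMainSt.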
See Also Function: “COMPRESS Function” on page 584
COMPGED Function
Returns the generalized edit distance between two strings.
Category: Character
Restriction: “I18N Level 0” on page 305

Syntax
COMPGED(string-1, string-2 <,cutoff> <,modifiers>)

Arguments
string-1
   specifies a character constant, variable, or expression.
string-2
   specifies a character constant, variable, or expression.
cutoff
   is a numeric constant, variable, or expression. If the actual generalized edit distance is greater than the value of cutoff, the value that is returned is equal to the value of cutoff.
   Tip: Using a small value of cutoff improves the efficiency of COMPGED if the values of string-1 and string-2 are long.
modifiers
   specifies a character string that can modify the action of the COMPGED function. You can use one or more of the following characters as a valid modifier:
   i or I
      ignores the case in string-1 and string-2.
   l or L
      removes leading blanks in string-1 and string-2 before comparing the values.
   n or N
      removes quotation marks from any argument that is an n-literal and ignores the case of string-1 and string-2.
   : (colon)
      truncates the longer of string-1 or string-2 to the length of the shorter string, or to one, whichever is greater.
   Tip: COMPGED ignores blanks that are used as modifiers.
Details
The Order in Which Modifiers Appear
The order in which the modifiers appear in the COMPGED function is relevant.
- “LN” first removes leading blanks from each string and then removes quotation marks from n-literals.
- “NL” first removes quotation marks from n-literals and then removes leading blanks from each string.

Definition of Generalized Edit Distance
Generalized edit distance is a generalization of Levenshtein edit distance, which is a measure of dissimilarity between two strings. The Levenshtein edit distance is the number of deletions, insertions, or replacements of single characters that are required to transform string-1 into string-2.

Computing the Generalized Edit Distance
The COMPGED function returns the generalized edit distance between string-1 and string-2. The generalized edit distance is the minimum-cost sequence of operations for constructing string-1 from string-2.
The algorithm for computing the sum of the costs involves a pointer that points to a character in string-2 (the input string). An output string is constructed by a sequence of operations that might advance the pointer, add one or more characters to the output string, or both. Initially, the pointer points to the first character in the input string, and the output string is empty.
The operations and their costs are described in the following table.

Operation      Default Cost   Description of Operation
               in Units

APPEND         50             When the output string is longer than the input
                              string, add any one character to the end of the
                              output string without moving the pointer.

BLANK          10             Do any of the following:
                              - Add one space character to the end of the
                                output string without moving the pointer.
                              - When the character at the pointer is a space
                                character, advance the pointer by one position
                                without changing the output string.
                              - When the character at the pointer is a space
                                character, add one space character to the end
                                of the output string, and advance the pointer
                                by one position.
                              If the cost for BLANK is set to zero by the
                              COMPCOST function, the COMPGED function removes
                              all space characters from both strings before
                              doing the comparison.

DELETE         100            Advance the pointer by one position without
                              changing the output string.

DOUBLE         20             Add the character at the pointer to the end of
                              the output string without moving the pointer.

FDELETE        200            When the output string is empty, advance the
                              pointer by one position without changing the
                              output string.

FINSERT        200            When the pointer is in position one, add any one
                              character to the end of the output string
                              without moving the pointer.

FREPLACE       200            When the pointer is in position one and the
                              output string is empty, add any one character to
                              the end of the output string, and advance the
                              pointer by one position.

INSERT         100            Add any one character to the end of the output
                              string without moving the pointer.

MATCH          0              Copy the character at the pointer from the input
                              string to the end of the output string, and
                              advance the pointer by one position.

PUNCTUATION    30             Do any of the following:
                              - Add one punctuation character to the end of
                                the output string without moving the pointer.
                              - When the character at the pointer is a
                                punctuation character, advance the pointer by
                                one position without changing the output
                                string.
                              - When the character at the pointer is a
                                punctuation character, add one punctuation
                                character to the end of the output string, and
                                advance the pointer by one position.
                              If the cost for PUNCTUATION is set to zero by
                              the COMPCOST function, the COMPGED function
                              removes all punctuation characters from both
                              strings before doing the comparison.

REPLACE        100            Add any one character to the end of the output
                              string, and advance the pointer by one position.

SINGLE         20             When the character at the pointer is the same as
                              the character that follows in the input string,
                              advance the pointer by one position without
                              changing the output string.

SWAP           20             Copy the character that follows the pointer from
                              the input string to the output string. Then copy
                              the character at the pointer from the input
                              string to the output string. Advance the pointer
                              two positions.

TRUNCATE       10             When the output string is shorter than the input
                              string, advance the pointer by one position
                              without changing the output string.

To set the cost of the string operations, you can use the CALL COMPCOST routine or use default costs. If you use the default costs, the values that are returned by COMPGED are approximately 100 times greater than the values that are returned by COMPLEV.
Examples of Errors
The rationale for determining the generalized edit distance is based on the number and kinds of typographical errors that can occur. COMPGED assigns a cost to each error and determines the minimum sum of these costs that could be incurred. Some kinds of errors can be more serious than others. For example, inserting an extra letter at the beginning of a string might be more serious than omitting a letter from the end of a string. For another example, if you type a word or
phrase that exists in string-2 and introduce a typographical error, you might produce string-1 instead of string-2.
Making the Generalized Edit Distance Symmetric
Generalized edit distance is not necessarily symmetric. That is, the value that is returned by COMPGED(string1, string2) is not always equal to the value that is returned by COMPGED(string2, string1). To make the generalized edit distance symmetric, use the CALL COMPCOST routine to assign equal costs to the operations within each of the following pairs:
- INSERT, DELETE
- FINSERT, FDELETE
- APPEND, TRUNCATE
- DOUBLE, SINGLE
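
With the default costs in the table above, APPEND (50) and TRUNCATE (10) are the only listed pair whose costs differ. The following sketch equalizes that pair and compares the two directions. It assumes that CALL COMPCOST accepts the operation names from the table as character keywords (written here with a trailing equal sign) followed by a numeric cost; check the CALL COMPCOST entry for the exact keyword forms. The variable names d12 and d21 are illustrative.

```sas
data _null_;
   /* Assumption: keywords follow the operation names in the cost table */
   call compcost('append=', 10, 'truncate=', 10);
   d12 = compged('baboonX', 'baboon');    /* one direction */
   d21 = compged('baboon', 'baboonX');    /* the reverse direction */
   put d12= d21=;
run;
```

If the costs are applied as intended, d12 and d21 should be equal; with the default costs they would differ.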
Comparisons You can compute the Levenshtein edit distance by using the COMPLEV function. You can compute the generalized edit distance by using the CALL COMPCOST routine and the COMPGED function. Computing generalized edit distance requires considerably more computer time than does computing Levenshtein edit distance. But generalized edit distance usually provides a more useful measure than Levenshtein edit distance for applications such as fuzzy file merging and text mining.
Examples
The following example uses the default costs to calculate the generalized edit distance.

options nodate pageno=1 linesize=70 pagesize=60;

data test;
   infile datalines missover;
   input String1 $char8. +1 String2 $char8. +1 Operation $40.;
   GED=compged(string1, string2);
   datalines;
baboon   baboon   match
baXboon  baboon   insert
baoon    baboon   delete
baXoon   baboon   replace
baboonX  baboon   append
baboo    baboon   truncate
babboon  baboon   double
babon    baboon   single
baobon   baboon   swap
bab oon  baboon   blank
bab,oon  baboon   punctuation
bXaoon   baboon   insert+delete
bXaYoon  baboon   insert+replace
bXoon    baboon   delete+replace
Xbaboon  baboon   finsert
aboon    baboon   trick question: swap+delete
Xaboon   baboon   freplace
axoon    baboon   fdelete+replace
axoo     baboon   fdelete+replace+truncate
axon     baboon   fdelete+replace+single
baby     baboon   replace+truncate*2
balloon  baboon   replace+insert
;

proc print data=test label;
   label GED='Generalized Edit Distance';
   var String1 String2 GED Operation;
run;

The following output shows the results.

Output 4.41 Generalized Edit Distance Based on Operation

                          The SAS System                          1

                               Generalized
                                  Edit
Obs    String1    String2       Distance    Operation
  1    baboon     baboon             0      match
  2    baXboon    baboon           100      insert
  3    baoon      baboon           100      delete
  4    baXoon     baboon           100      replace
  5    baboonX    baboon            50      append
  6    baboo      baboon            10      truncate
  7    babboon    baboon            20      double
  8    babon      baboon            20      single
  9    baobon     baboon            20      swap
 10    bab oon    baboon            10      blank
 11    bab,oon    baboon            30      punctuation
 12    bXaoon     baboon           200      insert+delete
 13    bXaYoon    baboon           200      insert+replace
 14    bXoon      baboon           200      delete+replace
 15    Xbaboon    baboon           200      finsert
 16    aboon      baboon           200      trick question: swap+delete
 17    Xaboon     baboon           200      freplace
 18    axoon      baboon           300      fdelete+replace
 19    axoo       baboon           310      fdelete+replace+truncate
 20    axon       baboon           320      fdelete+replace+single
 21    baby       baboon           120      replace+truncate*2
 22    balloon    baboon           200      replace+insert

See Also Functions: “COMPARE Function” on page 571 “CALL COMPCOST Routine” on page 432 “COMPLEV Function” on page 580
COMPLEV Function
Returns the Levenshtein edit distance between two strings.
Category: Character
Restriction: “I18N Level 0” on page 305

Syntax
COMPLEV(string-1, string-2 <,cutoff> <,modifiers>)

Arguments
string-1
   specifies a character constant, variable, or expression.
string-2
   specifies a character constant, variable, or expression.
cutoff
   specifies a numeric constant, variable, or expression. If the actual Levenshtein edit distance is greater than the value of cutoff, the value that is returned is equal to the value of cutoff.
   Tip: Using a small value of cutoff improves the efficiency of COMPLEV if the values of string-1 and string-2 are long.
modifiers
   specifies a character string that can modify the action of the COMPLEV function. You can use one or more of the following characters as a valid modifier:
   i or I
      ignores the case in string-1 and string-2.
   l or L
      removes leading blanks in string-1 and string-2 before comparing the values.
   n or N
      removes quotation marks from any argument that is an n-literal and ignores the case of string-1 and string-2.
   : (colon)
      truncates the longer of string-1 or string-2 to the length of the shorter string, or to one, whichever is greater.
   Tip: COMPLEV ignores blanks that are used as modifiers.
Details
The order in which the modifiers appear in the COMPLEV function is relevant.
- “LN” first removes leading blanks from each string and then removes quotation marks from n-literals.
- “NL” first removes quotation marks from n-literals and then removes leading blanks from each string.
The COMPLEV function ignores trailing blanks.
COMPLEV returns the Levenshtein edit distance between string-1 and string-2. Levenshtein edit distance is the number of insertions, deletions, or replacements of single characters that are required to convert one string to the other. Levenshtein edit distance is symmetric. That is, COMPLEV(string-1,string-2) is the same as COMPLEV(string-2,string-1).
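
The symmetry property and the effect of the cutoff argument can be checked with a short sketch. The word pair is a standard textbook example rather than one taken from this manual, and the variable names are illustrative.

```sas
data _null_;
   d1 = complev('kitten', 'sitting');         /* two replacements and one insertion */
   d2 = complev('sitting', 'kitten');         /* arguments reversed */
   capped = complev('kitten', 'sitting', 2);  /* cutoff smaller than the distance */
   put d1= d2= capped=;
run;
```

Both d1 and d2 should be 3. Because the actual distance exceeds the cutoff of 2, capped should be 2.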
Comparisons The Levenshtein edit distance that is computed by COMPLEV is a special case of the generalized edit distance that is computed by COMPGED. COMPLEV executes much more quickly than COMPGED.
Examples
The following example compares two strings by computing the Levenshtein edit distance.

options pageno=1 nodate ls=80 ps=60;

data test;
   infile datalines missover;
   input string1 $char8. string2 $char8. modifiers $char8.;
   result=complev(string1, string2, modifiers);
   datalines;
1234567812345678
abc     abxc
ac      abc
aXc     abc
aXbZc   abc
aXYZc   abc
WaXbYcZ abc
XYZ     abcdef
aBc     abc
aBc     AbC     i
  abc   abc
  abc   abc     l
AxC     'abc'n
AxC     'abc'n  n
;

proc print data=test;
run;

The following output shows the results.

Output 4.42 Results of Comparing Two Strings by Computing the Levenshtein Edit Distance

                    The SAS System                    1

Obs    string1     string2     modifiers    result
  1    12345678    12345678                      0
  2    abc         abxc                          1
  3    ac          abc                           1
  4    aXc         abc                           1
  5    aXbZc       abc                           2
  6    aXYZc       abc                           3
  7    WaXbYcZ     abc                           4
  8    XYZ         abcdef                        6
  9    aBc         abc                           1
 10    aBc         AbC         i                 0
 11      abc       abc                           2
 12      abc       abc         l                 0
 13    AxC         'abc'n                        6
 14    AxC         'abc'n      n                 1

See Also Functions and CALL Routines: “COMPARE Function” on page 571 “COMPGED Function” on page 575 “CALL COMPCOST Routine” on page 432
COMPOUND Function
Returns compound interest parameters.
Category: Financial

Syntax
COMPOUND(a,f,r,n)

Arguments
a
   is numeric, and specifies the initial amount.
   Range: a ≥ 0
f
   is numeric, and specifies the future amount (at the end of n periods).
   Range: f ≥ 0
r
   is numeric, and specifies the periodic interest rate expressed as a fraction.
   Range: r ≥ 0
n
   is an integer, and specifies the number of compounding periods.
   Range: n ≥ 0
Details The COMPOUND function returns the missing argument in the list of four arguments from a compound interest calculation. The arguments are related by the following equation:

$$f = a(1 + r)^n$$

One missing argument must be provided. A compound interest parameter is then calculated from the remaining three values. No adjustment is made to convert the results to round numbers.
If n=0, then $(1 + r)^n$ is equal to 1 and f = a.
Note: If you choose r as your missing value, then COMPOUND returns an error.
Examples
The accumulated value of an investment of $2000 at a nominal annual interest rate of 9 percent, compounded monthly, after 30 months can be expressed as

future=compound(2000,.,0.09/12,30);

The value returned is 2502.54. The second argument has been set to missing, indicating that the future amount is to be calculated. The 9 percent nominal annual rate has been converted to a monthly rate of 0.09/12. The rate argument is the fractional (not the percentage) interest rate per compounding period.
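
The same relationship can be solved for a different parameter by moving the missing value to another argument position. The following sketch reuses the figures from the example above; the variable names initial and periods are illustrative.

```sas
data _null_;
   /* Future amount, as in the example above */
   future = compound(2000, ., 0.09/12, 30);
   /* Solve for the initial amount by leaving the first argument missing */
   initial = compound(., future, 0.09/12, 30);
   /* Solve for the number of periods by leaving the fourth argument missing */
   periods = compound(2000, future, 0.09/12, .);
   put future= initial= periods=;
run;
```

The recovered values should be close to 2000 and 30, subject to floating-point rounding, because no adjustment is made to convert the results to round numbers.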
COMPRESS Function
Returns a character string with specified characters removed from the original string.
Category: Character
Restriction: “I18N Level 0” on page 305
Tip: DBCS equivalent function is KCOMPRESS in SAS National Language Support (NLS): Reference Guide.

Syntax
COMPRESS(<source><, chars><, modifiers>)

Arguments
source
   specifies a character constant, variable, or expression from which specified characters will be removed.
chars
   specifies a character constant, variable, or expression that initializes a list of characters. By default, the characters in this list are removed from the source argument. If you specify the K modifier in the third argument, then only the characters in this list are kept in the result.
   Tip: You can add more characters to this list by using other modifiers in the third argument.
   Tip: Enclose a literal string of characters in quotation marks.
modifier
   specifies a character constant, variable, or expression in which each non-blank character modifies the action of the COMPRESS function. Blanks are ignored. The following characters can be used as modifiers:
   a or A
      adds alphabetic characters to the list of characters.
   c or C
      adds control characters to the list of characters.
   d or D
      adds digits to the list of characters.
   f or F
      adds the underscore character and English letters to the list of characters.
   g or G
      adds graphic characters to the list of characters.
   h or H
      adds a horizontal tab to the list of characters.
   i or I
      ignores the case of the characters to be kept or removed.
   k or K
      keeps the characters in the list instead of removing them.
   l or L
      adds lowercase letters to the list of characters.
   n or N
      adds digits, the underscore character, and English letters to the list of characters.
   o or O
      processes the second and third arguments once rather than every time the COMPRESS function is called. Using the O modifier in the DATA step (excluding WHERE clauses), or in the SQL procedure, can make COMPRESS run much faster when you call it in a loop where the second and third arguments do not change.
   p or P
      adds punctuation marks to the list of characters.
   s or S
      adds space characters (blank, horizontal tab, vertical tab, carriage return, line feed, and form feed) to the list of characters.
   t or T
      trims trailing blanks from the first and second arguments.
   u or U
      adds uppercase letters to the list of characters.
   w or W
      adds printable characters to the list of characters.
   x or X
      adds hexadecimal characters to the list of characters.
   Tip: If the modifier is a constant, enclose it in quotation marks. Specify multiple constants in a single set of quotation marks. Modifier can also be expressed as a variable or an expression.
Details Length of Returned Variable In a DATA step, if the COMPRESS function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the first argument. The Basics The COMPRESS function allows null arguments. A null argument is treated as a string that has a length of zero. Based on the number of arguments, the COMPRESS functions works as follows:
586
COMPRESS Function
4
Chapter 4
Number of Arguments | Result
only the first argument, source | The argument has all blanks removed. If the argument is completely blank, then the result is a string with a length of zero. If you assign the result to a character variable with a fixed length, then the value of that variable will be padded with blanks to fill its defined length.
the first two arguments, source and chars | All characters that appear in the second argument are removed from the result.
three arguments, source, chars, and modifier(s) | The K modifier (specified in the third argument) determines whether the characters in the second argument are kept or removed from the result.
The COMPRESS function compiles a list of characters to keep or remove, comprising the characters in the second argument plus any types of characters that are specified by the modifiers. For example, the D modifier specifies digits. Both of the following function calls remove digits from the result:
   COMPRESS(source, "1234567890");
   COMPRESS(source, , "d");
To remove digits and plus or minus signs, you can use either of the following function calls:
   COMPRESS(source, "1234567890+-");
   COMPRESS(source, "+-", "d");
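The way the chars list and the modifiers combine can be sketched in a short DATA step. This is an illustrative sketch only: the sample string and the variable names no_digits and only_digits are made up for this illustration.

```sas
data _null_;
   source = 'Order #A-1042, qty +3';
   /* Remove plus and minus signs (chars list) and digits (D modifier) */
   no_digits = compress(source, '+-', 'd');
   /* K keeps the listed characters instead of removing them; */
   /* D adds digits to the list, so only digits survive       */
   only_digits = compress(source, , 'kd');
   put no_digits= only_digits=;
run;
```

Following the rules described above, no_digits should contain the text without digits and signs, and only_digits should contain 10423.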
Examples
Example 1: Compressing Blanks
SAS Statements | Results
 | ----+----1
a='AB C D '; b=compress(a); put b; | ABCD

Example 2: Compressing Lowercase Letters
SAS Statements | Results
 | ----+----1----+----2----+----3
x='123-4567-8901 B 234-5678-9012 c'; y=compress(x,'ABCD','l'); put y; | 123-4567-8901 234-5678-9012

Example 3: Compressing Space Characters
SAS Statements | Results
 | ----+----1
x='1 2 3 4 5'; y=compress(x,,'s'); put y; | 12345

Example 4: Keeping Characters in the List
SAS Statements | Results
 | ----+----1
x='Math A English B Physics A'; y=compress(x,'ABCD','k'); put y; | ABA
Example 5: Compressing a String and Returning a Length of 0
SAS Statements | Results
 | ----+----1
x=' '; l=lengthn(compress(x)); put l; | 0

See Also
Functions:
“COMPBL Function” on page 574
“LEFT Function” on page 854
“TRIM Function” on page 1132

CONSTANT Function
Computes machine and mathematical constants.
Category: Mathematical
Syntax
CONSTANT(constant<, parameter>)
Arguments constant
is a character constant, variable, or expression that identifies the constant to be returned. Valid constants are as follows:
Description | Constant
The natural base | 'E'
Euler constant | 'EULER'
Pi | 'PI'
Exact integer | 'EXACTINT'
The largest double-precision number | 'BIG'
The log with respect to base of BIG | 'LOGBIG'
The square root of BIG | 'SQRTBIG'
The smallest double-precision number | 'SMALL'
The log with respect to base of SMALL | 'LOGSMALL'
The square root of SMALL | 'SQRTSMALL'
Machine precision constant | 'MACEPS'
The log with respect to base of MACEPS | 'LOGMACEPS'
The square root of MACEPS | 'SQRTMACEPS'
parameter
is an optional numeric parameter. Some of the constants specified in constant have an optional argument that alters the functionality of the CONSTANT function.
Details
CAUTION: In some operating environments, the run-time library might have limitations that prevent the use of the full range of floating-point numbers that the hardware provides. In such cases, the CONSTANT function attempts to return values that are compatible with the limitations of the run-time library. For example, if the run-time library cannot compute EXP(LOG(CONSTANT('BIG'))), then CONSTANT('LOGBIG') will not return the same value as LOG(CONSTANT('BIG')), but will return a value such that EXP(CONSTANT('LOGBIG')) can be computed.
The natural base
CONSTANT('E')
The natural base is described by the following equation:
\[
\lim_{x \to 0} (1 + x)^{1/x} \approx 2.718281828459045
\]
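Euler constant
CONSTANT('EULER')
Euler's constant is described by the following equation:
\[
\lim_{n \to \infty} \left( \sum_{j=1}^{n} \frac{1}{j} - \log(n) \right) \approx 0.5772156649015329
\]
Machine precision constant
CONSTANT('MACEPS')
This case returns the smallest double-precision floating-point number \(\epsilon = 2^{-j}\) for some integer j, such that \(1 + \epsilon > 1\). This constant is important in finite precision computations.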
The logarithm of MACEPS
CONSTANT('LOGMACEPS'<, base>)
where base is a numeric value that is the base of the logarithm.
Restriction: The base that you specify must be greater than the value of 1+SQRTMACEPS.
Default: the natural base, E.
This case returns the logarithm with respect to base of CONSTANT('MACEPS').
The square root of MACEPS
CONSTANT('SQRTMACEPS')
This case returns the square root of CONSTANT('MACEPS').
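A minimal sketch of how these constants might be retrieved in a DATA step follows. The variable names are illustrative, and the base of 2 passed to 'LOGMACEPS' is an arbitrary choice that satisfies the restriction stated above.

```sas
data _null_;
   e      = constant('e');
   pi     = constant('pi');
   maceps = constant('maceps');
   /* Logarithm of MACEPS with an explicit base of 2 */
   log2eps = constant('logmaceps', 2);
   /* The same quantity computed directly, for comparison */
   check = log2(maceps);
   put e= pi= maceps= log2eps= check=;
run;
```

If the constants behave as described, log2eps and check should agree.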
CONVX Function Returns the convexity for an enumerated cash flow. Category: Financial
Syntax CONVX(y,f,c(1), ... ,c(k))
Arguments
y
specifies the effective per-period yield-to-maturity, expressed as a fraction. Range:
00
c
specifies the nominal per-period coupon rate, expressed as a fraction. Range:
0
c 0 and is an integer
K
specifies the number of remaining coupons.
Range: K > 0 and is an integer
k0
specifies the time from the present date to the first coupon date, expressed in terms of the number of periods.
Range: 0 < k0 ≤ 1/n
y
specifies the nominal per-period yield-to-maturity, expressed as a fraction.
Range: y > 0
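Details
The CONVXP function returns the value
\[
C = \frac{1}{n^{2}} \left( \frac{\displaystyle\sum_{k=1}^{K} t_k (t_k + 1) \frac{c(k)}{\left(1 + \frac{y}{n}\right)^{t_k}}}{P \left(1 + \frac{y}{n}\right)^{2}} \right)
\]
where
\[
\begin{aligned}
t_k &= n k_0 + k - 1 \\
c(k) &= \frac{c}{n} A \quad \text{for } k = 1, \ldots, K - 1 \\
c(K) &= \left(1 + \frac{c}{n}\right) A
\end{aligned}
\]
and where
\[
P = \sum_{k=1}^{K} \frac{c(k)}{\left(1 + \frac{y}{n}\right)^{t_k}}
\]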
Examples
In the following example, the CONVXP function returns the convexity of a bond that has a face value of 1000, an annual coupon rate of 0.01, 4 coupons per year, and 14 remaining coupons. The time from settlement date to next coupon date is 0.165, and the annual yield-to-maturity is 0.08.
   data _null_;
      y=convxp(1000,.01,4,14,.33/2,.08);
      put y;
   run;
The value that is returned is 11.729001987.
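As a cross-check, the convexity can also be accumulated term by term. The sketch below assumes the argument order A, c, n, K, k0, y implied by the example, and uses the formula from the Details section: a sum of discounted cash flows weighted by t(t+1), divided by n squared, the price P, and (1 + y/n) squared. The names P, num, cf, pv, and manual are illustrative.

```sas
data _null_;
   A = 1000; c = .01; n = 4; K = 14; k0 = .33/2; y = .08;
   P = 0;      /* price: sum of discounted cash flows       */
   num = 0;    /* sum of t*(t+1) times discounted cash flow */
   do j = 1 to K;
      t = n*k0 + j - 1;
      /* The last cash flow includes the face value */
      if j < K then cf = (c/n)*A;
      else cf = (1 + c/n)*A;
      pv = cf / (1 + y/n)**t;
      P = P + pv;
      num = num + t*(t + 1)*pv;
   end;
   manual  = num / (n**2 * P * (1 + y/n)**2);
   builtin = convxp(A, c, n, K, k0, y);
   put manual= builtin=;
run;
```

The loop index is named j rather than k because SAS variable names are not case sensitive, so k would collide with K.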
COS Function
Returns the cosine.
Category: Trigonometric
Syntax
COS(argument)
Arguments
argument
specifies a numeric constant, variable, or expression and is expressed in radians. If the magnitude of argument is so great that mod(argument,pi) is accurate to less than about three decimal places, COS returns a missing value.
Examples SAS Statements
Results
x=cos(0.5);
0.8775825619
x=cos(0);
1
x=cos(3.14159/3);
0.500000766
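Because the argument is expressed in radians, angles given in degrees need to be converted first. A minimal sketch, assuming CONSTANT('PI') as the source of pi (the variable names are illustrative):

```sas
data _null_;
   pi = constant('pi');
   do degrees = 0 to 180 by 60;
      /* Convert degrees to radians before calling COS */
      radians = degrees * pi / 180;
      cosine  = cos(radians);
      put degrees= radians= cosine=;
   end;
run;
```

Using the full-precision value of pi avoids the small error visible in the cos(3.14159/3) example above.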
COSH Function
Returns the hyperbolic cosine.
Category: Hyperbolic
Syntax COSH(argument)
Arguments argument
specifies a numeric constant, variable, or expression.
Details
The COSH function returns the hyperbolic cosine of the argument, given by
\[
\frac{e^{\mathit{argument}} + e^{-\mathit{argument}}}{2}
\]
Examples SAS Statements
Results
x=cosh(0);
1
x=cosh(-5.0);
74.209948525
x=cosh(0.5);
1.1276259652
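The definition can be checked against the function directly. This is a small sketch that reuses the arguments from the examples above; manual and builtin are illustrative names.

```sas
data _null_;
   do x = 0, -5.0, 0.5;
      /* Hyperbolic cosine from its exponential definition */
      manual  = (exp(x) + exp(-x)) / 2;
      builtin = cosh(x);
      put x= manual= builtin=;
   end;
run;
```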
COUNT Function
Counts the number of times that a specified substring appears within a character string.
Category: Character
Restriction: “I18N Level 1” on page 305
Tip: You can use the KCOUNT function in SAS National Language Support (NLS): Reference Guide for DBCS processing, but the functionality is different. See “DBCS Compatibility” on page 596.
Syntax
COUNT(string, substring<, modifiers>)
Arguments
string
specifies a character constant, variable, or expression in which substrings are to be counted.
Tip: Enclose a literal string of characters in quotation marks.
substring
is a character constant, variable, or expression that specifies the substring of characters to count in string.
Tip: Enclose a literal string of characters in quotation marks.
modifiers
is a character constant, variable, or expression that specifies one or more modifiers. The following modifiers can be in uppercase or lowercase:
i
ignores character case during the count. If this modifier is not specified, COUNT only counts character substrings with the same case as the characters in substring.
t
trims trailing blanks from string and substring.
Tip: If the modifier is a constant, enclose it in quotation marks. Specify multiple constants in a single set of quotation marks. The modifier can also be expressed as a variable or an expression.
Details
The Basics
The COUNT function searches string, from left to right, for the number of occurrences of the specified substring, and returns that number of occurrences. If the substring is not found in string, COUNT returns a value of 0.
CAUTION: If two occurrences of the specified substring overlap in the string, the result is undefined. For example, COUNT('boobooboo', 'booboo') might return either a 1 or a 2.
DBCS Compatibility
You can use the KCOUNT function, which is documented in SAS National Language Support (NLS): Reference Guide, for DBCS processing, but the functionality is different. If the value of substring in the COUNT function is longer than two bytes, then the COUNT function can handle DBCS strings. The following examples show the differences in syntax:
   COUNT(string, substring<, modifiers>)
   KCOUNT(string)
Comparisons The COUNT function counts substrings of characters in a character string, whereas the COUNTC function counts individual characters in a character string.
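The difference between the two functions can be seen by passing the same arguments to both. A small sketch with an illustrative string (the variable names are made up for this example):

```sas
data _null_;
   string = 'banana';
   /* COUNT treats 'an' as one substring: it occurs 2 times */
   n_substr = count(string, 'an');
   /* COUNTC treats 'an' as a list of characters: a and n occur 5 times */
   n_chars = countc(string, 'an');
   put n_substr= n_chars=;
run;
```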
Examples
The following SAS statements produce these results:
SAS Statements | Results
xyz='This is a thistle? Yes, this is a thistle.'; howmanythis=count(xyz,'this'); put howmanythis; | 3
xyz='This is a thistle? Yes, this is a thistle.'; howmanyis=count(xyz,'is'); put howmanyis; | 6
howmanythis_i=count('This is a thistle? Yes, this is a thistle.','this','i'); put howmanythis_i; | 4
variable1='This is a thistle? Yes, this is a thistle.'; variable2='is '; variable3='i'; howmanyis_i=count(variable1,variable2,variable3); put howmanyis_i; | 4
expression1='This is a thistle? '||'Yes, this is a thistle.'; expression2=kscan('This is',2)||' '; expression3=compress('i '||' t'); howmanyis_it=count(expression1,expression2,expression3); put howmanyis_it; | 6
See Also
Functions:
“COUNTC Function” on page 597
“COUNTW Function” on page 600
“FIND Function” on page 710
“INDEX Function” on page 790

COUNTC Function
Counts the number of characters in a string that appear or do not appear in a list of characters.
Category: Character
Restriction: “I18N Level 1” on page 305
Syntax
COUNTC(string, charlist<, modifiers>)
Arguments
string
specifies a character constant, variable, or expression in which characters are counted.
Tip: Enclose a literal string of characters in quotation marks.
charlist
specifies a character constant, variable, or expression that initializes a list of characters. COUNTC counts characters in this list, provided that you do not specify the V modifier in the modifier argument. If you specify the V modifier, then all characters that are not in this list are counted. You can add more characters to the list by using other modifiers.
Tip: Enclose a literal string of characters in quotation marks.
Tip: If there are no characters in the list after processing the modifiers, COUNTC returns 0.
modifier
specifies a character constant, variable, or expression in which each non-blank character modifies the action of the COUNTC function. Blanks are ignored. The following characters, in uppercase or lowercase, can be used as modifiers:
blank
is ignored.
a or A
adds alphabetic characters to the list of characters.
b or B
scans string from right to left, instead of from left to right.
c or C
adds control characters to the list of characters.
d or D
adds digits to the list of characters.
f or F
adds an underscore and English letters ( that is, the characters that can begin a SAS variable name using VALIDVARNAME=V7) to the list of characters.
g or G
adds graphic characters to the list of characters.
h or H
adds a horizontal tab to the list of characters.
i or I
ignores case.
l or L
adds lowercase letters to the list of characters.
n or N
adds digits, an underscore, and English letters (that is, the characters that can appear in a SAS variable name using VALIDVARNAME=V7) to the list of characters.
o or O
processes the charlist and modifier arguments only once, at the first call to this instance of COUNTC. If you change the value of charlist or modifier in subsequent calls, the change might be ignored by COUNTC.
p or P
adds punctuation marks to the list of characters.
s or S
adds space characters to the list of characters (blank, horizontal tab, vertical tab, carriage return, line feed, and form feed).
t or T
trims trailing blanks from string and charlist.
Tip: If you want to remove trailing blanks from only one character argument instead of both (or all) character arguments, use the TRIM function instead of the COUNTC function with the T modifier.
u or U
adds uppercase letters to the list of characters.
v or V
counts characters that do not appear in the list of characters. If you do not specify this modifier, then COUNTC counts characters that do appear in the list of characters.
w or W
adds printable characters to the list of characters.
x or X
adds hexadecimal characters to the list of characters.
Tip: If modifier is a constant, enclose it in quotation marks. Specify multiple constants in a single set of quotation marks.
Details The COUNTC function allows character arguments to be null. Null arguments are treated as character strings with a length of zero. If there are no characters in the list of characters to be counted, COUNTC returns zero.
Comparisons The COUNTC function counts individual characters in a character string, whereas the COUNT function counts substrings of characters in a character string.
Examples
The following example uses the COUNTC function with and without modifiers to count the number of characters in a string.
   data test;
      string = 'Baboons Eat Bananas     ';
      a = countc(string, 'a');
      b = countc(string,'b');
      b_i = countc(string,'b','i');
      abc_i = countc(string,'abc','i');
      /* Scan string for characters that are not "a", "b", */
      /* and "c", ignore case, (and include blanks).       */
      abc_iv = countc(string,'abc','iv');
      /* Scan string for characters that are not "a", "b", */
      /* and "c", ignore case, and trim trailing blanks.   */
      abc_ivt = countc(string,'abc','ivt');
   run;
   options pageno=1 ls=80 nodate;
   proc print data=test noobs;
   run;
Output 4.43 Output from Using the COUNTC Functions with and without Modifiers
The SAS System                                                                1
string | a | b | b_i | abc_i | abc_iv | abc_ivt
Baboons Eat Bananas | 5 | 1 | 3 | 8 | 16 | 11
See Also
Functions:
“ANYALNUM Function” on page 367
“ANYALPHA Function” on page 369
“ANYCNTRL Function” on page 371
“ANYDIGIT Function” on page 373
“ANYGRAPH Function” on page 376
“ANYLOWER Function” on page 378
“ANYPRINT Function” on page 381
“ANYPUNCT Function” on page 383
“ANYSPACE Function” on page 385
“ANYUPPER Function” on page 387
“ANYXDIGIT Function” on page 388
“NOTALNUM Function” on page 918
“NOTALPHA Function” on page 920
“NOTCNTRL Function” on page 922
“NOTDIGIT Function” on page 924
“NOTGRAPH Function” on page 929
“NOTLOWER Function” on page 931
“NOTPRINT Function” on page 934
“NOTPUNCT Function” on page 935
“NOTSPACE Function” on page 937
“NOTUPPER Function” on page 940
“NOTXDIGIT Function” on page 941
“FINDC Function” on page 712
“INDEXC Function” on page 792
“VERIFY Function” on page 1153
COUNTW Function
Counts the number of words in a character string.
Category: Character
Syntax
COUNTW(<string><, chars><, modifiers>)
Arguments
string
specifies a character constant, variable, or expression in which words are counted.
chars
specifies an optional character constant, variable, or expression that initializes a list of characters. The characters in this list are the delimiters that separate words, provided that you do not use the K modifier in the modifier argument. If you specify the K modifier, then all characters that are not in this list are delimiters. You can add more characters to the list by using other modifiers.
modifier
specifies a character constant, variable, or expression in which each non-blank character modifies the action of the COUNTW function. The following characters, in uppercase or lowercase, can be used as modifiers:
blank
is ignored.
a or A
adds alphabetic characters to the list of characters.
b or B
counts from right to left instead of from left to right. Right-to-left counting makes a difference only when you use the Q modifier and the string contains unbalanced quotation marks.
c or C
adds control characters to the list of characters.
d or D
adds digits to the list of characters.
f or F
adds an underscore and English letters (that is, the characters that can begin a SAS variable name using VALIDVARNAME=V7) to the list of characters.
g or G
adds graphic characters to the list of characters.
h or H
adds a horizontal tab to the list of characters.
i or I
ignores the case of the characters.
k or K
causes all characters that are not in the list of characters to be treated as delimiters. If K is not specified, then all characters that are in the list of characters are treated as delimiters.
l or L
adds lowercase letters to the list of characters.
m or M
specifies that multiple consecutive delimiters, and delimiters at the beginning or end of the string argument, refer to words that have a length of zero. If the M modifier is not specified, then multiple consecutive delimiters are treated as one delimiter, and delimiters at the beginning or end of the string argument are ignored.
n or N
adds digits, an underscore, and English letters (that is, the characters that can appear after the first character in a SAS variable name using VALIDVARNAME=V7) to the list of characters.
o or O
processes the chars and modifier arguments only once, rather than every time the COUNTW function is called. Using the O modifier in the DATA step (excluding WHERE clauses), or in the SQL procedure, can make COUNTW run faster when you call it in a loop where chars and modifier arguments do not change.
p or P
adds punctuation marks to the list of characters.
q or Q
ignores delimiters that are inside of substrings that are enclosed in quotation marks. If the value of string contains unmatched quotation marks, then scanning from left to right will produce different words than scanning from right to left.
s or S
adds space characters (blank, horizontal tab, vertical tab, carriage return, line feed, and form feed) to the list of characters.
t or T
trims trailing blanks from the string and chars arguments.
u or U
adds uppercase letters to the list of characters.
w or W
adds printable characters to the list of characters.
x or X
adds hexadecimal digits to the list of characters.
Details
Definition of “Word”
In the COUNTW function, “word” refers to a substring that has all of the following characteristics:
• is bounded on the left by a delimiter or the beginning of the string
• is bounded on the right by a delimiter or the end of the string
• contains no delimiters, except if you use the Q modifier and the delimiters are within substrings that have quotation marks
Note: The definition of “word” is the same in both the SCAN function and the COUNTW function.
Delimiter refers to any of several characters that you can specify to separate words.
Using the COUNTW Function in ASCII and EBCDIC Environments
If you use the COUNTW function with only one argument, the default delimiters depend on whether your computer uses ASCII or EBCDIC characters.
• If your computer uses ASCII characters, then the default delimiters are as follows:
   blank ! $ % & ( ) * + , - . / ; < ^ |
  In ASCII environments that do not contain the ^ character, the SCAN function uses the ~ character instead.
• If your computer uses EBCDIC characters, then the default delimiters are as follows:
   blank ! $ % & ( ) * + , - . / ; < ¬ | ¢
Using Null Arguments
The COUNTW function allows character arguments to be null. Null arguments are treated as character strings with a length of zero. Numeric arguments cannot be null.
Using the M Modifier
If you do not use the M modifier, then a word must contain at least one character. If you use the M modifier, then a word can have a length of zero. In this case, the number of words is one plus the number of delimiters in the string, not counting delimiters inside of strings that are enclosed in quotation marks when you use the Q modifier.
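The effect of the M modifier on consecutive delimiters can be sketched as follows. The string and variable names are illustrative; note that the modifier is passed as the third argument.

```sas
data _null_;
   string = 'a,,b';
   /* Without M, consecutive commas act as one delimiter: 2 words */
   without_m = countw(string, ',');
   /* With M, the count is 1 plus the 2 delimiters: 3 words, */
   /* one of which has a length of zero                      */
   with_m = countw(string, ',', 'm');
   put without_m= with_m=;
run;
```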
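Examples
The following example shows how to use the COUNTW function with the default delimiters, with a blank as the only delimiter, and with the characters m and p as delimiters. Because 'mp' is passed as the second argument, it is treated as the chars list, not as a list of modifiers. To specify the M and P modifiers, pass them as the third argument.
   options ls=64 pageno=1 nodate;
   data test;
      length default blanks mp 8;
      input string $char60.;
      default = countw(string);
      blanks = countw(string, ' ');
      mp = countw(string, 'mp');
      datalines;
The quick brown fox jumps over the lazy dog.
        Leading blanks
2+2=4
/unix/path/names/use/slashes
\Windows\Path\Names\Use\Backslashes
;
   run;
   proc print noobs data=test;
   run;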
Output 4.44 Output from the COUNTW Function
The SAS System                                                 1
default | blanks | mp | string
9 | 9 | 2 | The quick brown fox jumps over the lazy dog.
2 | 2 | 1 |         Leading blanks
2 | 1 | 1 | 2+2=4
5 | 1 | 3 | /unix/path/names/use/slashes
1 | 1 | 2 | \Windows\Path\Names\Use\Backslashes
See Also Functions and CALL Routines: “COUNT Function” on page 595 “COUNTC Function” on page 597 “FINDW Function” on page 719 “SCAN Function” on page 1073 “CALL SCAN Routine” on page 499
CSS Function
Returns the corrected sum of squares.
Category: Descriptive Statistics
Syntax
CSS(argument-1<,...argument-n>)
Arguments
argument
specifies a numeric constant, variable, or expression. At least one nonmissing argument is required. Otherwise, the function returns a missing value. If you have more than one argument, the argument list can consist of a variable list, which is preceded by OF.
Examples SAS Statements
Results
x1=css(5,9,3,6);
18.75
x2=css(5,8,9,6,.);
10
x3=css(8,9,6,.);
4.6666666667
x4=css(of x1-x3);
101.11574074
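The corrected sum of squares is the sum of squared deviations from the mean. The following sketch recomputes the first example by hand; the array v and the names m, manual, and builtin are illustrative.

```sas
data _null_;
   array v{4} _temporary_ (5 9 3 6);
   m = mean(of v{*});
   manual = 0;
   do i = 1 to dim(v);
      /* Accumulate the squared deviation from the mean */
      manual = manual + (v{i} - m)**2;
   end;
   builtin = css(of v{*});
   put manual= builtin=;
run;
```

Both values should match the 18.75 shown for css(5,9,3,6) above.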
CUROBS Function
Returns the observation number of the current observation.
Category: SAS File I/O
Requirement: Use this function only with an uncompressed SAS data set that is accessed using a native library engine.
Syntax
CUROBS(data-set-id)
Arguments
data-set-id
is a numeric value that specifies the data set identifier that the OPEN function returns.
Details If the engine being used does not support observation numbers, the function returns a missing value. With a SAS view, the function returns the relative observation number, that is, the number of the observation within the SAS view (as opposed to the number of the observation within any related SAS data set).
Examples
This example uses the FETCHOBS function to fetch the tenth observation in the data set MYDATA. The value of OBSNUM returned by CUROBS is 10.
   %let dsid=%sysfunc(open(mydata,i));
   %let rc=%sysfunc(fetchobs(&dsid,10));
   %let obsnum=%sysfunc(curobs(&dsid));
See Also Functions: “FETCHOBS Function” on page 662 “OPEN Function” on page 948
CV Function
Returns the coefficient of variation.
Category: Descriptive Statistics
Syntax
CV(argument-1,argument-2<,...argument-n>)
Arguments argument
specifies a numeric constant, variable, or expression. At least two arguments are required. The argument list can consist of a variable list, which is preceded by OF.
Examples SAS Statements
Results
x1=cv(5,9,3,6);
43.47826087
x2=cv(5,8,9,6,.);
26.082026548
x3=cv(8,9,6,.);
19.924242152
x4=cv(of x1-x3);
40.953539216
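The coefficient of variation is the standard deviation expressed as a percentage of the mean. A brief sketch that recomputes the first example from the STD and MEAN functions (manual and builtin are illustrative names):

```sas
data _null_;
   /* 100 times the standard deviation divided by the mean */
   manual  = 100 * std(5,9,3,6) / mean(5,9,3,6);
   builtin = cv(5,9,3,6);
   put manual= builtin=;
run;
```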
DACCDB Function
Returns the accumulated declining balance depreciation.
Category: Financial
Syntax
DACCDB(p,v,y,r)
Arguments
p
is numeric, the period for which the calculation is to be done. For noninteger p arguments, the depreciation is prorated between the two consecutive time periods that precede and follow the fractional period.
v
is numeric, the depreciable initial value of the asset.
y
is numeric, the lifetime of the asset.
Range: y > 0
r
is numeric, the rate of depreciation expressed as a decimal.
Range: r > 0
Details
The DACCDB function returns the accumulated depreciation by using a declining balance method. The formula is
\[
\mathrm{DACCDB}(p, v, y, r) =
\begin{cases}
0 & p \le 0 \\
v \left( 1 - \left(1 - \frac{r}{y}\right)^{\mathrm{int}(p)} \left(1 - (p - \mathrm{int}(p)) \frac{r}{y}\right) \right) & p > 0
\end{cases}
\]
Note that int(p) is the integer part of p. The p and y arguments must be expressed by using the same units of time. A double-declining balance is obtained by setting r equal to 2.
Examples
An asset has a depreciable initial value of $1000 and a fifteen-year lifetime. Using a 200 percent declining balance, the depreciation throughout the first 10 years can be expressed as
   a=daccdb(10,1000,15,2);
The value returned is 760.93. The first and the third arguments are expressed in years.
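The declining balance formula for p > 0 can be evaluated directly and compared with the function. This is a sketch using the values from the example; manual and builtin are illustrative names.

```sas
data _null_;
   p = 10; v = 1000; y = 15; r = 2;
   /* v * (1 - (1 - r/y)**int(p) * (1 - (p - int(p))*r/y)) */
   manual  = v * (1 - (1 - r/y)**int(p) * (1 - (p - int(p))*r/y));
   builtin = daccdb(p, v, y, r);
   put manual= builtin=;
run;
```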
DACCDBSL Function
Returns the accumulated declining balance with conversion to a straight-line depreciation.
Category: Financial
Syntax
DACCDBSL(p,v,y,r)
Arguments
p
is numeric, the period for which the calculation is to be done.
v
is numeric, the depreciable initial value of the asset.
y
is an integer, the lifetime of the asset.
Range: y > 0
r
is numeric, the rate of depreciation that is expressed as a fraction.
Range: r > 0
Details
The DACCDBSL function returns the accumulated depreciation by using a declining balance method, with conversion to a straight-line depreciation function that is defined by
\[
\mathrm{DACCDBSL}(p, v, y, r) = \sum_{i=1}^{p} \mathrm{DEPDBSL}(i, v, y, r)
\]
The declining balance with conversion to a straight-line depreciation chooses for each time period the method of depreciation (declining balance or straight-line on the remaining balance) that gives the larger depreciation. The p and y arguments must be expressed by using the same units of time.
Examples
An asset has a depreciable initial value of $1,000 and a ten-year lifetime. Using a declining balance rate of 150 percent, the accumulated depreciation of that asset in its fifth year can be expressed as
   y5=daccdbsl(5,1000,10,1.5);
The value returned is 564.99. The first and the third arguments are expressed in years.
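Because the accumulated value is defined as a sum of per-period DEPDBSL values, the example can be reproduced with a loop. A minimal sketch (total and builtin are illustrative names):

```sas
data _null_;
   v = 1000; y = 10; r = 1.5;
   total = 0;
   do i = 1 to 5;
      /* Add the depreciation for period i */
      total = total + depdbsl(i, v, y, r);
   end;
   builtin = daccdbsl(5, v, y, r);
   put total= builtin=;
run;
```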
DACCSL Function
Returns the accumulated straight-line depreciation.
Category: Financial
Syntax
DACCSL(p,v,y)
Arguments
p
is numeric, the period for which the calculation is to be done. For fractional p, the depreciation is prorated between the two consecutive time periods that precede and follow the fractional period.
v
is numeric, the depreciable initial value of the asset.
y
is numeric, the lifetime of the asset.
Range: y > 0
Details
The DACCSL function returns the accumulated depreciation by using the straight-line method, which is given by
\[
\mathrm{DACCSL}(p, v, y) =
\begin{cases}
0 & p < 0 \\
v \left( \frac{p}{y} \right) & 0 \le p \le y \\
v & p > y
\end{cases}
\]
The p and y arguments must be expressed by using the same units of time.
Examples An asset, acquired on 01APR86, has a depreciable initial value of $1000 and a ten-year lifetime. The accumulated depreciation in the value of the asset through 31DEC87 can be expressed as a=daccsl(1.75,1000,10);
The value returned is 175.00. The first and the third arguments are expressed in years.
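The piecewise definition can be sketched as a set of IF/ELSE branches and compared with the function at a few periods. The chosen periods and the variable names are illustrative.

```sas
data _null_;
   v = 1000; y = 10;
   do p = 0, 1.75, 12;
      /* Straight-line accumulation, capped at the initial value */
      if p < 0 then manual = 0;
      else if p <= y then manual = v * p / y;
      else manual = v;
      builtin = daccsl(p, v, y);
      put p= manual= builtin=;
   end;
run;
```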
DACCSYD Function
Returns the accumulated sum-of-years-digits depreciation.
Category: Financial
Syntax
DACCSYD(p,v,y)
Arguments
p
is numeric, the period for which the calculation is to be done. For noninteger p arguments, the depreciation is prorated between the two consecutive time periods that precede and follow the fractional period.
v
is numeric, the depreciable initial value of the asset.
y
is numeric, the lifetime of the asset.
Range: y > 0
Details
The DACCSYD function returns the accumulated depreciation by using the sum-of-years-digits method. The formula is
\[
\mathrm{DACCSYD}(p, v, y) =
\begin{cases}
0 & p < 0 \\
v \, \dfrac{\mathrm{int}(p)\left(y - \frac{\mathrm{int}(p) - 1}{2}\right) + (p - \mathrm{int}(p))(y - \mathrm{int}(p))}{\mathrm{int}(y)\left(y - \frac{\mathrm{int}(y) - 1}{2}\right) + (y - \mathrm{int}(y))^{2}} & 0 \le p \le y \\
v & p > y
\end{cases}
\]
Note that int(y) indicates the integer part of y. The p and y arguments must be expressed by using the same units of time.
Examples
An asset, acquired on 01OCT86, has a depreciable initial value of $1,000 and a five-year lifetime. The accumulated depreciation of the asset throughout 01JAN88 can be expressed as
   y2=daccsyd(15/12,1000,5);
The value returned is 400.00. The first and the third arguments are expressed in years.
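For 0 ≤ p ≤ y, the sum-of-years-digits ratio can be evaluated by hand. The sketch below uses the values from the example; num, den, manual, and builtin are illustrative names.

```sas
data _null_;
   p = 15/12; v = 1000; y = 5;
   /* Numerator: digits consumed through period p */
   num = int(p)*(y - (int(p) - 1)/2) + (p - int(p))*(y - int(p));
   /* Denominator: sum of all digits over the lifetime y */
   den = int(y)*(y - (int(y) - 1)/2) + (y - int(y))**2;
   manual  = v * num / den;
   builtin = daccsyd(p, v, y);
   put manual= builtin=;
run;
```

For these values the numerator is 6 and the denominator is 15, which gives the 400.00 shown above.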
DACCTAB Function
Returns the accumulated depreciation from specified tables.
Category: Financial
Syntax
DACCTAB(p,v,t1, . . . ,tn)
Arguments
p
is numeric, the period for which the calculation is to be done. For noninteger p arguments, the depreciation is prorated between the two consecutive time periods that precede and follow the fractional period.
v
is numeric, the depreciable initial value of the asset.
t1,t2, . . . ,tn
are numeric, the fractions of depreciation for each time period with t1+t2+...+tn ≤ 1.
Details
The DACCTAB function returns the accumulated depreciation by using user-specified tables. The formula for this function is
\[
\mathrm{DACCTAB}(p, v, t_1, t_2, \ldots, t_n) =
\begin{cases}
0 & p \le 0 \\
v \left( t_1 + t_2 + \cdots + t_{\mathrm{int}(p)} + (p - \mathrm{int}(p)) \, t_{\mathrm{int}(p)+1} \right) & p > 0
\end{cases}
\]
For a given p, only the arguments t1, t2, . . . , tk need to be specified with k=ceil(p).
r
is numeric, the rate of depreciation that is expressed as a fraction.
Range: r ≥ 0
Details
The DEPDB function returns the depreciation by using the declining balance method, which is given by
\[
\mathrm{DEPDB}(p, v, y, r) = \mathrm{DACCDB}(p, v, y, r) - \mathrm{DACCDB}(p - 1, v, y, r)
\]
The p and y arguments must be expressed by using the same units of time. A double-declining balance is obtained by setting r equal to 2.
Examples
An asset has an initial value of $1,000 and a fifteen-year lifetime. Using a declining balance rate of 200 percent, the depreciation of the value of the asset for the tenth year can be expressed as
   y10=depdb(10,1000,15,2);
The value returned is 36.78. The first and the third arguments are expressed in years.
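Since DEPDB is defined as the difference of two accumulated values, the example can be cross-checked with DACCDB. A short sketch (by_difference and builtin are illustrative names):

```sas
data _null_;
   v = 1000; y = 15; r = 2;
   /* Depreciation in year 10 as accumulated(10) minus accumulated(9) */
   by_difference = daccdb(10, v, y, r) - daccdb(9, v, y, r);
   builtin       = depdb(10, v, y, r);
   put by_difference= builtin=;
run;
```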
DEPDBSL Function
Returns the declining balance with conversion to a straight-line depreciation.
Category: Financial
Syntax
DEPDBSL(p,v,y,r)
Arguments
p
is an integer, the period for which the calculation is to be done.
v
is numeric, the depreciable initial value of the asset.
y
is an integer, the lifetime of the asset.
Range: y > 0
r
is numeric, the rate of depreciation that is expressed as a fraction.
Range: r ≥ 0
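Details
The DEPDBSL function returns the depreciation by using the declining balance method with conversion to a straight-line depreciation, which is given by the following equation:
\[
\mathrm{DEPDBSL}(p, v, y, r) =
\begin{cases}
0 & p < 0 \\
v \, \dfrac{r}{y} \left(1 - \dfrac{r}{y}\right)^{p-1} & p \le t \\
\dfrac{v \left(1 - \frac{r}{y}\right)^{t}}{y - t} & p > t \\
0 & p > y
\end{cases}
\]
where
\[
t = \mathrm{int}\left(y - \frac{y}{r} + 1\right)
\]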
Details
The DEPSL function returns the straight-line depreciation, which is given by
\[
\mathrm{DEPSL}(p, v, y) = \mathrm{DACCSL}(p, v, y) - \mathrm{DACCSL}(p - 1, v, y)
\]
The p and y arguments must be expressed by using the same units of time.
Examples
An asset, acquired on 01APR86, has a depreciable initial value of $1,000 and a ten-year lifetime. The depreciation in the value of the asset for the year 1986 can be expressed as
   d=depsl(9/12,1000,10);
The value returned is 75.00. The first and the third arguments are expressed in years.
DEPSYD Function
Returns the sum-of-years-digits depreciation.
Category: Financial
Syntax
DEPSYD(p,v,y)
Arguments
p
is numeric, the period for which the calculation is to be done. For noninteger p arguments, the depreciation is prorated between the two consecutive time periods that precede and follow the fractional period.
v
is numeric, the depreciable initial value of the asset.
y
is numeric, the lifetime of the asset in number of depreciation periods.
Range: y > 0
Details
The DEPSYD function returns the sum-of-years-digits depreciation, which is given by
\[
\mathrm{DEPSYD}(p, v, y) = \mathrm{DACCSYD}(p, v, y) - \mathrm{DACCSYD}(p - 1, v, y)
\]
The p and y arguments must be expressed by using the same units of time.
Examples
An asset, acquired on 01OCT86, has a depreciable initial value of $1,000 and a five-year lifetime. The depreciations in the value of the asset for the years 1986 and 1987 can be expressed as
   y1=depsyd(3/12,1000,5);
   y2=depsyd(15/12,1000,5);
The values returned are 83.33 and 316.67, respectively. The first and the third arguments are expressed in years.
DEPTAB Function
Returns the depreciation from specified tables.
Category: Financial
Syntax
DEPTAB(p,v,t1,...,tn)
Arguments
p
is numeric, the period for which the calculation is to be done. For noninteger p arguments, the depreciation is prorated between the two consecutive time periods that precede and follow the fractional period.
v
is numeric, the depreciable initial value of the asset.
t1,t2, . . . ,tn
are numeric, the fractions of depreciation for each time period with t1+t2+...+tn ≤ 1.
Details
The DEPTAB function returns the depreciation by using specified tables. The formula is
\[
\mathrm{DEPTAB}(p, v, t_1, t_2, \ldots, t_n) = \mathrm{DACCTAB}(p, v, t_1, t_2, \ldots, t_n) - \mathrm{DACCTAB}(p - 1, v, t_1, t_2, \ldots, t_n)
\]
For a given p, only the arguments t1, t2, . . . , tk need to be specified with k=ceil(p).
Examples An asset has a depreciable initial value of $1,000 and a five-year lifetime. Using a table of the annual depreciation rates of .15, .22, .21, .21, and .21 during the first, second, third, fourth, and fifth years, respectively, the depreciation in the third year can be expressed as
y3=deptab(3,1000,.15,.22,.21,.21,.21);
The value that is returned is 210.00. The fourth rate, .21, and the fifth rate, .21, can be omitted because they are not needed in the calculation.
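A minimal sketch of the point about omitted rates: the DATA step below calls DEPTAB once with all five rates and once with only the first three. The variable names full and short are illustrative; both calls are expected to agree with the 210.00 shown above.

```sas
data _null_;
   /* All five annual rates supplied */
   full  = deptab(3, 1000, .15, .22, .21, .21, .21);
   /* Only the first ceil(3) = 3 rates supplied */
   short = deptab(3, 1000, .15, .22, .21);
   put full= short=;
run;
```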
DEQUOTE Function Removes matching quotation marks from a character string that begins with a quotation mark, and deletes all characters to the right of the closing quotation mark. Category: Character Restriction: “I18N Level 2” on page 306
Syntax DEQUOTE(string)
Arguments string
specifies a character constant, variable, or expression.
Details Length of Returned Variable In a DATA step, if the DEQUOTE function returns a value to a variable that has not been previously assigned a length, then that variable is given the length of the argument.
The Basics The value that is returned by the DEQUOTE function is determined as follows:
• If the first character of string is not a single or double quotation mark, DEQUOTE returns string unchanged.
• If the first two characters of string are both single quotation marks or both double quotation marks, and the third character is not the same type of quotation mark, then DEQUOTE returns a result with a length of zero.
• If the first character of string is a single quotation mark, the DEQUOTE function removes that single quotation mark from the result. DEQUOTE then scans string from left to right, looking for more single quotation marks. Each pair of consecutive, single quotation marks is reduced to one single quotation mark. The first single quotation mark that does not have an ending quotation mark in string is removed and all characters to the right of that quotation mark are also removed.
• If the first character of string is a double quotation mark, the DEQUOTE function removes that double quotation mark from the result. DEQUOTE then scans string from left to right, looking for more double quotation marks. Each pair of consecutive, double quotation marks is reduced to one double quotation mark. The first double quotation mark that does not have an ending quotation mark in string is removed and all characters to the right of that quotation mark are also removed.
Note: If string is a constant enclosed by quotation marks, those quotation marks are not part of the value of string. Therefore, you do not need to use DEQUOTE to remove the quotation marks that denote a constant.
Examples This example demonstrates the use of DEQUOTE within a DATA step.
options pageno=1 nodate ls=80 ps=64;
data test;
   input string $60.;
   result = dequote(string);
   datalines;
No quotation marks, no change
No "leading" quotation marks, no change
"Matching double quotation marks are removed"
'Matching single quotation marks are removed'
"Paired ""quotation marks"" are reduced"
'Paired '' quotation marks '' are reduced'
"Single 'quotation marks' inside '' double'' quotation marks are unchanged"
'Double "quotation marks" inside ""single"" quotation marks are unchanged'
"No matching quotation mark, no problem
Don't remove this apostrophe
"Text after the matching quotation mark" is "deleted"
;
proc print noobs;
   title 'Input Strings and Output Results from DEQUOTE';
run;
Output 4.45 Removing Matching Quotation Marks with the DEQUOTE Function
Input Strings and Output Results from DEQUOTE
string
No quotation marks, no change
No "leading" quotation marks, no change
"Matching double quotation marks are removed"
'Matching single quotation marks are removed'
"Paired ""quotation marks"" are reduced"
'Paired '' quotation marks '' are reduced'
"Single 'quotation marks' inside '' double'' quotation marks
'Double "quotation marks" inside ""single"" quotation marks
"No matching quotation mark, no problem
Don't remove this apostrophe
"Text after the matching quotation mark" is "deleted"
result
No quotation marks, no change
No "leading" quotation marks, no change
Matching double quotation marks are removed
Matching single quotation marks are removed
Paired "quotation marks" are reduced
Paired ' quotation marks ' are reduced
Single 'quotation marks' inside '' double'' quotation marks
Double "quotation marks" inside ""single"" quotation marks
No matching quotation mark, no problem
Don't remove this apostrophe
Text after the matching quotation mark
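The zero-length rule is not exercised by the example above. The following is a small sketch of it; the variable name len is illustrative, and the use of LENGTHN to observe a zero-length result is a choice made here rather than something the example prescribes.

```sas
data _null_;
   /* The constant's value is two double quotation marks with nothing after them */
   len = lengthn(dequote('""'));
   put len=;
run;
```

Because the first two characters are both double quotation marks and there is no third character of the same type, DEQUOTE should return a zero-length result, so len is expected to be 0.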
DEVIANCE Function Returns the deviance based on a probability distribution. Category: Mathematical
Syntax DEVIANCE(distribution, variable, shape-parametersn 00
0
1
(
:
The Gamma Distribution DEVIANCE(’GAMMA’, variable, where variable is a numeric random variable.
), less-than (
>
Option | Character | Character Entity Reference | Description
lt | < | &lt; | The HTMLENCODE function encodes these characters by default. If you need to encode these characters only, then you do not need to specify the options argument. However, if you specify any value for the options argument, then the defaults are overridden, and you must explicitly specify the options for all of the characters you want to encode.
apos | ' | &apos; | Use this option to encode the apostrophe (') character in text that is used in an HTML or XML tag attribute.
quot | " | &quot; | Use this option to encode the double quotation mark (") character in text that is used in an HTML or XML tag attribute.
7bit | any character that is not represented in 7-bit ASCII encoding | &#xnnn; (Unicode) | nnn is a one or more digit hexadecimal number. Encode these characters to create HTML or XML that is easily transferred through communication paths that might support only 7-bit ASCII encodings (for example, ftp or e-mail).
Examples
SAS Statements | Results
htmlencode("John's test ") | John's test
htmlencode("John's test ",'apos') | John&apos;s test
htmlencode('John "Jon" Smith ','quot') | John &quot;Jon&quot; Smith
htmlencode("'A&B&C'",'amp lt gt apos') | &apos;A&amp;B&amp;C&apos;
htmlencode('80'x, '7bit') ('80'x is the euro symbol in Western European locales.) | &#x20AC; (20AC is the Unicode code point for the euro symbol.)
See Also Function: “HTMLDECODE Function” on page 781
IBESSEL Function Returns the value of the modified Bessel function. Category: Mathematical
Syntax IBESSEL(nu,x,kode)
Arguments
nu
specifies a numeric constant, variable, or expression. Range: nu ≥ 0
x
specifies a numeric constant, variable, or expression. Range: x ≥ 0
kode
is a numeric constant, variable, or expression that specifies a nonnegative integer.
Details The IBESSEL function returns the value of the modified Bessel function of order nu evaluated at x (Abramowitz, Stegun 1964; Amos, Daniel, Weston 1977). When kode equals 0, the Bessel function is returned. Otherwise, the value of the following function is returned:
$e^{-x} I_{nu}(x)$
Examples
SAS Statements | Results
x=ibessel(2,2,0); | 0.6889484477
x=ibessel(2,2,1); | 0.0932390333
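The two results are related by the exponential scaling described in Details. The sketch below, with illustrative variable names, compares the scaled value with exp(-x) times the unscaled value; for x=2 the product 0.6889484477 × exp(-2) is approximately 0.09324, which agrees with the second result.

```sas
data _null_;
   unscaled = ibessel(2, 2, 0);        /* kode=0: the Bessel function itself */
   scaled   = ibessel(2, 2, 1);        /* kode nonzero: exponentially scaled */
   check    = exp(-2) * unscaled;      /* should agree with scaled */
   put unscaled= scaled= check=;
run;
```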
IFC Function Returns a character value based on whether an expression is true, false, or missing. Category: Character Restriction: “I18N Level 2” on page 306
Syntax IFC(logical-expression, value-returned-when-true, value-returned-when-false <,value-returned-when-missing>)
Arguments
logical-expression
specifies a numeric constant, variable, or expression.
value-returned-when-true
specifies a character constant, variable, or expression that is returned when the value of logical-expression is true.
value-returned-when-false
specifies a character constant, variable, or expression that is returned when the value of logical-expression is false.
value-returned-when-missing
specifies a character constant, variable, or expression that is returned when the value of logical-expression is missing.
Details Length of Returned Variable In a DATA step, if the IFC function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. The Basics The IFC function uses conditional logic that enables you to select among several values based on the value of a logical expression. IFC evaluates the first argument, logical-expression. If logical-expression is true (that is, not zero and not missing), then IFC returns the value in the second argument. If logical-expression is a missing value, and you have a fourth argument, then IFC returns the value in the fourth argument. Otherwise, if logical-expression is false, IFC returns the value in the third argument. The IFC function is useful in DATA step expressions, and even more useful in WHERE clauses and other expressions where it is not convenient or possible to use an IF/THEN/ELSE construct.
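As a hedged illustration of the fourth argument, the sketch below passes a numeric variable directly as the logical expression, so that a missing value reaches IFC as missing. The data values and the variable name status are invented for the illustration.

```sas
data _null_;
   input name $ grade;
   /* grade is the logical expression: nonzero is true, 0 is false, . is missing */
   status = ifc(grade, 'Nonzero', 'Zero', 'Missing');
   put name= grade= status=;
   datalines;
Ann 90
Bo 0
Cy .
;
run;
```

A comparison such as grade>80 evaluates to 0 or 1 and never to missing, which is why the variable itself is used here to reach the fourth argument.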
Comparisons The IFC function is similar to the IFN function except that IFC returns a character value while IFN returns a numeric value.
Examples In the following example, IFC evaluates the expression grade>80 to implement the logic that determines the performance of several members on a team. The results are written to the SAS log.
data _null_;
   input name $ grade;
   performance = ifc(grade>80, 'Pass             ', 'Needs Improvement');
   put name= performance=;
   datalines;
John 74
Kareem 89
Kati 100
Maria 92
;
run;
Output 4.49 Partial SAS Log: IFC Function
name=John performance=Needs Improvement
name=Kareem performance=Pass
name=Kati performance=Pass
name=Maria performance=Pass
This example uses an IF/THEN/ELSE construct to generate the same output that is generated by the IFC function. The literal 'Pass' is padded with blanks so that the first assignment gives performance a length long enough to hold 'Needs Improvement'. The results are written to the SAS log.
data _null_;
   input name $ grade;
   if grade>80 then performance='Pass             ';
   else performance = 'Needs Improvement';
   put name= performance=;
   datalines;
John 74
Sam 89
Kati 100
Maria 92
;
run;
Output 4.50 Partial SAS Log: IF/THEN/ELSE Construct
name=John performance=Needs Improvement
name=Sam performance=Pass
name=Kati performance=Pass
name=Maria performance=Pass
See Also Functions: “IFN Function” on page 787
IFN Function Returns a numeric value based on whether an expression is true, false, or missing. Category: Numeric Restriction: “I18N Level 2” on page 306
Syntax IFN(logical-expression, value-returned-when-true, value-returned-when-false <,value-returned-when-missing>)
Arguments
logical-expression
specifies a numeric constant, variable, or expression.
value-returned-when-true
specifies a numeric constant, variable, or expression that is returned when the value of logical-expression is true.
value-returned-when-false
specifies a numeric constant, variable, or expression that is returned when the value of logical-expression is false.
value-returned-when-missing
specifies a numeric constant, variable, or expression that is returned when the value of logical-expression is missing.
Details The IFN function uses conditional logic that enables you to select among several values based on the value of a logical expression. IFN evaluates the first argument, logical-expression. If logical-expression is true (that is, not zero and not missing), then IFN returns the value in the second argument. If logical-expression is a missing value, and you have a fourth argument, then IFN returns the value in the fourth argument. Otherwise, if logical-expression is false, IFN returns the value in the third argument. The IFN function, an IF/THEN/ELSE construct, or a WHERE statement can produce the same results (see “Examples” on page 788). However, the IFN function is useful in DATA step expressions when it is not convenient or possible to use an IF/THEN/ELSE construct or a WHERE statement.
Comparisons The IFN function is similar to the IFC function, except that IFN returns a numeric value whereas IFC returns a character value.
Examples Example 1: Calculating Sales Commission
The following examples show how to calculate sales commission using the IFN function, an IF/THEN/ELSE construct, and a WHERE statement. In each of the examples, the commission that is calculated is the same.
Calculating Commission Using the IFN Function
In the following example, IFN evaluates the expression TotalSales > 10000. If total sales exceeds $10,000, then the sales commission is 5% of the total sales. If total sales is $10,000 or less, then the sales commission is 2% of the total sales.
data _null_;
   input TotalSales;
   commission=ifn(TotalSales > 10000, TotalSales*.05, TotalSales*.02);
   put commission=;
   datalines;
25000
10000
500
10300
;
run;
SAS writes the following output to the log:
commission=1250
commission=200
commission=10
commission=515
Calculating Commission Using an IF/THEN/ELSE Construct In the following example, an IF/THEN/ELSE construct evaluates the expression TotalSales > 10000. If total sales exceeds $10,000, then the sales commission is 5% of the total sales. If total sales is $10,000 or less, then the sales commission is 2% of the total sales.
data _null_;
   input TotalSales;
   if TotalSales > 10000 then commission = .05 * TotalSales;
   else commission = .02 * TotalSales;
   put commission=;
   datalines;
25000
10000
500
10300
;
run;
SAS writes the following output to the log:
commission=1250
commission=200
commission=10
commission=515
Calculating Commission Using a WHERE Statement
In the following example, a WHERE statement selects only the observations for which TotalSales > 10000, and the sales commission for those observations is 5% of the total sales. Observations with total sales of $10,000 or less are not read, so no 2% commission is calculated. The output shows only those salespeople whose total sales exceed $10,000.
options pageno=1 nodate ls=80 ps=64;
data sales;
   input SalesPerson $ TotalSales;
   datalines;
Michaels 25000
Janowski 10000
Chen 500
Gupta 10300
;
data commission;
   set sales;
   where TotalSales > 10000;
   commission = TotalSales * .05;
run;
proc print data=commission;
   title 'Commission for Total Sales > 10000';
run;
Output 4.51 Output from a WHERE Statement
Commission for Total Sales > 10000
Obs | SalesPerson | TotalSales | commission
1 | Michaels | 25000 | 1250
2 | Gupta | 10300 | 515
See Also Functions: “IFC Function” on page 785
INDEX Function Searches a character expression for a string of characters, and returns the position of the string’s first character for the first occurrence of the string. Category: Character Restriction: “I18N Level 0” on page 305 Tip: DBCS equivalent function is KINDEX in SAS National Language Support (NLS): Reference Guide. See “DBCS Compatibility” on page 791.
Syntax INDEX(source,excerpt)
Arguments
source
specifies a character constant, variable, or expression to search.
excerpt
is a character constant, variable, or expression that specifies the string of characters to search for in source.
Tip: Enclose a literal string of characters in quotation marks.
Tip: Both leading and trailing spaces are considered part of the excerpt argument. To remove trailing spaces, include the TRIM function with the excerpt variable inside the INDEX function.
Details The Basics
The INDEX function searches source, from left to right, for the first occurrence of the string specified in excerpt, and returns the position in source of the string’s first character. If the string is not found in source, INDEX returns a value of 0. If there are multiple occurrences of the string, INDEX returns only the position of the first occurrence.
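A short sketch of the two cases described here, a repeated match and no match. The string 'banana' and the variable names are illustrative.

```sas
data _null_;
   first    = index('banana', 'an');   /* 'an' occurs at positions 2 and 4 */
   notfound = index('banana', 'xy');   /* no occurrence in the source */
   put first= notfound=;
run;
```

Only the first occurrence is reported, so the log should show first=2 and notfound=0.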
DBCS Compatibility The DBCS equivalent function is KINDEX, which is documented in SAS National Language Support (NLS): Reference Guide. However, there is a minor difference in the way trailing blanks are handled. In KINDEX, multiple blanks in the second argument match a single blank in the first argument. The following example shows the differences between the two functions:
index('ABC,DE F(X=Y)','   ') => 0
kindex('ABC,DE F(X=Y)','   ') => 7
Examples Example 1: Finding the Position of a Variable in the Source String
The following example finds the first position of the excerpt argument in source.
data _null_;
   a = 'ABC.DEF(X=Y)';
   b = 'X=Y';
   x = index(a,b);
   put x=;
run;
SAS writes the following output to the log: x=9
Example 2: Removing Trailing Spaces When You Use the INDEX Function with the TRIM Function The following example shows the results when you use the INDEX function with and without the TRIM function. If you use INDEX without the TRIM function, leading and trailing spaces are considered part of the excerpt argument. If you use INDEX with the TRIM function, TRIM removes trailing spaces from the excerpt argument as you can see in this example. Note that the TRIM function is used inside the INDEX function.
options nodate nostimer ls=78 ps=60;
data _null_;
   length a b $14;
   a='ABC.DEF (X=Y)';
   b='X=Y';
   q=index(a,b);
   w=index(a,trim(b));
   put q= w=;
run;
SAS writes the following output to the log: q=0 w=10
See Also Functions: “FIND Function” on page 710 “INDEXC Function” on page 792 “INDEXW Function” on page 793
INDEXC Function Searches a character expression for any of the specified characters, and returns the position of that character. Category: Character Restriction: “I18N Level 0” on page 305 Tip: DBCS equivalent function is KINDEXC in SAS National Language Support (NLS): Reference Guide.
Syntax INDEXC(source,excerpt-1< ,… excerpt-n>)
Arguments
source
specifies a character constant, variable, or expression to search.
excerpt
specifies the character constant, variable, or expression to search for in source. Tip: If you specify more than one excerpt, separate them with a comma.
Details The INDEXC function searches source, from left to right, for the first occurrence of any character present in the excerpts and returns the position in source of that character. If none of the characters in excerpt-1 through excerpt-n are found in source, INDEXC returns a value of 0.
Comparisons The INDEXC function searches for the first occurrence of any individual character that is present within the character string, whereas the INDEX function searches for the first occurrence of the character string as a substring. The FINDC function provides more options.
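The difference between the two functions can be seen by giving both the same arguments. This is a minimal sketch with illustrative variable names.

```sas
data _null_;
   s = 'have a good day';
   as_string = index(s, 'day');    /* looks for the substring 'day' */
   as_chars  = indexc(s, 'day');   /* looks for any of d, a, or y */
   put as_string= as_chars=;
run;
```

INDEX should return 13, where the word day begins, while INDEXC should return 2, the position of the first a.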
Examples
SAS Statements | Results
a='ABC.DEP (X2=Y1)'; x=indexc(a,'0123',';()=.'); put x; | 4
b='have a good day'; x=indexc(b,'pleasant','very'); put x; | 2
See Also Functions: “FINDC Function” on page 712 “INDEX Function” on page 790 “INDEXW Function” on page 793
INDEXW Function Searches a character expression for a string that is specified as a word, and returns the position of the first character in the word. Category: Character Restriction: “I18N Level 0” on page 305
Syntax INDEXW(source, excerpt<,delimiter>)
Arguments
source
specifies a character constant, variable, or expression to search.
excerpt
specifies a character constant, variable, or expression to search for in source. SAS removes leading and trailing delimiters from excerpt.
delimiter
specifies a character constant, variable, or expression containing the characters that you want INDEXW to use as delimiters in the character string. The default delimiter is the blank character.
Details The INDEXW function searches source, from left to right, for the first occurrence of excerpt and returns the position in source of the substring’s first character. If the substring is not found in source, then INDEXW returns a value of 0. If there are multiple occurrences of the string, then INDEXW returns only the position of the first occurrence. The substring pattern must begin and end on a word boundary. For INDEXW, word boundaries are delimiters, the beginning of source, and the end of source. If you use an alternate delimiter, then INDEXW does not recognize the end of the text as the end data. INDEXW has the following behavior when the second argument contains blank spaces or has a length of 0:
• If both source and excerpt contain only blank spaces or have a length of 0, then INDEXW returns a value of 1.
• If excerpt contains only blank spaces or has a length of 0, and source contains character or numeric data, then INDEXW returns a value of 0.
Comparisons The INDEXW function searches for strings that are words, whereas the INDEX function searches for patterns as separate words or as parts of other words. INDEXC searches for any characters that are present in the excerpts. The FINDW function provides more options.
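A minimal sketch of the word-boundary difference, using a string in which the excerpt appears both inside another word and as a word of its own. Variable names are illustrative.

```sas
data _null_;
   s = 'asdf adog dog';
   anywhere = index(s, 'dog');     /* also matches inside 'adog' */
   as_word  = indexw(s, 'dog');    /* must begin and end on a word boundary */
   put anywhere= as_word=;
run;
```

INDEX should return 7, the match inside adog, and INDEXW should return 11, the start of the standalone word.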
Examples Example 1: Table of SAS Examples
The following SAS statements give these results.
SAS Statements | Results
s='asdf adog dog'; p='dog '; x=indexw(s,p); put x; | 11
s='abcdef x=y'; p='def'; x=indexw(s,p); put x; | 0
x="abc,def@ xyz"; abc=indexw(x, " abc ", "@"); put abc; | 0
x="abc,def@ xyz"; comma=indexw(x, ",", "@"); put comma; | 0
x='abc,def% xyz'; def=indexw(x, 'def', '%,'); put def; | 5
x="abc,def@ xyz"; at=indexw(x, "@", "@"); put at; | 0
x="abc,def@ xyz"; xyz=indexw(x, " xyz", "@"); put xyz; | 9
c=indexw(trimn(' '), '      '); | 1
g=indexw('    x    y    ', trimn(' ')); | 0
Example 2: Using a Semicolon (;) As the Delimiter The following example shows how to use the semicolon delimiter in a SAS program that also calls the CATX function. A semicolon delimiter must be in place after each call to CATX, and the second argument in the INDEXW function must be trimmed or searches will not be successful.
data temp;
   infile datalines;
   input name $12.;
   datalines;
abcdef
abcdef
;
run;
data temp2;
   set temp;
   format name_list $1024.;
   retain name_list ' ';
   exists=indexw(name_list, trim(name), ';');
   if exists=0 then do;
      name_list=catx(';', name_list, name)||';' ;
      name_count +1;
      put '-------------------------------';
      put exists= ;
      put name_list= ;
      put name_count= ;
   end;
run;
Output 4.52 Output from Using a Semicolon As the Delimiter
-------------------------------
exists=0
name_list=abcdef;
name_count=1
In this example, the first time that CATX is called name_list is blank and the value of name is ’abcdef’. CATX returns ’abcdef’ with no semicolon appended. However, when INDEXW is called the second time, the value of name_list is ’abcdef’ followed by 1018 (1024–6) blanks, and the value of name is ’abcdef’ followed by six blanks. Because the third argument in INDEXW is a semicolon (;), the blanks are significant and do not denote a word boundary. Therefore, the second argument cannot be found in the first argument.
If the example has no blanks, the behavior of INDEXW is easier to understand. In the following example, we expect the value of x to be 0 because the complete word ABCDE was not found in the first argument: x = indexw(’ABCDEF;XYZ’, ’ABCDE’, ’;’);
The only values for the second argument that would return a nonzero result are ABCDEF and XYZ.
Example 3: Using a Space As the Delimiter The following example uses a space as a delimiter:
data temp;
   infile datalines;
   input name $12.;
   datalines;
abcdef
abcdef
;
run;
data temp2;
   set temp;
   format name_list $1024.;
   retain name_list ' ';
   exists=indexw(name_list, name, ' ');
   if exists=0 then do;
      name_list=catx(' ', name_list, name) ;
      name_count +1;
      put '-------------------------------';
      put exists= ;
      put name_list= ;
      put name_count= ;
   end;
run;
Output 4.53 Output from Using a Space as the Delimiter
-------------------------------
exists=0
name_list=abcdef
name_count=1
See Also Functions: “FINDW Function” on page 719 “INDEX Function” on page 790 “INDEXC Function” on page 792
INPUT Function Returns the value that is produced when SAS converts an expression using the specified informat. Category: Special
Syntax INPUT(source, <? | ??> informat.)
Arguments
source
specifies a character constant, variable, or expression to which you want to apply a specific informat.
? or ??
specifies the optional question mark (?) and double question mark (??) modifiers that suppress the printing of both the error messages and the input lines when invalid data values are read. The ? modifier suppresses the invalid data message. The ?? modifier also suppresses the invalid data message and, in addition, prevents the automatic variable _ERROR_ from being set to 1 when invalid data are read.
informat.
is the SAS informat that you want to apply to the source. This argument must be the name of an informat followed by a period, and cannot be a character constant, variable, or expression.
Details If the INPUT function returns a character value to a variable that has not yet been assigned a length, by default the variable length is determined by the width of the informat. The INPUT function enables you to convert the value of source by using a specified informat. The informat determines whether the result is numeric or character. Use INPUT to convert character values to numeric values or other character values.
Comparisons The INPUT function returns the value produced when a SAS expression is converted using a specified informat. You must use an assignment statement to store that value in a variable. The INPUT statement uses an informat to read a data value and then optionally stores that value in a variable. The INPUT function requires the informat to be specified as a name followed by a period and optional decimal specification. The INPUTC and INPUTN functions allow the informat to be specified as a character constant, variable, or expression.
Examples Example 1: Converting Character Values to Numeric Values
This example uses the INPUT function to convert a character value to a numeric value and store it in another variable. The COMMA9. informat reads the value of the SALE variable, stripping the commas. The resulting value, 2115353, is stored in FMTSALE.
data testin;
   input sale $9.;
   fmtsale=input(sale,comma9.);
   datalines;
2,115,353
;
Example 2: Using PUT and INPUT Functions
In this example, PUT returns a numeric value as a character string. The value 122591 is assigned to the CHARDATE variable. INPUT returns the value of the character string as a SAS date value using a SAS date informat. The value 11681 is stored in the SASDATE variable.
numdate=122591;
chardate=put(numdate,z6.);
sasdate=input(chardate,mmddyy6.);
Example 3: Suppressing Error Messages
In this example, the question mark (?) modifier tells SAS not to print the invalid data error message if it finds data errors. The automatic variable _ERROR_ is set to 1 and input data lines are written to the SAS log. y=input(x,? 3.1);
Because the double question mark (??) modifier suppresses printing of error messages and input lines and prevents the automatic variable _ERROR_ from being set to 1 when invalid data are read, the following two examples produce the same result:
• y=input(x,?? 2.);
• y=input(x,? 2.); _error_=0;
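The effect of the ?? modifier can be seen by converting one valid and one invalid string in the same step. This is a sketch only; the literal values and the variable names good and bad are illustrative.

```sas
data _null_;
   good = input('1,234', ?? comma9.);   /* valid numeric text */
   bad  = input('abc',   ?? comma9.);   /* invalid: result is a missing value */
   put good= bad= _error_=;
run;
```

With ??, bad is expected to be missing, no invalid data note should be written to the log, and _ERROR_ should remain 0.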
See Also Functions: “INPUTC Function” on page 799 “INPUTN Function” on page 801 “PUT Function” on page 1021 “PUTC Function” on page 1023 “PUTN Function” on page 1025 Statements: “INPUT Statement” on page 1567
INPUTC Function Enables you to specify a character informat at run time. Category: Special
Syntax INPUTC(source, informat< ,w>)
Arguments
source
specifies a character constant, variable, or expression to which you want to apply the informat.
informat
is a character constant, variable, or expression that contains the character informat you want to apply to source.
w
is a numeric constant, variable, or expression that specifies a width to apply to the informat.
Interaction: If you specify a width here, it overrides any width specification in the informat.
Details If the INPUTC function returns a value to a variable that has not yet been assigned a length, by default the variable length is determined by the length of the first argument.
Comparisons The INPUTN function enables you to specify a numeric informat at run time. Using the INPUT function is faster because you specify the informat at compile time.
Examples Example 1: Specifying Character Informats
The PROC FORMAT step in this example creates a format, TYPEFMT., that formats the variable values 1, 2, and 3 with the name of one of the three informats that this step also creates. The informats store responses of "positive," "negative," and "neutral" as different words, depending on the type of question. After PROC FORMAT creates the format and informats, the DATA step creates a SAS data set from raw data consisting of a number identifying the type of question and a response. After reading a record, the DATA step uses the value of TYPE to create a variable, RESPINFORMAT, that contains the value of the appropriate informat for the current type of question. The DATA step also creates another variable, WORD, whose value is the appropriate word for a response. The INPUTC function assigns the value of WORD based on the type of question and the appropriate informat.
proc format;
   value typefmt 1='$groupx'
                 2='$groupy'
                 3='$groupz';
   invalue $groupx 'positive'='agree'
                   'negative'='disagree'
                   'neutral'='notsure';
   invalue $groupy 'positive'='accept'
                   'negative'='reject'
                   'neutral'='possible';
   invalue $groupz 'positive'='pass'
                   'negative'='fail'
                   'neutral'='retest';
run;
data answers;
   input type response $;
   respinformat = put(type, typefmt.);
   word = inputc(response, respinformat);
   datalines;
1 positive
1 negative
1 neutral
2 positive
2 negative
2 neutral
3 positive
3 negative
3 neutral
;
The value of WORD for the first observation is agree. The value of WORD for the last observation is retest.
See Also Functions: “INPUT Function” on page 797 “INPUTN Function” on page 801 “PUT Function” on page 1021 “PUTC Function” on page 1023 “PUTN Function” on page 1025
INPUTN Function Enables you to specify a numeric informat at run time. Category: Special
Syntax INPUTN(source, informat<,w<,d>>)
Arguments
source
specifies a character constant, variable, or expression to which you want to apply the informat.
informat
is a character constant, variable, or expression that contains the numeric informat you want to apply to source.
w
is a numeric constant, variable, or expression that specifies a width to apply to the informat.
Interaction: If you specify a width here, it overrides any width specification in the informat.
d
is a numeric constant, variable, or expression that specifies the number of decimal places to use.
Interaction: If you specify a number here, it overrides any decimal-place specification in the informat.
Comparisons The INPUTC function enables you to specify a character informat at run time. Using the INPUT function is faster because you specify the informat at compile time.
Examples Example 1: Specifying Numeric Informats
The PROC FORMAT step in this example creates a format, READDATE., that formats the variable values 1 and 2 with the name of a SAS date informat. The DATA step creates a SAS data set from raw data originally from two different sources (indicated by the value of the variable SOURCE). Each source specified dates differently. After reading a record, the DATA step uses the value of SOURCE to create a variable, DATEINFORMAT, that contains the value of the appropriate informat for reading the date. The DATA step also creates a new variable, NEWDATE, whose value is a SAS date. The INPUTN function assigns the value of NEWDATE based on the source of the observation and the appropriate informat.
proc format;
   value readdate 1='date7.'
                  2='mmddyy8.';
run;
options yearcutoff=1920;
data fixdates (drop=start dateinformat);
   length jobdesc $12;
   input source id lname $ jobdesc $ start $;
   dateinformat=put(source, readdate.);
   newdate = inputn(start, dateinformat);
   datalines;
1 1604 Ziminski writer 09aug90
1 2010 Clavell editor 26jan95
2 1833 Rivera writer 10/25/92
2 2222 Barnes proofreader 3/26/98
;
See Also
Functions:
“INPUT Function” on page 797
“INPUTC Function” on page 799
“PUT Function” on page 1021
“PUTC Function” on page 1023
“PUTN Function” on page 1025
INT Function Returns the integer value, fuzzed to avoid unexpected floating-point results. Category: Truncation
Syntax INT(argument)
Arguments
argument
specifies a numeric constant, variable, or expression.
Details The INT function returns the integer portion of the argument (truncates the decimal portion). If the argument’s value is within 1E-12 of an integer, the function results in that integer. If the value of argument is positive, the INT function has the same result as the FLOOR function. If the value of argument is negative, the INT function has the same result as the CEIL function.
Comparisons Unlike the INTZ function, the INT function fuzzes the result. If the argument is within 1E-12 of an integer, the INT function fuzzes the result to be equal to that integer. The INTZ function does not fuzz the result. Therefore, with the INTZ function you might get unexpected results.
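The difference between fuzzed and unfuzzed truncation can be sketched with a value that lies just below an integer. This is a minimal illustration, not output captured from a SAS session: the variable names are made up for the example, and the values in the comments are what the rules stated above predict.

```sas
data _null_;
   x = 3 - 1e-13;          /* within 1E-12 of the integer 3 */
   int_x  = int(x);        /* fuzzed: expected to return 3 */
   intz_x = intz(x);       /* not fuzzed: expected to return 2 */

   /* For a negative argument, INT agrees with CEIL rather than FLOOR */
   neg       = -2.4;
   int_neg   = int(neg);   /* -2, as in the examples for this function */
   ceil_neg  = ceil(neg);  /* -2 */
   floor_neg = floor(neg); /* -3 */

   put int_x= intz_x= int_neg= ceil_neg= floor_neg=;
run;
```

Choosing INTZ over INT is mainly a question of whether a value such as 2.9999999999999 should be treated as 3 or truncated to 2.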
Examples
The following SAS statements produce these results.

SAS Statements                    Results
var1=2.1; x=int(var1); put x;     2
var2=-2.4; y=int(var2); put y;    -2
a=int(1+1.e-11); put a;           1
b=int(-1.6); put b;               -1
See Also
Functions:
“CEIL Function” on page 555
“FLOOR Function” on page 731
“INTZ Function” on page 835
INTCINDEX Function
Returns the cycle index when a date, time, or datetime interval and value are specified.
Category: Date and Time
Syntax
INTCINDEX(interval<multiple><.shift-index>, date-time-value)
Arguments
interval
specifies a character constant, a variable, or an expression that contains an interval name such as WEEK, MONTH, or QTR. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
Tip: If interval is a character constant, then enclose the value in quotation marks.
Requirement: Valid values for interval depend on whether date-time-value is a date, time, or datetime value.
Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
interval<multiple><.shift-index>
The three parts of the interval name are as follows:
interval
specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
multiple
specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, the interval YEAR2 consists of two-year, or biennial, periods.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
shift-index
specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods shifted to start on the first of March of each calendar year and to end in February of the following year.
Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 would be an error because there is no 25th month in a two-year interval.
Restriction: If the default shift period is the same as the interval, then only multiperiod intervals can be shifted with the optional shift index. For example, because MONTH intervals shift by MONTH periods by default, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index, because there are two MONTH intervals in each MONTH2 interval. For example, the interval name MONTH2.2 specifies bimonthly periods starting on the first day of even-numbered months.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
date-time-value
specifies a date, time, or datetime value that represents a time period of a specified interval.
Details The INTCINDEX function returns the index of the seasonal cycle when you specify an interval and a SAS date, time, or datetime value. For example, if the interval is MONTH, each observation in the data corresponds to a particular month. Monthly data is considered to be periodic for a one-year period. A year contains 12 months, so the number of intervals (months) in a seasonal cycle (year) is 12. WEEK is the seasonal cycle for an interval that is equal to DAY. Therefore, intcindex(’day’,’01SEP78’d); returns a value of 35 because September 1, 1978, is the sixth day of the 35th week of the year. For more information about working with date and time intervals, see “Date and Time Intervals” on page 319. The INTCINDEX function can also be used with calendar intervals from the retail industry. These intervals are ISO 8601 compliant. For a list of these intervals, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Comparisons The INTCINDEX function returns the cycle index, whereas the INTINDEX function returns the seasonal index. In the example cycle_index = intcindex(’day’,’04APR2005’d);, the INTCINDEX function returns the week of the year. In the example index = intindex(’day’,’04APR2005’d);, the INTINDEX function returns the day of the week. In the example cycle_index = intcindex(’minute’,’01Sep78:00:00:00’dt);, the INTCINDEX function returns the hour of the day. In the example index = intindex(’minute’,’01Sep78:00:00:00’dt);, the INTINDEX function returns the minute of the hour. In the example intseas(intcycle(’interval’));, the INTSEAS function returns the maximum number that could be returned by intcindex(’interval’,date);.
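The relationship among the four functions named above can be sketched in one DATA step. The variable names are illustrative only, and no numeric results are asserted here beyond what the text above states (INTINDEX gives the day of the week and INTCINDEX gives the week of the year for a DAY interval).

```sas
data _null_;
   length cycle $20;
   d = '04APR2005'd;

   cycle    = intcycle('day');            /* seasonal cycle of DAY; WEEK per the text */
   seasonal = intindex('day', d);         /* position of d within its cycle: day of the week */
   cyc_idx  = intcindex('day', d);        /* position of the cycle itself: week of the year */
   max_idx  = intseas(intcycle('day'));   /* largest value INTCINDEX('day', ...) can return */

   put cycle= seasonal= cyc_idx= max_idx=;
run;
```

Reading the log line from left to right gives the cycle name, the index within the cycle, the index of the cycle, and the upper bound on that cycle index.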
Examples
The following SAS statements produce these results:

SAS Statements                                              Results

cycle_index1 = intcindex('day', '01SEP05'd);
put cycle_index1;                                           35

cycle_index2 = intcindex('dtqtr', '23MAY2005:05:03:01'dt);
put cycle_index2;                                           1

cycle_index3 = intcindex('tenday', '13DEC2005'd);
put cycle_index3;                                           1

cycle_index4 = intcindex('minute', '23:13:02't);
put cycle_index4;                                           24

var1 = 'semimonth';
cycle_index5 = intcindex(var1, '05MAY2005:10:54:03'dt);
put cycle_index5;                                           1
See Also
Functions:
“INTINDEX Function” on page 819
“INTCYCLE Function” on page 810
“INTSEAS Function” on page 829
INTCK Function
Returns the count of the number of interval boundaries between two dates, two times, or two datetime values.
Category: Date and Time
Syntax
INTCK(interval<multiple><.shift-index>, start-from, increment<, 'alignment'>)
INTCK(custom-interval, start-from, increment<, 'alignment'>)
Arguments
interval
specifies a character constant, a variable, or an expression that contains an interval name. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
Requirement: The type of interval (date, datetime, or time) must match the type of value in start-from and increment.
Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
interval<multiple><.shift-index>
The three parts of the interval name are as follows:
interval
specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
multiple
specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, the interval YEAR2 consists of two-year, or biennial, periods.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
shift-index
specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods shifted to start on the first of March of each calendar year and to end in February of the following year.
Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 would be an error because there is no 25th month in a two-year interval.
Restriction: If the default shift period is the same as the interval type, then only multiperiod intervals can be shifted with the optional shift index. For example, MONTH type intervals shift by MONTH subperiods by default; thus, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index, because there are two MONTH intervals in each MONTH2 interval. The interval name MONTH2.2, for example, specifies bimonthly periods starting on the first day of even-numbered months.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
custom-interval
specifies a user-defined interval that is defined by a SAS data set. Each observation contains two variables, begin and end.
See: “Details” on page 808 for more information about custom intervals.
Requirement: You must use the INTERVALDS system option if you use the custom-interval variable.
start-from
specifies a SAS expression that represents the starting SAS date, time, or datetime value.
increment
specifies a SAS expression that represents the ending SAS date, time, or datetime value.
'alignment'
controls the position of SAS dates within the interval. You must enclose alignment in quotation marks. Alignment can be one of these values:
CONTINUOUS
specifies that continuous time is measured (the interval is shifted based on the starting date).
Alias: C or CONT
DISCRETE
specifies that discrete time is measured.
Alias: D or DISC
Details
Time Series Analysis: The Basics
Time series analysis uses time intervals to analyze events. All values within the interval are interpreted as being equivalent. This means that the dates of January 1, 2005 and January 15, 2005 are equivalent when you specify a monthly interval. Both of these dates represent the interval that begins on January 1, 2005 and ends on January 31, 2005. You can use the date for the beginning of the interval (January 1, 2005) or the date for the end of the interval (January 31, 2005) to identify the interval. These dates represent all of the dates within the monthly interval.
In the example intck('qtr','14JAN2005'd,'02SEP2005'd);, the start-from argument ('14JAN2005'd) is equivalent to the first quarter of 2005. The increment argument ('02SEP2005'd) is equivalent to the third quarter of 2005. The interval count, that is, the number of times the beginning of an interval is reached in moving from the start-from argument to the increment argument, is 2.
WEEK intervals are determined by the number of Sundays that occur between the start-from argument and the increment argument, and not by how many seven-day periods fall between the start-from argument and the increment argument.
Both the multiple and the shift-index arguments are optional and default to 1. For example, YEAR, YEAR1, YEAR.1, and YEAR1.1 are all equivalent ways of specifying ordinary calendar years.
For more information about working with date and time intervals, see “Date and Time Intervals” on page 319.
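Boundary counting can be sketched as follows. The quarter example is the one discussed above; the week and year dates are chosen for illustration (January 4, 2009, is a Sunday), and the values in the comments are what the counting rules predict rather than captured log output.

```sas
data _null_;
   /* Two quarter boundaries (01APR2005 and 01JUL2005) are crossed */
   q  = intck('qtr', '14JAN2005'd, '02SEP2005'd);                  /* 2 */

   /* WEEK counts Sundays reached, not seven-day spans */
   w1 = intck('week', '03JAN2009'd, '04JAN2009'd);                 /* Saturday to Sunday: 1 */
   w2 = intck('week', '04JAN2009'd, '10JAN2009'd);                 /* Sunday to Saturday: 0 */

   /* DISCRETE counts the January 1 boundary; CONTINUOUS measures
      whole years elapsed from the starting date */
   yd = intck('year', '31DEC1994'd, '01JAN1995'd, 'discrete');     /* 1 */
   yc = intck('year', '31DEC1994'd, '01JAN1995'd, 'continuous');   /* 0 */

   put q= w1= w2= yd= yc=;
run;
```

The last pair shows why the alignment argument matters when the goal is elapsed time rather than the number of period boundaries crossed.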
Custom Intervals
A custom interval is defined by a SAS data set. The data set must contain two variables, begin and end. Each observation represents one interval with the begin variable containing the start of the interval, and the end variable containing the end of the interval. The intervals must be listed in ascending order. You cannot have gaps between intervals, and intervals cannot overlap.
The SAS system option INTERVALDS is used to define custom intervals and associate interval data sets with new interval names. The following example shows how to specify the INTERVALDS system option:
options intervalds=(interval=libref.dataset-name);
where
interval
specifies the name of an interval. The value of interval is the data set that is named in libref.dataset-name.
libref.dataset-name
specifies the libref and data set name of the file that contains user-supplied holidays.
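The steps for a custom interval can be sketched as follows. This is an assumption-laden illustration: the interval name FISCMON, the data set WORK.FISCMON_DS, and the period boundaries are all hypothetical, and the count that INTCK returns is not asserted here.

```sas
/* Hypothetical interval data set: consecutive periods in ascending
   order, with no gaps and no overlaps, in variables BEGIN and END */
data work.fiscmon_ds;
   format begin end date9.;
   begin='29DEC2008'd; end='01FEB2009'd; output;
   begin='02FEB2009'd; end='01MAR2009'd; output;
   begin='02MAR2009'd; end='05APR2009'd; output;
run;

/* Associate the hypothetical interval name FISCMON with the data set */
options intervalds=(fiscmon=work.fiscmon_ds);

data _null_;
   /* Count FISCMON boundaries between two dates covered by the data set */
   n = intck('fiscmon', '15JAN2009'd, '20MAR2009'd);
   put n=;
run;
```

Both dates passed to INTCK fall inside the range covered by the data set; dates outside that range have no interval defined for them.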
Retail Calendar Intervals The retail industry often accounts for its data by dividing the yearly calendar into four 13-week periods, based on one of the following formats: 4-4-5, 4-5-4, or 5-4-4. The first, second, and third numbers specify the number of weeks in the first, second, and third month of each period, respectively. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Examples
The following SAS statements produce these results:

SAS Statements                                    Results

qtr=intck('qtr','10jan95'd,'01jul95'd);
put qtr;                                          2

year=intck('year','31dec94'd, '01jan95'd);
put year;                                         1

year=intck('year','01jan94'd, '31dec94'd);
put year;                                         0

semi=intck('semiyear','01jan95'd, '01jan98'd);
put semi;                                         6

weekvar=intck('week2.2','01jan97'd, '31mar97'd);
put weekvar;                                      6

wdvar=intck('weekday7w','01jan97'd, '01feb97'd);
put wdvar;                                        26

y='year';
date1='1sep1991'd;
date2='1sep2001'd;
newyears=intck(y,date1,date2);
put newyears;                                     10

y=trim('year ');
date1='1sep1991'd + 300;
date2='1sep2001'd - 300;
newyears=intck(y,date1,date2);
put newyears;                                     8
In the second example, INTCK returns a value of 1 even though only one day has elapsed. This result is because the interval from December 31, 1994, to January 1, 1995, contains the starting point for the YEAR interval. However, in the third example, a value of 0 is returned even though 364 days have elapsed. This result is because the period between January 1, 1994, and December 31, 1994, does not contain the starting point for the interval.
In the fourth example, SAS returns a value of 6 because January 1, 1995, through January 1, 1998, contains six semiyearly intervals. (Note that if the ending date were December 31, 1997, SAS would count five intervals.)
In the fifth example, SAS returns a value of 6 because there are six two-week intervals beginning on a first Monday during the period of January 1, 1997, through March 31, 1997.
In the sixth example, SAS returns the value 26. That indicates that beginning with January 1, 1997, and counting only Saturdays as weekend days through February 1, 1997, the period contains 26 weekdays.
The seventh example illustrates the use of variables for the arguments. The last example illustrates the use of expressions for the arguments.
See Also
Functions:
“INTNX Function” on page 822
System Options:
“INTERVALDS= System Option” on page 1872
INTCYCLE Function
Returns the date, time, or datetime interval at the next higher seasonal cycle when a date, time, or datetime interval is specified.
Category: Date and Time
Syntax
INTCYCLE(interval<multiple><.shift-index>)
Arguments
interval
specifies a character constant, a variable, or an expression that contains an interval name such as WEEK, MONTH, or QTR. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
interval<multiple><.shift-index>
The three parts of the interval name are as follows:
interval
specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
multiple
specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, the interval YEAR2 consists of two-year, or biennial, periods.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
shift-index
specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods shifted to start on the first of March of each calendar year and to end in February of the following year.
Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 would be an error because there is no 25th month in a two-year interval.
Restriction: If the default shift period is the same as the interval type, then only multiperiod intervals can be shifted with the optional shift index. For example, because MONTH type intervals shift by MONTH subperiods by default, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index, because there are two MONTH intervals in each MONTH2 interval. For example, the interval name MONTH2.2 specifies bimonthly periods starting on the first day of even-numbered months.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
Details The INTCYCLE function returns the interval of the seasonal cycle, depending on a date, time, or datetime interval. For example, INTCYCLE(’MONTH’); returns the value YEAR because the months from January through December constitute a yearly cycle. INTCYCLE(’DAY’); returns the value WEEK because the days from Sunday through Saturday constitute a weekly cycle. See “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for information about multipliers and shift indexes. See “Commonly Used Time Intervals” on page 320 for information about how intervals are calculated. For more information about working with date and time intervals, see “Date and Time Intervals” on page 319. The INTCYCLE function can also be used with calendar intervals from the retail industry. These intervals are ISO 8601 compliant. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Examples
The following examples produce these results:

SAS Statements                          Results

cycle_year = intcycle('year');
put cycle_year;                         YEAR

cycle_quarter = intcycle('qtr');
put cycle_quarter;                      YEAR

cycle_month = intcycle('month');
put cycle_month;                        YEAR

cycle_day = intcycle('day');
put cycle_day;                          WEEK

var1 = 'second';
cycle_second = intcycle(var1);
put cycle_second;                       DTMINUTE
See Also
Functions:
“INTSEAS Function” on page 829
“INTINDEX Function” on page 819
“INTCINDEX Function” on page 804
INTFIT Function
Returns a time interval that is aligned between two dates.
Category: Date and Time
Syntax
INTFIT(argument-1, argument-2, 'type')
Arguments
argument
specifies a SAS expression that represents a SAS date or datetime value, or an observation.
Tip: Observation numbers are more likely to be used as arguments if date or datetime values are not available.
'type'
specifies whether the arguments are SAS date values, datetime values, or observations. The following values for type are valid:
d
specifies that argument-1 and argument-2 are date values.
dt
specifies that argument-1 and argument-2 are datetime values.
obs
specifies that argument-1 and argument-2 are observations.
Details The INTFIT function returns the most likely time interval based on two dates, datetime values, or observations that have been aligned within an interval. INTFIT assumes that the alignment value is SAME, which specifies that the date is aligned to the same calendar date with the corresponding interval increment. For more information about the alignment argument, see “INTNX Function” on page 822. If the arguments that are used with INTFIT are observations, you can determine the cycle of an occurrence by using observation numbers. In the following example, the first two arguments of INTFIT are observation numbers, and the type argument is obs. If Jason used the gym the first time and the 25th time that a researcher recorded data, you could determine the interval by using the following statement: interval=intfit(1,25,’obs’);. In this case, the value of interval is OBS24.2. For information about time series, see the SAS/ETS User’s Guide. The INTFIT function can also be used with calendar intervals from the retail industry. These intervals are ISO 8601 compliant. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Examples
Example 1: Finding Intervals That Are Aligned between Two Dates
The following example shows the intervals that are aligned between two dates. The type argument in this example identifies the input as date values.
options pageno=1 nodate ls=80 ps=64;
data a;
   length interval $20;
   date1='01jan06'd;
   do i=1 to 25;
      date2=intnx('day', date1, i);
      interval=intfit(date1, date2, 'd');
      output;
   end;
   format date1 date2 date.;
run;
proc print data=a;
run;
Output 4.54 Interval Output from the INTFIT Function

                The SAS System                1

Obs    interval    date1       i    date2
  1    DAY         01JAN06     1    02JAN06
  2    DAY2        01JAN06     2    03JAN06
  3    DAY3.3      01JAN06     3    04JAN06
  4    DAY4.3      01JAN06     4    05JAN06
  5    DAY5.3      01JAN06     5    06JAN06
  6    DAY6.3      01JAN06     6    07JAN06
  7    WEEK        01JAN06     7    08JAN06
  8    DAY8.3      01JAN06     8    09JAN06
  9    DAY9.9      01JAN06     9    10JAN06
 10    TENDAY      01JAN06    10    11JAN06
 11    DAY11.6     01JAN06    11    12JAN06
 12    DAY12.3     01JAN06    12    13JAN06
 13    DAY13.7     01JAN06    13    14JAN06
 14    WEEK2.8     01JAN06    14    15JAN06
 15    SEMIMON     01JAN06    15    16JAN06
 16    DAY16.3     01JAN06    16    17JAN06
 17    DAY17.7     01JAN06    17    18JAN06
 18    DAY18.9     01JAN06    18    19JAN06
 19    DAY19.7     01JAN06    19    20JAN06
 20    TENDAY2     01JAN06    20    21JAN06
 21    WEEK3.8     01JAN06    21    22JAN06
 22    DAY22.17    01JAN06    22    23JAN06
 23    DAY23.13    01JAN06    23    24JAN06
 24    DAY24.3     01JAN06    24    25JAN06
 25    DAY25.3     01JAN06    25    26JAN06
The output shows that if the increment value is one day, then the result of the INTFIT function is DAY. If the increment value is two days, then the result of the INTFIT function is DAY2. If the increment value is three days, then the result is DAY3.3, with a shift index of 3. (If the two input dates are a Friday and a Monday, then the result is WEEKDAY.) If the increment value is seven days, then the result is WEEK.
Example 2: Finding Intervals That Are Aligned between Two Dates When the Dates Are Identified As Observations
The following example shows the intervals that are aligned between two dates. The type argument in this example identifies the input as observations.
options pageno=1 nodate ls=80 ps=64;
data a;
   length interval $20;
   date1='01jan06'd;
   do i=1 to 25;
      date2=intnx('day', date1, i);
      interval=intfit(date1, date2, 'obs');
      output;
   end;
   format date1 date2 date.;
run;
proc print data=a;
run;
Output 4.55 Interval Output from the INTFIT Function When Dates Are Identified as Observations

                The SAS System                1

Obs    interval    date1       i    date2
  1    OBS         01JAN06     1    02JAN06
  2    OBS2        01JAN06     2    03JAN06
  3    OBS3.3      01JAN06     3    04JAN06
  4    OBS4.3      01JAN06     4    05JAN06
  5    OBS5.3      01JAN06     5    06JAN06
  6    OBS6.3      01JAN06     6    07JAN06
  7    OBS7.3      01JAN06     7    08JAN06
  8    OBS8.3      01JAN06     8    09JAN06
  9    OBS9.9      01JAN06     9    10JAN06
 10    OBS10.3     01JAN06    10    11JAN06
 11    OBS11.6     01JAN06    11    12JAN06
 12    OBS12.3     01JAN06    12    13JAN06
 13    OBS13.7     01JAN06    13    14JAN06
 14    OBS14.3     01JAN06    14    15JAN06
 15    OBS15.3     01JAN06    15    16JAN06
 16    OBS16.3     01JAN06    16    17JAN06
 17    OBS17.7     01JAN06    17    18JAN06
 18    OBS18.9     01JAN06    18    19JAN06
 19    OBS19.7     01JAN06    19    20JAN06
 20    OBS20.3     01JAN06    20    21JAN06
 21    OBS21.3     01JAN06    21    22JAN06
 22    OBS22.17    01JAN06    22    23JAN06
 23    OBS23.13    01JAN06    23    24JAN06
 24    OBS24.3     01JAN06    24    25JAN06
 25    OBS25.3     01JAN06    25    26JAN06
See Also
Functions:
“INTNX Function” on page 822
“INTCK Function” on page 806
INTFMT Function
Returns a recommended SAS format when a date, time, or datetime interval is specified.
Category: Date and Time
Syntax
INTFMT(interval<multiple><.shift-index>, 'size')
Arguments
interval
specifies a character constant, a variable, or an expression that contains an interval name such as WEEK, MONTH, or QTR. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
interval<multiple><.shift-index>
The three parts of the interval name are as follows:
interval
specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
multiple
specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, the interval YEAR2 consists of two-year, or biennial, periods.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
shift-index
specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods shifted to start on the first of March of each calendar year and to end in February of the following year.
Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 would be an error because there is no 25th month in a two-year interval.
Restriction: If the default shift period is the same as the interval type, then only multiperiod intervals can be shifted with the optional shift index. For example, because MONTH type intervals shift by MONTH subperiods by default, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index, because there are two MONTH intervals in each MONTH2 interval. For example, the interval name MONTH2.2 specifies bimonthly periods starting on the first day of even-numbered months.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
'size'
specifies either LONG or SHORT. When a format includes a year value, LONG or L specifies a format that uses a four-digit year. SHORT or S specifies a format that uses a two-digit year.
Details The INTFMT function returns a recommended format depending on a date, time, or datetime interval for displaying the time ID values that are associated with a time series of a given interval. The valid values of SIZE (LONG, L, SHORT, or S) specify whether to use a two-digit or a four-digit year when the format refers to a SAS date value. For more information about working with date and time intervals, see “Date and Time Intervals” on page 319. The INTFMT function can also be used with calendar intervals from the retail industry. These intervals are ISO 8601 compliant. For a list of these intervals, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Examples
The following SAS statements produce these results:

SAS Statements                                            Results
fmt1 = intfmt('qtr', 's'); put fmt1;                      YYQC4.
fmt2 = intfmt('qtr', 'l'); put fmt2;                      YYQC6.
fmt3 = intfmt('month', 'l'); put fmt3;                    MONYY7.
fmt4 = intfmt('week', 'short'); put fmt4;                 WEEKDATX15.
fmt5 = intfmt('week3.2', 'l'); put fmt5;                  WEEKDATX17.
fmt6 = intfmt('day', 'long'); put fmt6;                   DATE9.
var1 = 'month2'; fmt7 = intfmt(var1, 'long'); put fmt7;   MONYY7.
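Because INTFMT returns the format name as a character value, one way to use the result is to pass it to the PUTN function, which accepts a numeric format at run time. The following is a minimal sketch; the variable names and the date are illustrative.

```sas
data _null_;
   length fmt $20 txt $20;
   d   = '15MAR2009'd;
   fmt = intfmt('month', 'long');   /* MONYY7., as in the table above */
   txt = putn(d, fmt);              /* applies the returned format: MAR2009 */
   put fmt= txt=;
run;
```

Selecting the format at run time in this way keeps the display of a time ID variable consistent with whatever interval the series uses.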
INTGET Function Returns a time interval based on three date or datetime values. Category: Date and Time
Syntax INTGET(date-1, date-2, date-3)
Arguments
date
specifies a SAS date or datetime value.
Details
INTGET Function Intervals
The INTGET function returns a time interval based on three date or datetime values. The function first determines all possible intervals between the first two dates, and then determines all possible intervals between the second and third dates. If the intervals are the same, INTGET returns that interval.
If the interval between the first and second dates differs from the interval between the second and third dates, INTGET compares the two intervals. If one interval is a multiple of the other, then INTGET returns the smaller of the two intervals. Otherwise, INTGET returns a missing value. INTGET works best with dates generated by the INTNX function whose alignment value is BEGIN.
In the following example, INTGET returns the interval DAY2:
interval=intget('01mar00'd, '03mar00'd, '09mar00'd);
The interval between the first and second dates is DAY2, because the number of days between March 1, 2000, and March 3, 2000, is two. The interval between the second and third dates is DAY6, because the number of days between March 3, 2000, and March 9, 2000, is six. DAY6 is a multiple of DAY2. INTGET returns the smaller of the two intervals.
In the following example, INTGET returns the interval MONTH4:
interval=intget('01jan00'd, '01may00'd, '01may01'd);
The interval between the first two dates is MONTH4, because the number of months between January 1, 2000, and May 1, 2000, is four. The interval between the second and third dates is YEAR. INTGET determines that YEAR is a multiple of MONTH4 (there are three MONTH4 intervals in YEAR), and returns the smaller of the two intervals.
In the following example, INTGET returns a missing value:
interval=intget('01Jan2006'd, '01Apr2006'd, '01Dec2006'd);
The interval between the first two dates is MONTH3, and the interval between the second and third dates is MONTH8. INTGET determines that MONTH8 is not a multiple of MONTH3, and returns a missing value.
The intervals that are returned are valid SAS intervals, including multiples of the intervals and shift intervals. Valid SAS intervals are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
Note: If INTGET cannot determine a matching interval, then the function returns a missing value. No message is written to the SAS log.
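The remark about dates generated by INTNX can be sketched as a round trip: build three consecutive interval starting points with INTNX, then ask INTGET to recover the interval. The MONTH2 interval and the variable names are illustrative choices, and the returned value is not asserted here.

```sas
data _null_;
   length interval $20;
   start = '01JAN2009'd;

   /* INTNX aligns to the beginning of the interval by default */
   d1 = intnx('month2', start, 0);
   d2 = intnx('month2', start, 1);
   d3 = intnx('month2', start, 2);

   /* Equal spacing between d1, d2, and d3 gives INTGET one interval to report */
   interval = intget(d1, d2, d3);
   put d1= date9. d2= date9. d3= date9. interval=;
run;
```

If the three dates were not evenly spaced and neither spacing was a multiple of the other, INTGET would return a missing value without writing a message to the log, so checking the result for blanks is a reasonable precaution.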
Retail Calendar Intervals The INTGET function can also be used with calendar intervals from the retail industry. These intervals are ISO 8601 compliant. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Examples
The following examples produce these results:

SAS Statements                                                      Results

interval=intget('01jan00'd,'01jan01'd,'01may01'd); put interval;    MONTH4

interval=intget('29feb80'd,'28feb82'd,'29feb84'd); put interval;    YEAR2.2

interval=intget('01feb80'd,'16feb80'd,'01mar80'd); put interval;    SEMIMONTH

interval=intget('2jan09'd,'2feb10'd,'2mar11'd); put interval;       MONTH13.4

interval=intget('10feb80'd,'19feb80'd,'28feb80'd); put interval;    DAY9.2

interval=intget('01apr2006:00:01:02'dt,
                '01apr2006:00:02:02'dt,
                '01apr2006:00:03:02'dt);
put interval;                                                       MINUTE
See Also
Functions:
“INTFIT Function” on page 812
“INTNX Function” on page 822
INTINDEX Function
Returns the seasonal index when a date, time, or datetime interval and value are specified.
Category: Date and Time
Syntax
INTINDEX(interval<multiple><.shift-index>, date-value)
Arguments
interval
specifies a character constant, a variable, or an expression that contains an interval name such as WEEK, MONTH, or QTR. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
Tip: If interval is a character constant, then enclose the value in quotation marks.
Requirement: Valid values for interval depend on whether date-value is a date, time, or datetime value. For more information, see “Commonly Used Time Intervals” on page 320.
Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
interval<multiple><.shift-index>
The three parts of the interval name are as follows:
interval
specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
multiple
specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, the interval YEAR2 consists of two-year, or biennial, periods.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
shift-index
specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods shifted to start on the first of March of each calendar year and to end in February of the following year.
Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 would be an error because there is no 25th month in a two-year interval.
Restriction: If the default shift period is the same as the interval type, then only multiperiod intervals can be shifted with the optional shift index. For example, because MONTH type intervals shift by MONTH subperiods by default, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index, because there are two MONTH intervals in each MONTH2 interval. For example, the interval name MONTH2.2 specifies bimonthly periods starting on the first day of even-numbered months.
See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
date-value
specifies a date, time, or datetime value that represents a time period of the given interval.
Details

INTINDEX Function Intervals
The INTINDEX function returns the seasonal index when you supply an interval and an appropriate date, time, or datetime value. The seasonal index is a number that represents the position of the date, time, or datetime value in the seasonal cycle of the specified interval. For example, intindex('month', '01DEC2000'd); returns a value of 12 because there are 12 months in a yearly cycle and December is the 12th month of the year. In the following examples, INTINDEX returns the same value because both statements have values that occur in the first quarter of the year 2000: intindex('qtr', '01JAN2000'd); and intindex('qtr', '31MAR2000'd);. The statement intindex('day', '01DEC2000'd); returns a value of 6 because daily data is weekly periodic and December 1, 2000, is a Friday, the sixth day of the week.

How Interval and Date-Time-Value Are Related
To correctly identify the seasonal index, the interval should agree with the date, time, or datetime value. For example, intindex('month', '01DEC2000'd); returns a value of 12 because there are 12 months in a yearly interval and December is the 12th month of the year. The MONTH interval requires a SAS date value. In the following example, intindex('day', '01DEC2000'd); returns a value of 6 because there are seven days in a weekly interval and December 1, 2000, is a Friday, the sixth day of the week. The DAY interval requires a SAS date value. The example intindex('qtr', '01JAN2000:00:00:00'dt); results in an error because the QTR interval expects the date to be a SAS date value rather than a datetime value. The example intindex('dtmonth', '01DEC2000:00:00:00'dt); returns a value of 12. The DTMONTH interval requires a datetime value. For more information about working with date and time intervals, see “Date and Time Intervals” on page 319.
Retail Calendar Intervals The INTINDEX function can also be used with calendar intervals from the retail industry. These intervals are ISO 8601 compliant. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Comparisons
The INTINDEX function returns the seasonal index whereas the INTCINDEX function returns the cycle index. In the example index = intindex('day', '04APR2005'd);, the INTINDEX function returns the day of the week. In the example cycle_index = intcindex('day', '04APR2005'd);, the INTCINDEX function returns the week of the year. In the example index = intindex('minute','01Sep78:00:00:00'dt);, the INTINDEX function returns the minute of the hour. In the example cycle_index = intcindex('minute','01Sep78:00:00:00'dt);, the INTCINDEX function returns the hour of the day. In the example intseas('interval');, INTSEAS returns the maximum number that could be returned by intindex('interval', date);.
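As a minimal sketch of the relationship described above, the following DATA step evaluates INTINDEX and INTSEAS for the date used in the Details section. The variable names (d, day_idx, month_idx, max_month) are illustrative assumptions and do not come from the original examples.

```sas
data _null_;
   d = '01DEC2000'd;                    /* date quoted in the Details section  */
   day_idx   = intindex('day', d);      /* position within the weekly cycle    */
   month_idx = intindex('month', d);    /* position within the yearly cycle    */
   max_month = intseas('month');        /* largest index possible for MONTH    */
   put day_idx= month_idx= max_month=;
run;
```

Based on the values quoted in the text, day_idx should be 6 and month_idx should be 12, which is also the value that INTSEAS reports for the MONTH interval.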
Examples
The following SAS statements produce these results:

SAS Statements                                                             Results
interval1 = intindex('qtr', '14AUG2005'd); put interval1;                  3
interval2 = intindex('dtqtr','23DEC2005:15:09:19'dt); put interval2;       4
interval3 = intindex('hour', '09:05:15't); put interval3;                  10
interval4 = intindex('month', '26FEB2005'd); put interval4;                2
interval5 = intindex('dtmonth', '28MAY2005:05:15:00'dt); put interval5;    5
interval6 = intindex('week', '09SEP2005'd); put interval6;                 36
interval7 = intindex('tenday', '16APR2005'd); put interval7;               11
See Also Function: “INTCINDEX Function” on page 804
INTNX Function
Increments a date, time, or datetime value by a given time interval, and returns a date, time, or datetime value.
Category: Date and Time

Syntax
INTNX(interval<multiple><.shift-index>, start-from, increment<, 'alignment'>)
INTNX(custom-interval, start-from, increment<, 'alignment'>)

Arguments

interval
   specifies a character constant, variable, or expression that contains a time interval such as WEEK, SEMIYEAR, QTR, or HOUR. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
   Requirement: The type of interval (date, datetime, or time) must match the type of value in start-from and increment.
   See: “Commonly Used Time Intervals” on page 320 for a list of commonly used time intervals.
   Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
      interval<multiple.shift-index>
   Here are the three parts of the interval name:
   interval
      specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
   multiple
      specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, the interval YEAR2 consists of two-year, or biennial, periods.
      See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
   shift-index
      specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods shifted to start on the first of March of each calendar year and to end in February of the following year.
      Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 would be an error because there is no 25th month in a two-year interval.
      Restriction: If the default shift period is the same as the interval type, then only multiperiod intervals can be shifted with the optional shift index. For example, MONTH type intervals shift by MONTH subperiods by default; thus, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index because there are two MONTH intervals in each MONTH2 interval. The interval name MONTH2.2, for example, specifies bimonthly periods starting on the first day of even-numbered months.
      See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
start-from
   specifies a SAS expression that represents a SAS date, time, or datetime value that identifies a starting point.
increment
   specifies a negative, positive, or zero integer that represents the number of date, time, or datetime intervals. Increment is the number of intervals to shift the value of start-from.
'alignment'
   controls the position of SAS dates within the interval. You must enclose alignment in quotation marks. Alignment can be one of these values:
   BEGINNING
      specifies that the returned date or datetime value is aligned to the beginning of the interval.
      Alias: B
   MIDDLE
      specifies that the returned date or datetime value is aligned to the midpoint of the interval, which is the average of the beginning and ending alignment values.
      Alias: M
   END
      specifies that the returned date or datetime value is aligned to the end of the interval.
      Alias: E
   SAME
      specifies that the date that is returned has the same alignment as the input date.
      Alias: S
      Alias: SAMEDAY
      See: “SAME Alignment” on page 824 for more information.
   Default: BEGINNING
   See: “Aligning SAS Date Output within Its Intervals” on page 824 for more information.
Details

The Basics
The INTNX function increments a date, time, or datetime value by intervals such as DAY, WEEK, QTR, and MINUTE, or a custom interval that you define. The increment is based on a starting date, time, or datetime value, and on the number of time intervals that you specify. The INTNX function returns the SAS date value for the beginning date, time, or datetime value of the interval that you specify in the start-from argument. (To convert the SAS date value to a calendar date, use any valid SAS date format, such as the DATE9. format.) The following example shows how to determine the date of the start of the week that is six weeks from the week of October 17, 2003.

   x=intnx('week', '17oct03'd, 6);
   put x date9.;

INTNX returns the value 23NOV2003. For more information about working with date and time intervals, see “Date and Time Intervals” on page 319.
Aligning SAS Date Output within Its Intervals SAS date values are typically aligned with the beginning of the time interval that is specified with the interval argument. You can use the optional alignment argument to specify the alignment of the date that is returned. The values BEGINNING, MIDDLE, or END align the date to the beginning, middle, or end of the interval, respectively.
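As a hedged illustration of the alignment argument, the following DATA step uses an increment of 0 so that the result stays inside the interval that contains the input date. The date and the variable names (d, first, mid, last) are assumptions chosen for this sketch.

```sas
data _null_;
   d = '15feb2000'd;                            /* illustrative input date       */
   first = intnx('month', d, 0, 'beginning');   /* first day of the same month   */
   mid   = intnx('month', d, 0, 'middle');      /* midpoint of the same month    */
   last  = intnx('month', d, 0, 'end');         /* last day of the same month    */
   put first= date9. mid= date9. last= date9.;
run;
```

With an increment of 0, the END alignment is a common way to find the last day of the current month without hard-coding the number of days in each month.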
SAME Alignment
If you use the SAME value of the alignment argument, then INTNX returns the same calendar date after computing the interval increment that you specified. The same calendar date is aligned based on the interval’s shift period, not the interval. To view the valid shift periods, see the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts. Most of the values of the shift period are equal to their corresponding intervals. The exceptions are the intervals WEEK, WEEKDAY, QTR, SEMIYEAR, YEAR, and their DT counterparts. WEEK and WEEKDAY intervals have a shift period of DAYS; and QTR, SEMIYEAR, and YEAR intervals have a shift period of MONTH. When you use SAME alignment with YEAR, for example, the result is same-day alignment based on MONTH, the interval’s shift period. The result is not aligned to the same day of the YEAR interval. If you specify a multiple interval, then the default shift interval is based on the interval, and not on the multiple interval. When you use SAME alignment for QTR, SEMIYEAR, and YEAR intervals, the computed date is the same number of months from the beginning of the interval as the input date. The day of the month matches as closely as possible. Because not all months have the same number of days, it is not always possible to match the day of the month. For more information about shift periods, see the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
Alignment Intervals
Use the SAME value of the alignment argument if you want to base the alignment of the computed date on the alignment of the input date:

   intnx('week', '15mar2000'd, 1, 'same');            returns 22MAR2000
   intnx('dtweek', '15mar2000:8:45'dt, 1, 'same');    returns 22MAR00:08:45:00
   intnx('year', '15mar2000'd, 5, 'same');            returns 15MAR2005
Adjusting Dates
The INTNX function automatically adjusts for the date if the date in the interval that is incremented does not exist. Here are some examples:

   intnx('month', '15mar2000'd, 5, 'same');    returns 15AUG2000
   intnx('year', '29feb2000'd, 2, 'same');     returns 28FEB2002
   intnx('month', '31aug2001'd, 1, 'same');    returns 30SEP2001
   intnx('year', '01mar1999'd, 1, 'same');     returns 01MAR2000 (the first day of the third month of the year)
   intnx('year', '01mar1999'd, 1, 'same');     returns 29FEB2000 (the 60th day of the year)

In the example intnx('year', '29feb2000'd, 2);, the INTNX function returns the value 01JAN2002, which is the beginning of the year two years from the starting date (2000). In the example intnx('year', '29feb2000'd, 2, 'same');, the INTNX function returns the value 28FEB2002. In this case, the starting date is in the year 2000, the year is two years later (2002), the month is the same (February), and the date is the 28th, because that is the closest date to the 29th in February 2002.
Retail Calendar Intervals The retail industry often accounts for its data by dividing the yearly calendar into four 13-week periods, based on one of the following formats: 4-4-5, 4-5-4, or 5-4-4. The first, second, and third numbers specify the number of weeks in the first, second, and third month of each period, respectively. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Examples

Example 1: Examples of Using Intervals with the INTNX Function
The following SAS statements produce these results.

SAS Statements                                                      Results
yr=intnx('year','05feb94'd,3); put yr / yr date7.;                  13515
                                                                    01JAN97
x=intnx('month','05jan95'd,0); put x / x date7.;                    12784
                                                                    01JAN95
next=intnx('semiyear','01jan97'd,1); put next / next date7.;        13696
                                                                    01JUL97
past=intnx('month2','01aug96'd,-1); put past / past date7.;         13270
                                                                    01MAY96
sm=intnx('semimonth2.2','01apr97'd,4); put sm / sm date7.;          13711
                                                                    16JUL97
x='month'; date='1jun1990'd; nextmon=intnx(x,date,1);               11139
   put nextmon / nextmon date7.;                                    01JUL90
x1='month '; x2=trim(x1); date='1jun1990'd - 100;                   11017
   nextmonth=intnx(x2,date,1); put nextmonth / nextmonth date7.;    01MAR90
The following examples show the results of advancing a date by using the optional alignment argument.
SAS Statements                                                              Results
date1=intnx('month','01jan95'd,5,'beginning'); put date1 / date1 date7.;    12935
                                                                            01JUN95
date2=intnx('month','01jan95'd,5,'middle'); put date2 / date2 date7.;       12949
                                                                            15JUN95
date3=intnx('month','01jan95'd,5,'end'); put date3 / date3 date7.;          12964
                                                                            30JUN95
date4=intnx('month','01jan95'd,5,'sameday'); put date4 / date4 date7.;      12935
                                                                            01JUN95
date5=intnx('month','15mar2000'd,5,'same'); put date5 / date5 date9.;       14837
                                                                            15AUG2000
interval='month'; date='1sep2001'd; align='m';                              15294
   date4=intnx(interval,date,2,align); put date4 / date4 date7.;            15NOV01
x1='month '; x2=trim(x1); date='1sep2001'd + 90;                            15356
   date5=intnx(x2,date,2,'m'); put date5 / date5 date7.;                    16JAN02
Example 2: Example of Using Custom Intervals
The following example uses the custom-interval form of the INTNX function to increment a date, time, or datetime value by a given time interval.

   options intervalds=(weekdaycust=dstest);
   data dstest;
      format begin end date9.;
      begin='01jan2008'd; end='01jan2008'd; output;
      begin='02jan2008'd; end='02jan2008'd; output;
      begin='03jan2008'd; end='03jan2008'd; output;
      begin='04jan2008'd; end='06jan2008'd; output;
      begin='07jan2008'd; end='07jan2008'd; output;
      begin='08jan2008'd; end='08jan2008'd; output;
      begin='09jan2008'd; end='09jan2008'd; output;
      begin='10jan2008'd; end='10jan2008'd; output;
      begin='11jan2008'd; end='13jan2008'd; output;
      begin='14jan2008'd; end='14jan2008'd; output;
      begin='15jan2008'd; end='15jan2008'd; output;
   run;

   data _null_;
      format start date9. endcustom date9.;
      start='01jan2008'd;
      do i=0 to 9;
         endcustom=intnx('weekdaycust', start, i);
         put endcustom;
      end;
   run;

SAS writes the following output to the log:

   01JAN2008
   02JAN2008
   03JAN2008
   04JAN2008
   07JAN2008
   08JAN2008
   09JAN2008
   10JAN2008
   11JAN2008
   14JAN2008
See Also Functions: “INTCK Function” on page 806 “INTSHIFT Function” on page 831 System Options: “INTERVALDS= System Option” on page 1872
INTRR Function Returns the internal rate of return as a fraction. Category: Financial
Syntax INTRR(freq,c0, c1,..., cn)
Arguments

freq
   is numeric, the number of payments over a specified base period of time that is associated with the desired internal rate of return.
   Range: freq > 0
   Tip: The case freq = 0 is a flag to allow continuous compounding.
c0,c1, ... ,cn
   are numeric, the optional cash payments.
Details The INTRR function returns the internal rate of return over a specified base period of time for the set of cash payments c0, c1,..., cn. The time intervals between any two consecutive payments are assumed to be equal. The argument freq > 0 describes the number of payments that occur over the specified base period of time. The number of notes issued from each instance is limited. The internal rate of return is the interest rate such that the sequence of payments has a 0 net present value (see the NETPV function). It is given by
\[
r =
\begin{cases}
\dfrac{1}{x^{\mathit{freq}}} - 1 & \mathit{freq} > 0 \\[1ex]
-\log_e(x) & \mathit{freq} = 0
\end{cases}
\]
where x is the real root of the polynomial
\[
\sum_{i=0}^{n} c_i x^i = 0
\]
In the case of multiple roots, one real root is returned and a warning is issued concerning the non-uniqueness of the returned internal rate of return. Depending on the value of payments, a root for the equation does not always exist; in that case, a missing value is returned. Missing values in the payments are treated as 0 values. When freq > 0, the computed rate of return is the effective rate over the specified base period. To compute a quarterly internal rate of return (the base period is three months) with monthly payments, set freq to 3. If freq is 0, continuous compounding is assumed and the base period is the time interval between two consecutive payments. The computed internal rate of return is the nominal rate of return over the base period. To compute with continuous compounding and monthly payments, set freq to 0. The computed internal rate of return will be a monthly rate.
Comparisons
The IRR function is identical to INTRR, except that the IRR function returns the internal rate of return as a percentage rather than as a fraction.
Examples For an initial outlay of $400 and expected payments of $100, $200, and $300 over the following three years, the annual internal rate of return can be expressed as rate=intrr(1,-400,100,200,300);
The value returned is 0.19438.
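The definition in the Details section can be checked numerically. The following sketch, which assumes freq = 1 so that the discount factor is x = 1/(1 + r), substitutes the returned rate back into the payment polynomial; the variable names (x, npv, pct) are illustrative and not part of the original example.

```sas
data _null_;
   rate = intrr(1, -400, 100, 200, 300);         /* annual rate as a fraction    */
   x    = 1 / (1 + rate);                        /* discount factor for freq = 1 */
   npv  = -400 + 100*x + 200*x**2 + 300*x**3;    /* net present value at rate    */
   pct  = irr(1, -400, 100, 200, 300);           /* same rate as a percentage    */
   put rate= 8.5 npv= 12.8 pct= 8.3;
run;
```

If the rate is an internal rate of return, npv should be zero to within floating-point rounding, and pct should be 100 times rate.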
See Also Functions: “IRR Function” on page 838
INTSEAS Function Returns the length of the seasonal cycle when a date, time, or datetime interval is specified. Category: Date and Time
Syntax
INTSEAS(interval<multiple><.shift-index>)

Arguments

interval
   specifies a character constant, a variable, or an expression that contains an interval name such as WEEK, MONTH, or QTR. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
   Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
      interval<multiple.shift-index>
   The three parts of the interval name are as follows:
   interval
      specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
   multiple
      specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, the interval YEAR2 consists of two-year, or biennial, periods.
      See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
   shift-index
      specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods shifted to start on the first of March of each calendar year and to end in February of the following year.
      Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 would be an error because there is no 25th month in a two-year interval.
      Restriction: If the default shift period is the same as the interval type, then only multiperiod intervals can be shifted with the optional shift index. For example, because MONTH type intervals shift by MONTH subperiods by default, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index, because there are two MONTH intervals in each MONTH2 interval. For example, the interval name MONTH2.2 specifies bimonthly periods starting on the first day of even-numbered months.
      See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
Details The Basics The INTSEAS function returns the number of intervals in a seasonal cycle. For example, when the interval for a time series is described as monthly, then many procedures use the option INTERVAL=MONTH. Each observation in the data then corresponds to a particular month. Monthly data is considered to be periodic for a one-year period. A year contains 12 months, so the number of intervals (months) in a seasonal cycle (year) is 12. Quarterly data is also considered to be periodic for a one-year period. A year contains four quarters, so the number of intervals in a seasonal cycle is four. The periodicity is not always one year. For example, INTERVAL=DAY is considered to have a period of one week. Because there are seven days in a week, the number of intervals in the seasonal cycle is seven. For more information about working with date and time intervals, see “Date and Time Intervals” on page 319. Retail Calendar Intervals The retail industry often accounts for its data by dividing the yearly calendar into four 13-week periods, based on one of the following formats: 4-4-5, 4-5-4, or 5-4-4. The first, second, and third numbers specify the number of weeks in the first, second, and third month of each period, respectively. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
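To make the idea of a seasonal cycle from The Basics concrete, the following sketch walks through the months of one year and prints the quarter index of each month next to the cycle length. The year 2005 and the variable names (cycle, m, d, idx) are assumptions made for this illustration.

```sas
data _null_;
   cycle = intseas('qtr');            /* number of quarters in the yearly cycle */
   do m = 1 to 12;
      d   = mdy(m, 1, 2005);          /* first day of each month in 2005        */
      idx = intindex('qtr', d);       /* position of that month's quarter       */
      put d date9. ' is in quarter ' idx 'of ' cycle;
   end;
run;
```

The index printed for each month should never exceed the value that INTSEAS returns for the same interval.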
Examples
The following SAS statements produce these results:

SAS Statements                                                    Results
cycle_years = intseas('year'); put cycle_years;                   1
cycle_smiyears = intseas('semiyear'); put cycle_smiyears;         2
cycle_quarters = intseas('quarter'); put cycle_quarters;          4
cycle_months = intseas('month'); put cycle_months;                12
cycle_smimonths = intseas('semimonth'); put cycle_smimonths;      24
cycle_tendays = intseas('tenday'); put cycle_tendays;             36
cycle_weeks = intseas('week'); put cycle_weeks;                   52
cycle_wkdays = intseas('weekday'); put cycle_wkdays;              5
cycle_hours = intseas('hour'); put cycle_hours;                   24
cycle_minutes = intseas('minute'); put cycle_minutes;             60
cycle_month2 = intseas('month2.2'); put cycle_month2;             6
cycle_week2 = intseas('week2'); put cycle_week2;                  26
var1 = 'month4.3'; cycle_var1 = intseas(var1); put cycle_var1;    3
cycle_day1 = intseas('day1'); put cycle_day1;                     7
See Also Function: “INTCYCLE Function” on page 810
INTSHIFT Function Returns the shift interval that corresponds to the base interval. Category: Date and Time
Syntax
INTSHIFT(interval<multiple><.shift-index>)

Arguments

interval
   specifies a character constant, a variable, or an expression that contains a time interval such as WEEK, SEMIYEAR, QTR, or HOUR. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
   Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
      interval<multiple.shift-index>
   The three parts of the interval name are as follows:
   interval
      specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
   multiple
      specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, the interval YEAR2 consists of two-year, or biennial, periods.
      See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
   shift-index
      specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods shifted to start on the first of March of each calendar year and to end in February of the following year.
      Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 would be an error because there is no 25th month in a two-year interval.
      Restriction: If the default shift period is the same as the interval type, then only multiperiod intervals can be shifted with the optional shift index. For example, because MONTH type intervals shift by MONTH subperiods by default, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index, because there are two MONTH intervals in each MONTH2 interval. For example, the interval name MONTH2.2 specifies bimonthly periods starting on the first day of even-numbered months.
      See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
Details The INTSHIFT function returns the shift interval that corresponds to the base interval. INTSHIFT ignores multiples of the interval and interval shifts. The INTSHIFT function can also be used with calendar intervals from the retail industry. These intervals are ISO 8601 compliant. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
Examples
The following examples produce these results:

SAS Statements                                                     Results
shift1 = intshift('year'); put shift1;                             MONTH
shift2 = intshift('dtyear'); put shift2;                           DTMONTH
shift3 = intshift('minute'); put shift3;                           DTMINUTE
interval = 'weekdays'; shift4 = intshift(interval); put shift4;    WEEKDAY
shift5 = intshift('weekday5.4'); put shift5;                       WEEKDAY
shift6 = intshift('qtr'); put shift6;                              MONTH
shift7 = intshift('dttenday'); put shift7;                         DTTENDAY
INTTEST Function Returns 1 if a time interval is valid, and returns 0 if a time interval is invalid. Category: Date and Time
Syntax
INTTEST(interval<multiple><.shift-index>)

Arguments

interval
   specifies a character constant, variable, or expression that contains an interval name, such as WEEK, MONTH, or QTR. Interval can appear in uppercase or lowercase. The possible values of interval are listed in the “Intervals Used with Date and Time Functions” table in SAS Language Reference: Concepts.
   Multipliers and shift indexes can be used with the basic interval names to construct more complex interval specifications. The general form of an interval name is as follows:
      interval<multiple.shift-index>
   Here are the three parts of the interval name:
   interval
      specifies the name of the basic interval type. For example, YEAR specifies yearly intervals.
   multiple
      specifies an optional multiplier that sets the interval equal to a multiple of the period of the basic interval type. For example, YEAR2 consists of two-year, or biennial, periods.
      See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
   shift-index
      specifies an optional shift index that shifts the interval to start at a specified subperiod starting point. For example, YEAR.3 specifies yearly periods that are shifted to start on the first of March of each calendar year and to end in February of the following year.
      Restriction: The shift index cannot be greater than the number of subperiods in the whole interval. For example, you could use YEAR2.24, but YEAR2.25 is invalid because there is no 25th month in a two-year interval.
      Restriction: If the default shift period is the same as the interval type, then only multiperiod intervals can be shifted with the optional shift index. For example, because MONTH type intervals shift by MONTH subperiods by default, monthly intervals cannot be shifted with the shift index. However, bimonthly intervals can be shifted with the shift index, because there are two MONTH intervals in each MONTH2 interval. For example, the interval name MONTH2.2 specifies bimonthly periods starting on the first day of even-numbered months.
      See: “Incrementing Dates and Times by Using Multipliers and by Shifting Intervals” on page 320 for more information.
Details The INTTEST function checks for a valid interval name. This function is useful when checking for valid values of multiple and shift-index. For more information about multipliers and shift indexes, see “Multiunit Intervals” in SAS Language Reference: Concepts. The INTTEST function can also be used with calendar intervals from the retail industry. These intervals are ISO 8601 compliant. For more information, see “Retail Calendar Intervals: ISO 8601 Compliant” in SAS Language Reference: Concepts.
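One way to use INTTEST, sketched below under the assumption that interval names arrive as data rather than as constants, is to validate each name before passing it to INTNX. The candidate names, the start date, and the variable names (candidate, start, next) are illustrative.

```sas
data _null_;
   length candidate $ 16;
   start = '01jan2008'd;                          /* illustrative starting date  */
   do candidate = 'month2.2', 'twoweeks', 'week6.13';
      if inttest(candidate) then do;
         next = intnx(candidate, start, 1);       /* safe: the name was accepted */
         put candidate= next= date9.;
      end;
      else put candidate= 'is not a valid interval name';
   end;
run;
```

Guarding the call this way keeps an invalid name such as twoweeks from reaching INTNX.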
Examples
In the following examples, SAS returns a value of 1 if the interval argument is valid, and 0 if the interval argument is invalid.

SAS Statements                                         Results
test1 = inttest('month'); put test1;                   1
test2 = inttest('week6.13'); put test2;                1
test3 = inttest('tenday'); put test3;                  1
test4 = inttest('twoweeks'); put test4;                0
var1 = 'hour2.2'; test5 = inttest(var1); put test5;    1
INTZ Function Returns the integer portion of the argument, using zero fuzzing. Category: Truncation
Syntax INTZ (argument)
Arguments argument
is a numeric constant, variable, or expression.
Details
The following rules apply:
- If the value of the argument is an exact integer, INTZ returns that integer.
- If the argument is positive and not an integer, INTZ returns the largest integer that is less than the argument.
- If the argument is negative and not an integer, INTZ returns the smallest integer that is greater than the argument.
Comparisons Unlike the INT function, the INTZ function uses zero fuzzing. If the argument is within 1E-12 of an integer, the INT function fuzzes the result to be equal to that integer. The INTZ function does not fuzz the result. Therefore, with the INTZ function you might get unexpected results.
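The difference described above can be seen with a value that lies within 1E-12 of an integer. The following sketch uses an illustrative value and variable names (x, fuzzed, unfuzzed) that are not taken from the original examples.

```sas
data _null_;
   x = 3 - 1e-13;          /* slightly less than 3, within 1E-12 of it  */
   fuzzed   = int(x);      /* INT fuzzes the result to the integer      */
   unfuzzed = intz(x);     /* INTZ truncates the stored value as is     */
   put fuzzed= unfuzzed=;
run;
```

Under the rules stated above, INT would be expected to return 3 here while INTZ returns 2, which is the kind of result that can be surprising when a value is the product of inexact floating-point arithmetic.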
Examples
The following SAS statements produce these results.

SAS Statements                         Results
var1=2.1; a=intz(var1); put a;         2
var2=-2.4; b=intz(var2); put b;        -2
var3=1+1.e-11; c=intz(var3); put c;    1
f=intz(-1.6); put f;                   -1
See Also Functions: “CEIL Function” on page 555 “CEILZ Function” on page 556 “FLOOR Function” on page 731 “FLOORZ Function” on page 732 “INT Function” on page 803 “ROUND Function” on page 1061 “ROUNDZ Function” on page 1069
IORCMSG Function Returns a formatted error message for _IORC_. Category:
SAS File I/O
Syntax IORCMSG()
Details If the IORCMSG function returns a value to a variable that has not yet been assigned a length, then by default the variable is assigned a length of 200. The IORCMSG function returns the formatted error message that is associated with the current value of the automatic variable _IORC_. The _IORC_ variable is created when you use the MODIFY statement, or when you use the SET statement with the KEY= option. The value of the _IORC_ variable is internal and is meant to be read in conjunction with the SYSRC autocall macro. If you try to set _IORC_ to a specific value, you might get unexpected results.
Examples In the following program, observations are either rewritten or added to the updated master file that contains bank accounts and current bank balance. The program queries the _IORC_ variable and returns a formatted error message if the _IORC_ value is unexpected.
libname bank 'SAS-library';

data bank.master(index=(AccountNum));
   infile 'external-file-1';
   format balance dollar8.;
   input @ 1 AccountNum $ 1-3 @ 5 balance 5-9;
run;
Functions and CALL Routines
4
IQR Function
837
data bank.trans(index=(AccountNum));
   infile 'external-file-2';
   format deposit dollar8.;
   input @ 1 AccountNum $ 1-3 @ 5 deposit 5-9;
run;

data bank.master;
   set bank.trans;
   modify bank.master key=AccountNum;
   if (_IORC_ EQ %sysrc(_SOK)) then
      do;
         balance=balance+deposit;
         replace;
      end;
   else
      if (_IORC_ = %sysrc(_DSENOM)) then
         do;
            balance=deposit;
            output;
            _error_=0;
         end;
   else
      do;
         errmsg=IORCMSG();
         put 'Unknown error condition:' errmsg;
      end;
run;
IQR Function
Returns the interquartile range.
Category: Descriptive Statistics
Syntax
IQR(value-1 <, value-2, ...>)
Arguments
value
   specifies a numeric constant, variable, or expression for which the interquartile range is to be computed.
Details If all arguments have missing values, the result is a missing value. Otherwise, the result is the interquartile range of the non-missing values. The formula for the interquartile range is the same as the one that is used in the UNIVARIATE procedure. For more information, see Base SAS Procedures Guide.
Examples
SAS Statements                          Results
iqr=iqr(2,4,1,3,999999); put iqr;       2
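The handling of missing values and the relationship to percentiles can be sketched as follows. This example assumes that the default percentile definition of the PCTL function matches the one that IQR uses; the array name and the variable names are illustrative.

```sas
data _null_;
   /* The sixth element is missing and is ignored by IQR */
   array v[6] (2 4 1 3 999999 .);
   range  = iqr(of v[*]);
   /* Assumption: the default PCTL definition matches the IQR computation */
   spread = pctl(75, of v[*]) - pctl(25, of v[*]);
   put range= spread=;
run;
```

For these values, both RANGE and SPREAD are expected to be 2, which shows that the extreme value 999999 has no effect on the result.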
See Also Functions: “MAD Function” on page 887 “PCTL Function” on page 953
IRR Function
Returns the internal rate of return as a percentage.
Category: Financial
Syntax IRR(freq,c0,c1,…,cn)
Arguments freq
is numeric, the number of payments over a specified base period of time that is associated with the desired internal rate of return. Range: freq > 0. Tip: The case freq = 0 is a flag to allow continuous compounding. c0,c1,…,cn
are numeric, the optional cash payments.
Details The IRR function returns the internal rate of return over a specified base period of time for the set of cash payments c0, c1,…,cn. The time intervals between any two consecutive payments are assumed to be equal. The argument freq > 0 describes the
number of payments that occur over the specified base period of time. The number of notes issued from each instance is limited.
Comparisons The IRR function is identical to the INTRR function, except that IRR returns the internal rate of return as a percentage.
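The relationship between the two functions can be checked with a short DATA step. This is a sketch only; the cash payments (an outlay of 400 followed by three receipts) and the variable names are illustrative values.

```sas
data _null_;
   /* One payment per base period: an outlay followed by three receipts */
   fraction = intrr(1, -400, 100, 200, 300);
   percent  = irr(1, -400, 100, 200, 300);
   /* If IRR is INTRR expressed as a percentage, the ratio is 100 */
   ratio    = percent / fraction;
   put fraction= percent= ratio=;
run;
```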
See Also Functions: “INTRR Function” on page 827
JBESSEL Function Returns the value of the Bessel function. Category: Mathematical
Syntax JBESSEL(nu,x)
Arguments
nu
   specifies a numeric constant, variable, or expression.
   Range: nu ≥ 0
x
   specifies a numeric constant, variable, or expression.
   Range: x ≥ 0
Details
The JBESSEL function returns the value of the Bessel function of order nu evaluated at x. For more information, see Abramowitz and Stegun (1964) and Amos, Daniel, and Weston (1977).
Examples
SAS Statements                          Results
x=jbessel(2,2);                         0.3528340286
JULDATE Function
Returns the Julian date from a SAS date value.
Category: Date and Time
Syntax JULDATE(date)
Arguments date
specifies a SAS date value.
Details A SAS date value is a number that represents the number of days from January 1, 1960 to a specific date. The JULDATE function converts a SAS date value to a Julian date. If date falls within the 100-year span defined by the system option YEARCUTOFF=, the result has three, four, or five digits. In a five digit result, the first two digits represent the year, and the next three digits represent the day of the year (1 to 365, or 1 to 366 for leap years). Because leading zeros are dropped from the result, the year portion of a Julian date might be omitted (for years ending in 00), or it might have only one digit (for years ending 01–09). Otherwise, the result has seven digits: the first four digits represent the year, and the next three digits represent the day of the year. For example, if YEARCUTOFF=1920, JULDATE would return 97001 for January 1, 1997, and return 1878365 for December 31, 1878.
Comparisons The function JULDATE7 is similar to JULDATE except that JULDATE7 always returns a four digit year. Thus JULDATE7 is year 2000 compliant because it eliminates the need to consider the implications of a two digit year.
Examples
The following SAS statements produce these results:

SAS Statements                          Results
julian=juldate('31dec99'd);             99365
julian=juldate('01jan2099'd);           2099001
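The effect of dropped leading zeros, and the contrast with JULDATE7, can be sketched as follows. The date, the variable names, and the explicit YEARCUTOFF=1920 setting are illustrative assumptions.

```sas
options yearcutoff=1920;

data _null_;
   d  = '01jan2005'd;
   j5 = juldate(d);     /* two-digit year 05 and day 001; the leading zero is dropped */
   j7 = juldate7(d);    /* four-digit year and day 001                                */
   put j5= j7=;
   put j5 z5.;          /* the Z5. format restores the leading zero for display       */
run;
```

Under these assumptions, JULDATE is expected to return 5001 (written as 05001 with the Z5. format), and JULDATE7 is expected to return 2005001.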
See Also Function: “DATEJUL Function” on page 614 “JULDATE7 Function” on page 841 System Option: “YEARCUTOFF= System Option” on page 1998
JULDATE7 Function Returns a seven-digit Julian date from a SAS date value. Category: Date and Time
Syntax JULDATE7(date)
Arguments date
specifies a SAS date value.
Details The JULDATE7 function returns a seven digit Julian date from a SAS date value. The first four digits represent the year, and the next three digits represent the day of the year.
Comparisons The function JULDATE7 is similar to JULDATE except that JULDATE7 always returns a four digit year. Thus JULDATE7 is year 2000 compliant because it eliminates the need to consider the implications of a two digit year.
Examples
The following SAS statements produce these results:

SAS Statements                          Results
julian=juldate7('31dec96'd);            1996366
julian=juldate7('01jan2099'd);          2099001
See Also Function: “JULDATE Function” on page 840
KURTOSIS Function
Returns the kurtosis.
Category: Descriptive Statistics
Syntax
KURTOSIS(argument-1,argument-2,argument-3,argument-4<,…,argument-n>)
Arguments argument
specifies a numeric constant, variable, or expression.
Details At least four non-missing arguments are required. Otherwise, the function returns a missing value. If all non-missing arguments have equal values, the kurtosis is mathematically undefined. The KURTOSIS function returns a missing value and sets _ERROR_ equal to 1. The argument list can consist of a variable list, which is preceded by OF.
Examples
SAS Statements                          Results
x1=kurtosis(5,9,3,6);                   0.928
x2=kurtosis(5,8,9,6,.);                 -3.3
x3=kurtosis(8,9,6,1);                   1.5
x4=kurtosis(8,1,6,1);                   -4.483379501
x5=kurtosis(of x1-x4);                  -5.065692754
LAG Function Returns values from a queue. Category: Special
Syntax LAG<n>(argument)
Arguments n
specifies the number of lagged values. argument
specifies a numeric or character constant, variable, or expression.
Details
The Basics
If the LAG function returns a value to a character variable that has not yet been assigned a length, by default the variable is assigned a length of 200.

The LAG functions, LAG1, LAG2, ..., LAGn return values from a queue. LAG1 can also be written as LAG. A LAGn function stores a value in a queue and returns a value stored previously in that queue. Each occurrence of a LAGn function in a program generates its own queue of values.

The queue for each occurrence of LAGn is initialized with n missing values, where n is the length of the queue (for example, a LAG2 queue is initialized with two missing values). When an occurrence of LAGn is executed, the value at the top of its queue is removed and returned, the remaining values are shifted upwards, and the new value of the argument is placed at the bottom of the queue. Hence, missing values are returned for the first n executions of each occurrence of LAGn, after which the lagged values of the argument begin to appear.

Note: Storing values at the bottom of the queue and returning values from the top of the queue occurs only when the function is executed. An occurrence of the LAGn function that is executed conditionally will store and return values only from the observations for which the condition is satisfied.

If the argument of LAGn is an array name, a separate queue is maintained for each variable in the array.
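The note about conditional execution can be illustrated with a short DATA step. This is a minimal sketch; the variable names PREV_ALL and PREV_COND and the even-number condition are illustrative and are not part of the original text.

```sas
data _null_;
   input x @@;
   /* Executed for every observation: the queue is updated each time */
   prev_all = lag(x);
   /* Executed only when x is even: the queue sees only the even values of x */
   if mod(x,2)=0 then prev_cond = lag(x);
   put x= prev_all= prev_cond=;
   datalines;
1 2 3 4 5 6
;
```

Following the rules above, PREV_ALL is expected to take the values . 1 2 3 4 5, whereas PREV_COND is expected to take the values . . . 2 . 4, because the second occurrence of LAG stores and returns values only for the observations in which X is even.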
Memory Limit for the LAG Function
When the LAG function is compiled, SAS allocates memory in a queue to hold the values of the variable that is listed in the LAG function. For example, if the variable in function LAG100(x) is numeric with a length of 8 bytes, then the memory that is needed is 8 times 100, or 800 bytes. Therefore, the memory limit for the LAG function is based on the memory that SAS allocates, which varies with different operating environments.
Examples
Example 1: Generating Two Lagged Values
The following program generates two lagged values for each observation.

options pagesize= linesize= pageno=1 nodate;

data one;
   input x @@;
   y=lag1(x);
   z=lag2(x);
   datalines;
1 2 3 4 5 6
;

proc print data=one;
   title 'LAG Output';
run;
Output 4.56 Output from Generating Two Lagged Values

                 LAG Output                 1

Obs    x    y    z
  1    1    .    .
  2    2    1    .
  3    3    2    1
  4    4    3    2
  5    5    4    3
  6    6    5    4

LAG1 returns one missing value and the values of X (lagged once). LAG2 returns two missing values and the values of X (lagged twice).
Example 2: Generating Multiple Lagged Values in BY-Groups
The following example shows how to generate up to three lagged values within each BY group.

/***************************************************************************/
/* This program generates up to three lagged values. By increasing the     */
/* size of the array and the number of assignment statements that use      */
/* the LAGn functions, you can generate as many lagged values as needed.   */
/***************************************************************************/
options pageno=1 ls=80 ps=64 nodate;

/* Create starting data. */
data old;
   input start end;
   datalines;
1 1
1 2
1 3
1 4
1 5
1 6
1 7
2 1
2 2
3 1
3 2
3 3
3 4
3 5
;

data new(drop=i count);
   set old;
   by start;

   /* Create and assign values to three new variables. Use ENDLAG1-       */
   /* ENDLAG3 to store lagged values of END, from the most recent to the  */
   /* third preceding value.                                              */
   array x(*) endlag1-endlag3;
   endlag1=lag1(end);
   endlag2=lag2(end);
   endlag3=lag3(end);

   /* Reset COUNT at the start of each new BY-Group */
   if first.start then count=1;

   /* On each iteration, set to missing array elements   */
   /* that have not yet received a lagged value for the  */
   /* current BY-Group. Increase count by 1.             */
   do i=count to dim(x);
      x(i)=.;
   end;
   count + 1;
run;

proc print;
run;
Output 4.57 Output from Generating Three Lagged Values

                      The SAS System                      1

Obs    start    end    endlag1    endlag2    endlag3
  1      1       1        .          .          .
  2      1       2        1          .          .
  3      1       3        2          1          .
  4      1       4        3          2          1
  5      1       5        4          3          2
  6      1       6        5          4          3
  7      1       7        6          5          4
  8      2       1        .          .          .
  9      2       2        1          .          .
 10      3       1        .          .          .
 11      3       2        1          .          .
 12      3       3        2          1          .
 13      3       4        3          2          1
 14      3       5        4          3          2
Example 3: Computing the Moving Average of a Variable
The following is an example that computes the moving average of a variable in a data set.

/* Title: Compute the moving average of a variable
   Goal:  Compute the moving average of a variable through the entire data
          set, of the last n observations and of the last n observations
          within a BY-group.
   Input:
*/
options pageno=1 ls=80 ps=64 fullstimer nodate;

data x;
   do x=1 to 10;
      output;
   end;
run;

/* Compute the moving average of the entire data set. */
data avg;
   retain s 0;
   set x;
   s=s+x;
   a=s/_n_;
run;

proc print;
run;

/* Compute the moving average of the last 5 observations. */
%let n = 5;
data avg (drop=s);
   retain s;
   set x;
   s = sum (s, x, -lag&n(x)) ;
   a = s / min(_n_, &n);
run;

proc print;
run;

/* Compute the moving average within a BY-group of last n observations.
   For the first n-1 observations within the BY-group, the moving average
   is set to missing. */
data ds1;
   do patient='A','B','C';
      do month=1 to 7;
         num=int(ranuni(0)*10);
         output;
      end;
   end;
run;

proc sort;
   by patient;

%let n = 4;
data ds2;
   set ds1;
   by patient;
   retain num_sum 0;
   if first.patient then do;
      count=0;
      num_sum=0;
   end;
   count+1;
   last&n=lag&n(num);
   if count gt &n then num_sum=sum(num_sum,num,-last&n);
   else num_sum=sum(num_sum,num);
   if count ge &n then mov_aver=num_sum/&n;
   else mov_aver=.;
run;

proc print;
run;
Output 4.58 Output from Computing the Moving Average of a Variable

                 The SAS System                 1

Obs     s     x      a
  1     1     1    1.0
  2     3     2    1.5
  3     6     3    2.0
  4    10     4    2.5
  5    15     5    3.0
  6    21     6    3.5
  7    28     7    4.0
  8    36     8    4.5
  9    45     9    5.0
 10    55    10    5.5

                 The SAS System                 2

Obs     x      a
  1     1    1.0
  2     2    1.5
  3     3    2.0
  4     4    2.5
  5     5    3.0
  6     6    4.0
  7     7    5.0
  8     8    6.0
  9     9    7.0
 10    10    8.0

                 The SAS System                 3

Obs    patient    month    num    num_sum    count    last4    mov_aver
  1       A         1       4        4         1        .         .
  2       A         2       9       13         2        .         .
  3       A         3       2       15         3        .         .
  4       A         4       9       24         4        .        6.00
  5       A         5       3       23         5        4        5.75
  6       A         6       6       20         6        9        5.00
  7       A         7       8       26         7        2        6.50
  8       B         1       7        7         1        9         .
  9       B         2       5       12         2        3         .
 10       B         3       5       17         3        6         .
 11       B         4       0       17         4        8        4.25
 12       B         5       5       15         5        7        3.75
 13       B         6       9       19         6        5        4.75
 14       B         7       0       14         7        5        3.50
 15       C         1       4        4         1        0         .
 16       C         2       2        6         2        5         .
 17       C         3       0        6         3        9         .
 18       C         4       1        7         4        0        1.75
 19       C         5       2        5         5        4        1.25
 20       C         6       9       12         6        2        3.00
 21       C         7       5       17         7        0        4.25
Example 4: Generating a Fibonacci Sequence of Numbers
The following example generates a Fibonacci sequence of numbers. You start with 0 and 1, and then add the two previous Fibonacci numbers to generate the next Fibonacci number.

data _null_;
   put 'Fibonacci Sequence';
   n=1;
   f=1;
   put n= f=;
   do n=2 to 10;
      f=sum(f,lag(f));
      put n= f=;
   end;
run;

SAS writes the following output to the log:

Fibonacci Sequence
n=1 f=1
n=2 f=1
n=3 f=2
n=4 f=3
n=5 f=5
n=6 f=8
n=7 f=13
n=8 f=21
n=9 f=34
n=10 f=55
Example 5: Using Expressions for the LAG Function Argument
The following program uses an expression for the value of argument and creates a data set that contains the values for X, Y, and Z. LAG dequeues the previous values of the expression and enqueues the current value.

options nodate pageno=1 ls=80 ps=85;

data one;
   input X @@;
   Y=lag1(x+10);
   Z=lag2(x);
   datalines;
1 2 3 4 5 6
;

proc print;
   title 'Lag Output: Using an Expression';
run;
Output 4.59 Output from the LAG Function: Using an Expression

        Lag Output: Using an Expression

Obs    X     Y    Z
  1    1     .    .
  2    2    11    .
  3    3    12    1
  4    4    13    2
  5    5    14    3
  6    6    15    4
See Also Function: “DIF Function” on page 630
LARGEST Function
Returns the kth largest non-missing value.
Category: Descriptive Statistics
Syntax
LARGEST (k, value-1<, value-2 ...>)
Arguments
k
   is a numeric constant, variable, or expression that specifies which value to return.
value
   specifies the value of a numeric constant, variable, or expression to be processed.
Details If k is missing, less than zero, or greater than the number of values, the result is a missing value and _ERROR_ is set to 1. Otherwise, if k is greater than the number of non-missing values, the result is a missing value but _ERROR_ is not set to 1.
Examples
The following SAS statements produce these results.

SAS Statements                                                 Results
k=1; largest1=largest(k, 456, 789, .Q, 123); put largest1;     789
k=2; largest2=largest(k, 456, 789, .Q, 123); put largest2;     456
k=3; largest3=largest(k, 456, 789, .Q, 123); put largest3;     123
k=4; largest4=largest(k, 456, 789, .Q, 123); put largest4;     .
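The same results can be produced in a loop by passing an array with the OF syntax. This is a sketch only; the array name V and the use of an ordinary missing value in place of .Q are illustrative choices.

```sas
data _null_;
   /* One of the four values is missing */
   array v[4] (456 789 . 123);
   do k = 1 to dim(v);
      /* For k=4, k exceeds the number of non-missing values but not the */
      /* number of values, so the result is missing and _ERROR_ is not set */
      kth = largest(k, of v[*]);
      put k= kth=;
   end;
run;
```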
See Also Functions: “ORDINAL Function” on page 951 “PCTL Function” on page 953 “SMALLEST Function” on page 1088
LBOUND Function
Returns the lower bound of an array.
Category: Array
Syntax
LBOUND<n>(array-name)
LBOUND(array-name,bound-n)
Arguments
n
   is an integer constant that specifies the dimension for which you want to know the lower bound. If no n value is specified, the LBOUND function returns the lower bound of the first dimension of the array.
array-name
   is the name of an array that was defined previously in the same DATA step.
bound-n
   is a numeric constant, variable, or expression that specifies the dimension for which you want to know the lower bound. Use bound-n only if n is not specified.
Details The LBOUND function returns the lower bound of a one-dimensional array or the lower bound of a specified dimension of a multidimensional array. Use LBOUND in array processing to avoid changing the lower bound of an iterative DO group each time you change the bounds of the array. LBOUND and HBOUND can be used together to return the values of the lower and upper bounds of an array dimension.
Examples
Example 1: One-dimensional Array
In this example, LBOUND returns the lower bound of the dimension, a value of 2. SAS repeats the statements in the DO loop five times.

array big{2:6} weight sex height state city;
do i=lbound(big) to hbound(big);
   ...more SAS statements...;
end;
Example 2: Multidimensional Array
This example shows two ways of specifying the LBOUND function for multidimensional arrays. Both methods return the same value for LBOUND, as shown in the table that follows the SAS code example.

array mult{2:6,4:13,2} mult1-mult100;

Syntax            Alternative Syntax    Value
LBOUND(MULT)      LBOUND(MULT,1)        2
LBOUND2(MULT)     LBOUND(MULT,2)        4
LBOUND3(MULT)     LBOUND(MULT,3)        1
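The values in the table can be reproduced in a DATA step that also calls HBOUND for the same dimensions. This is a minimal sketch; the variable names LO1 through HI3 are illustrative.

```sas
data _null_;
   array mult{2:6,4:13,2} mult1-mult100;
   lo1 = lbound(mult);      hi1 = hbound(mult);      /* first dimension            */
   lo2 = lbound2(mult);     hi2 = hbound2(mult);     /* second dimension, n form   */
   lo3 = lbound(mult, 3);   hi3 = hbound(mult, 3);   /* third dimension, bound-n   */
   put lo1= hi1= lo2= hi2= lo3= hi3=;
run;
```

For this array, the lower bounds are expected to be 2, 4, and 1, and the upper bounds are expected to be 6, 13, and 2.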
See Also Functions: “DIM Function” on page 632 “HBOUND Function” on page 776 Statements: “ARRAY Statement” on page 1391 “Array Reference Statement” on page 1396 “Array Processing” in SAS Language Reference: Concepts
LCM Function
Returns the least common multiple.
Category: Mathematical
Syntax LCM(x1, x2, x3, …, xn)
Arguments x
specifies a numeric constant, variable, or expression that has an integer value.
Details The LCM (least common multiple) function returns the smallest multiple that is exactly divisible by every member of a set of numbers. For example, the least common multiple of 12 and 18 is 36. If any of the arguments are missing, then the returned value is a missing value.
Examples
The following example returns the smallest multiple that is exactly divisible by the integers 10 and 15.

data _null_;
   x=lcm(10,15);
   put x=;
run;

SAS writes the following output to the log:

x=30
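The result of LCM can be cross-checked against the GCD function, because for two positive integers the product of the least common multiple and the greatest common divisor equals the product of the integers. The following sketch uses the values 12 and 18 from the Details section; the variable names are illustrative.

```sas
data _null_;
   a = 12;
   b = 18;
   multiple = lcm(a, b);                   /* least common multiple     */
   divisor  = gcd(a, b);                   /* greatest common divisor   */
   /* CHECK is 1 when LCM times GCD equals the product of a and b */
   check = (multiple * divisor = a * b);
   put multiple= divisor= check=;
run;
```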
See Also Functions: “GCD Function” on page 759
LCOMB Function Computes the logarithm of the COMB function; that is, the logarithm of the number of combinations of n objects taken r at a time. Category: Combinatorial
Syntax LCOMB(n,r)
Arguments n
is a non-negative integer that represents the total number of elements from which the sample is chosen. r
is a non-negative integer that represents the number of chosen elements. Restriction: r ≤ n
Comparisons The LCOMB function computes the logarithm of the COMB function.
Examples
The following statements produce these results:

SAS Statements                          Results
x=lcomb(5000,500); put x;               1621.4411361
y=lcomb(100,10); put y;                 30.482323362
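The relationship between LCOMB and COMB can be sketched as follows. The variable names are illustrative, and the example assumes that LCOMB returns the natural logarithm, so that EXP reverses it.

```sas
data _null_;
   direct   = comb(100, 10);
   logged   = lcomb(100, 10);
   restored = exp(logged);     /* approximately equal to DIRECT */
   put direct= logged= restored=;

   /* The number of combinations here is too large for an 8-byte   */
   /* floating-point value, so only its logarithm can be computed  */
   big = lcomb(5000, 500);
   put big=;
run;
```

Working with the logarithm is useful when the number of combinations itself would exceed the largest value that can be represented.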
See Also Functions: “COMB Function” on page 570
LEFT Function
Left-aligns a character string.
Category: Character
Restriction: “I18N Level 0” on page 305
Tip: DBCS equivalent function is KLEFT in SAS National Language Support (NLS): Reference Guide. See “DBCS Compatibility” on page 854.
Syntax
LEFT(argument)
Arguments
argument
   specifies a character constant, variable, or expression.
Details
The Basics
In a DATA step, if the LEFT function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the argument. LEFT returns an argument with leading blanks moved to the end of the value. The argument’s length does not change.

DBCS Compatibility
The LEFT function left-aligns a character string. You can use the LEFT function in most cases. If an application can be executed in an ASCII environment, or if the application does not manipulate character strings, then you can use the LEFT function rather than the KLEFT function.
Examples
SAS Statements                          Results
                                        ----+----1----+
a=' DUE DATE'; b=left(a); put b;        DUE DATE
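That LEFT moves leading blanks to the end of the value without changing its length can be checked with the LENGTHC function. This is a sketch; the three leading blanks, the brackets, and the variable names are illustrative.

```sas
data _null_;
   a = '   DUE DATE';          /* three leading blanks, 11 characters in total */
   b = left(a);                /* b receives the length of the argument        */
   len_a = lengthc(a);
   len_b = lengthc(b);
   /* $CHAR11. writes the trailing blanks, so the brackets show the full value */
   put '[' b $char11. ']' +1 len_a= len_b=;
run;
```

Both lengths are expected to be 11, with the three blanks appearing after DUE DATE inside the brackets.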
See Also Functions: “COMPRESS Function” on page 584 “RIGHT Function” on page 1059 “STRIP Function” on page 1098 “TRIM Function” on page 1132
LENGTH Function
Returns the length of a non-blank character string, excluding trailing blanks, and returns 1 for a blank character string.
Category: Character
Restriction: “I18N Level 0” on page 305
Tip: DBCS equivalent function is KLENGTH in SAS National Language Support (NLS): Reference Guide.
Tip: The LENGTH function returns a length in bytes, while the KLENGTH function returns a length in a character based unit.
Syntax
LENGTH(string)
Arguments string
specifies a character constant, variable, or expression.
Details The LENGTH function returns an integer that represents the position of the rightmost non-blank character in string. If the value of string is blank, LENGTH returns a value of 1. If string is a numeric constant, variable, or expression (either initialized or uninitialized), SAS automatically converts the numeric value to a right-justified character string by using the BEST12. format. In this case, LENGTH returns a value of 12 and writes a note in the SAS log stating that the numeric values have been converted to character values.
Comparisons
- The LENGTH and LENGTHN functions return the same value for non-blank character strings. LENGTH returns a value of 1 for blank character strings, whereas LENGTHN returns a value of 0.
- The LENGTH function returns the length of a character string, excluding trailing blanks, whereas the LENGTHC function returns the length of a character string, including trailing blanks.
- The LENGTH function returns the length of a character string, excluding trailing blanks, whereas the LENGTHM function returns the amount of memory in bytes that is allocated for a character string.
Examples
SAS Statements                          Results
len=length('ABCDEF'); put len;          6
len2=length(' '); put len2;             1
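The four length functions can be compared side by side on the same variable. The following is a minimal sketch; the variable S, its declared length of 10, and the result variable names are illustrative.

```sas
data _null_;
   length s $10;

   s = 'abc';
   len = length(s); lenn = lengthn(s); lenc = lengthc(s); lenm = lengthm(s);
   put 'non-blank: ' len= lenn= lenc= lenm=;

   s = ' ';
   len = length(s); lenn = lengthn(s); lenc = lengthc(s); lenm = lengthm(s);
   put 'blank:     ' len= lenn= lenc= lenm=;
run;
```

For the non-blank value, LENGTH and LENGTHN are expected to return 3, and LENGTHC and LENGTHM are expected to return 10. For the blank value, LENGTH is expected to return 1 and LENGTHN is expected to return 0, while LENGTHC and LENGTHM still return 10.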
See Also Functions: “LENGTHC Function” on page 857 “LENGTHM Function” on page 858 “LENGTHN Function” on page 859
LENGTHC Function
Returns the length of a character string, including trailing blanks.
Category: Character
Restriction: “I18N Level 2” on page 306
Syntax
LENGTHC(string)
Arguments string
specifies a character constant, variable, or expression.
Details The LENGTHC function returns the number of characters, both blanks and non-blanks, in string. If string is a numeric constant, variable or expression (either initialized or uninitialized), SAS automatically converts the numeric value to a right-justified character string by using the BEST12. format. In this case, LENGTHC returns a value of 12 and writes a note in the SAS log stating that the numeric values have been converted to character values.
Comparisons
- The LENGTHC function returns the length of a character string, including trailing blanks, whereas the LENGTH and LENGTHN functions return the length of a character string, excluding trailing blanks. LENGTHC always returns a value that is greater than or equal to the value of LENGTHN.
- The LENGTHC function returns the length of a character string, including trailing blanks, whereas the LENGTHM function returns the amount of memory in bytes that is allocated for a character string. For fixed-length character strings, LENGTHC and LENGTHM always return the same value. For varying-length character strings, LENGTHC always returns a value that is less than or equal to the value returned by LENGTHM.
Examples
The following SAS statements produce these results:

SAS Statements                                       Results
x=lengthc('variable with trailing blanks   ');
put x;                                               32
length fixed $35;
fixed='variable with trailing blanks   ';
x=lengthc(fixed);
put x;                                               35
See Also Functions: “LENGTH Function” on page 855 “LENGTHM Function” on page 858 “LENGTHN Function” on page 859
LENGTHM Function
Returns the amount of memory (in bytes) that is allocated for a character string.
Category: Character
Restriction: “I18N Level 2” on page 306
Syntax
LENGTHM(string)
Arguments
string
specifies a character constant, variable, or expression.
Details The LENGTHM function returns an integer that represents the amount of memory in bytes that is allocated for string. If string is a numeric constant, variable, or expression (either initialized or uninitialized), SAS automatically converts the numeric value to a right-justified character string by using the BEST12. format. In this case, LENGTHM returns a value of 12 and writes a note in the SAS log stating that the numeric values have been converted to character values.
Comparisons The LENGTHM function returns the amount of memory in bytes that is allocated for a character string, whereas the LENGTH, LENGTHC, and LENGTHN functions return the length of a character string. LENGTHM always returns a value that is greater than or equal to the values that are returned by LENGTH, LENGTHC, and LENGTHN.
Examples
Example 1: Determining the Amount of Allocated Memory for a Character Expression
This example determines the amount of memory (in bytes) that is allocated for a buffer that stores intermediate results in a character expression. Because SAS does not know how long the value of the expression CAT(x,y) will be, SAS allocates memory for values up to 32767 bytes long.

data _null_;
   x='x';
   y='y';
   lc=lengthc(cat(x,y));
   lm=lengthm(cat(x,y));
   put lc= lm=;
run;

SAS writes the following output to the log:

lc=2 lm=32767

Example 2: Determining the Amount of Allocated Memory for a Variable from an External File
This example determines the amount of memory (in bytes) that is allocated to a variable that is input into a SAS file from an external file.

data _null_;
   file 'test.txt';
   put 'trailing blanks   ';
run;

data test;
   infile 'test.txt';
   input;
   x=lengthm(_infile_);
   put x;
run;

The following line is written to the SAS log:

256
See Also Functions: “LENGTH Function” on page 855 “LENGTHC Function” on page 857 “LENGTHN Function” on page 859
LENGTHN Function
Returns the length of a character string, excluding trailing blanks.
Category: Character
Restriction: “I18N Level 0” on page 305
Syntax
LENGTHN(string)
Arguments
string
   specifies a character constant, variable, or expression.
Details The LENGTHN function returns an integer that represents the position of the rightmost non-blank character in string. If the value of string is blank, LENGTHN returns a value of 0. If string is a numeric constant, variable, or expression (either initialized or uninitialized), SAS automatically converts the numeric value to a right-justified character string by using the BEST12. format. In this case, LENGTHN returns a value of 12 and writes a note in the SAS log stating that the numeric values have been converted to character values.
Comparisons
- The LENGTHN and LENGTH functions return the same value for non-blank character strings. LENGTHN returns a value of 0 for blank character strings, whereas LENGTH returns a value of 1.
- The LENGTHN function returns the length of a character string, excluding trailing blanks, whereas the LENGTHC function returns the length of a character string, including trailing blanks. LENGTHN always returns a value that is less than or equal to the value returned by LENGTHC.
- The LENGTHN function returns the length of a character string, excluding trailing blanks, whereas the LENGTHM function returns the amount of memory in bytes that is allocated for a character string. LENGTHN always returns a value that is less than or equal to the value returned by LENGTHM.
Examples
SAS Statements                          Results
len=lengthn('ABCDEF'); put len;         6
len2=lengthn(' '); put len2;            0
See Also Functions: “LENGTH Function” on page 855 “LENGTHC Function” on page 857 “LENGTHM Function” on page 858
LEXCOMB Function
Generates all distinct combinations of the non-missing values of n variables taken k at a time in lexicographic order.
Category: Combinatorial
Restriction: The LEXCOMB function cannot be executed when you use the %SYSFUNC macro.
Syntax
LEXCOMB(count, k, variable-1, …, variable-n)
Arguments
count
   specifies an integer variable that is assigned values from 1 to the number of combinations in a loop.
k
   is a constant, variable, or expression between 1 and n, inclusive, that specifies the number of items in each combination.
variable
   specifies either all numeric variables, or all character variables that have the same length. The values of these variables are permuted.
   Requirement: Initialize these variables before you execute the LEXCOMB function.
   Tip: After executing the LEXCOMB function, the first k variables contain the values in one combination.
Details
The Basics
Use the LEXCOMB function in a loop where the first argument to LEXCOMB takes each integral value from 1 to the number of distinct combinations of the non-missing values of the variables. In each execution of LEXCOMB within this loop, k should have the same value.

Number of Combinations
When all of the variables have non-missing, unequal values, then the number of combinations is COMB(n,k). If the number of variables that have missing values is m, and all the non-missing values are unequal, then LEXCOMB produces COMB(n-m,k) combinations because the missing values are omitted from the combinations. When some of the variables have equal values, the exact number of combinations is difficult to compute, but COMB(n,k) provides an upper bound. You do not need to compute the exact number of combinations, provided that your program leaves the loop when LEXCOMB returns a value that is less than zero.

LEXCOMB Processing
On the first execution of the LEXCOMB function, the following actions occur:
- The argument types and lengths are checked for consistency.
- The m missing values are assigned to the last m arguments.
- The n-m non-missing values are assigned in ascending order to the first n-m arguments following count.
- LEXCOMB returns 1.

On subsequent executions, up to and including the last combination, the following actions occur:
- The next distinct combination of the non-missing values is generated in lexicographic order.
- If variable-1 through variable-i did not change, but variable-j did change, where j=i+1, then LEXCOMB returns j.

If you execute the LEXCOMB function after generating all the distinct combinations, then LEXCOMB returns –1.

If you execute the LEXCOMB function with the first argument out of sequence, then the results are not useful. In particular, if you initialize the variables and then immediately execute the LEXCOMB function with a first argument of j, for example, you will not get the jth combination (except when j is 1). To get the jth combination, you must execute the LEXCOMB function j times, with the first argument taking values from 1 through j in that exact order.
Comparisons The LEXCOMB function generates all distinct combinations of the non-missing values of n variables taken k at a time in lexicographic order. The ALLCOMB function generates all combinations of the values of n variables taken k at a time in a minimum change order.
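The loop that is described under Details can be sketched as follows. This is an illustrative sketch, not a definitive program: the four character values, the choice of k=2, and the variable names are assumptions. The loop runs one iteration past COMB(n,k) so that the final call shows the negative return value, and it leaves the loop as soon as LEXCOMB returns a value that is less than zero.

```sas
data _null_;
   array x[4] $3 ('ant' 'bee' 'cat' 'dog');
   n = dim(x);
   k = 2;
   ncomb = comb(n, k);          /* upper bound on the number of combinations */
   do j = 1 to ncomb + 1;
      rc = lexcomb(j, k, of x[*]);
      /* After the call, the first k variables hold one combination */
      put j 5. +3 x1-x2 +3 rc=;
      if rc < 0 then leave;     /* all distinct combinations have been generated */
   end;
run;
```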
Examples Example 1: Generating Distinct Combinations in Lexicographic Order The following example uses the LEXCOMB function to generate distinct combinations in lexicographic order. data _null_; array x[5] $3 (’ant’ ’bee’ ’cat’ ’dog’ ’ewe’); n=dim(x); k=3; ncomb=comb(n,k); do j=1 to ncomb+1; rc=lexcomb(j, k, of x[*]); put j 5. +3 x1-x3 +3 rc=; if rc)
Arguments value
is a numeric constant, variable, or expression.
Details
The MEDIAN function returns the median of the nonmissing values. If all arguments have missing values, the result is a missing value.

Note: The formula that is used in the MEDIAN function is the same as the formula that is used in PROC UNIVARIATE. For more information, see “SAS Elementary Statistics Procedures” in Base SAS Procedures Guide.
Comparisons The MEDIAN function returns the median of nonmissing values, whereas the MEAN function returns the arithmetic mean (average).
Examples
SAS Statements                          Results
x=median(2,4,1,3);                      2.5
y=median(5,8,0,3,4);                    4
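The treatment of missing values, and the contrast with the MEAN function, can be sketched with an array that is passed by using the OF syntax. The array name SCORE and its values are illustrative.

```sas
data _null_;
   /* The third element is missing and is ignored by both functions */
   array score[5] (2 4 . 1 13);
   mid = median(of score[*]);   /* median of the nonmissing values 1, 2, 4, 13 */
   avg = mean(of score[*]);     /* arithmetic mean of the same four values     */
   put mid= avg=;
run;
```

For these values, MEDIAN is expected to return 3 and MEAN is expected to return 5, which shows that the median is less sensitive to the large value 13.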
See Also Function: “MEAN Function” on page 895
MIN Function Returns the smallest value. Category: Descriptive Statistics
Syntax MIN(argument-1,argument-2)
Arguments argument
specifies a numeric constant, variable, or expression. At least two arguments are required. The argument list can consist of a variable list, which is preceded by OF.
Comparisons The MIN function returns a missing value (.) only if all arguments are missing. The MIN operator (>module-name)
Details For details on the MODULEC function, see “CALL MODULE Routine” on page 459.
See Also Functions and CALL Routines: “CALL MODULE Routine” on page 459 “MODULEN Function” on page 904
MODULEN Function
Calls an external routine and returns a numeric value.
Category: External Routines
See: “CALL MODULE Routine” on page 459
Syntax MODULEN(< cntl-string,>module-name< ,argument-1, ..., argument-n>)
Details For details about the MODULEN function, see “CALL MODULE Routine” on page 459.
See Also Functions and CALL Routines: “CALL MODULE Routine” on page 459 “MODULEC Function” on page 903
MODZ Function
Returns the remainder from the division of the first argument by the second argument, using zero fuzzing.
Category: Mathematical

Syntax
MODZ (argument-1, argument-2)

Arguments
argument-1
    is a numeric constant, variable, or expression that specifies the dividend.
argument-2
    is a non-zero numeric constant, variable, or expression that specifies the divisor.
Details The MODZ function returns the remainder from the division of argument-1 by argument-2. When the result is non-zero, the result has the same sign as the first argument. The sign of the second argument is ignored.
The computation that is performed by the MODZ function is exact if both of the following conditions are true:
- Both arguments are exact integers.
- All integers that are less than either argument have exact 8-byte floating-point representation.

To determine the largest integer for which the computation is exact, execute the following DATA step:
    data _null_;
       exactint = constant('exactint');
       put exactint=;
    run;

Operating Environment Information: You can also refer to the SAS documentation for your operating environment for information about the largest integer.

If either of the above conditions is not true, a small amount of numerical error can occur in the floating-point computation. For example, when the result in exact arithmetic would be zero, MODZ might return a very small positive value or a value slightly less than the second argument.
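The kind of numerical error described here is a property of 8-byte floating-point arithmetic in general, so it can be observed outside SAS. The Python sketch below is an illustration under two assumptions: the host uses IEEE 754 doubles (where the exact-integer limit is 2**53, which might differ from the EXACTINT value in your operating environment), and math.fmod is used only as an unfuzzed stand-in, not as the MODZ algorithm.

```python
import math

# Assumption: IEEE 754 doubles, where every integer up to 2**53 is exact.
EXACT_LIMIT = 2 ** 53
below_is_exact = float(EXACT_LIMIT - 1) != float(EXACT_LIMIT)
above_collides = float(EXACT_LIMIT + 1) == float(EXACT_LIMIT)

# Integer arguments below the limit give an exact remainder.
exact_remainder = math.fmod(10, 3)

# 0.9 and 0.3 are not exactly representable, so the unfuzzed remainder
# is a tiny positive number instead of zero.
residue = math.fmod(0.9, 0.3)

print(below_is_exact, above_collides, exact_remainder, residue)
```

The tiny residue is what fuzzing, as performed by the MOD function, is meant to replace with an exact zero.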
Comparisons
Here are some comparisons between the MODZ and MOD functions:
- The MODZ function performs no fuzzing.
- The MOD function performs extra computations, called fuzzing, to return an exact zero when the result would otherwise differ from zero because of numerical error.
- Both the MODZ and MOD functions return a missing value if the remainder cannot be computed to a precision of approximately three digits or more.
Examples
The following SAS statements produce results for MOD and MODZ.
SAS Statements                        Results
x1=mod(10,3); put x1 9.4;             1.0000
xa=modz(10,3); put xa 9.4;            1.0000
x2=mod(.3,-.1); put x2 9.4;           0.0000
xb=modz(.3,-.1); put xb 9.4;          0.1000
x3=mod(1.7,.1); put x3 9.4;           0.0000
xc=modz(1.7,.1); put xc 9.4;          0.0000
x4=mod(.9,.3); put x4 24.20;          0.00000000000000000000
xd=modz(.9,.3); put xd 24.20;         0.00000000000000005551
See Also Functions: “INT Function” on page 803 “INTZ Function” on page 835 “MOD Function” on page 900
MONTH Function
Returns the month from a SAS date value.
Category: Date and Time

Syntax
MONTH(date)

Arguments
date
    specifies a numeric constant, variable, or expression that represents a SAS date value.
Details The MONTH function returns a numeric value that represents the month from a SAS date value. Numeric values can range from 1 through 12.
Examples
SAS Statements                              Results
date='25jan94'd; m=month(date); put m;      1
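A SAS date value is a count of days from January 1, 1960. The following Python sketch models the MONTH computation on that basis; the helper names to_sas_date and month_of are hypothetical and are used only for illustration.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # SAS date value 0

def to_sas_date(d):
    """Number of days from January 1, 1960 to the calendar date d."""
    return (d - SAS_EPOCH).days

def month_of(sas_date):
    """Month (1 through 12) of a SAS date value."""
    return (SAS_EPOCH + timedelta(days=sas_date)).month

value = to_sas_date(date(1994, 1, 25))  # counterpart of '25jan94'd
print(month_of(value))
```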
See Also Functions: “DAY Function” on page 616 “YEAR Function” on page 1191
MOPEN Function
Opens a file by directory ID and member name, and returns either the file identifier or a 0.
Category: External Files
See: MOPEN Function in the documentation for your operating environment.

Syntax
MOPEN(directory-id,member-name<,open-mode<,record-length<,record-format>>>)

Arguments
directory-id
    is a numeric variable that specifies the identifier that was assigned when the directory was opened, generally by the DOPEN function.
member-name
    is a character constant, variable, or expression that specifies the member name in the directory.
open-mode
    is a character constant, variable, or expression that specifies the type of access to the file:
    A   APPEND mode allows writing new records after the current end of the file.
    I   INPUT mode allows reading only (default).
    O   OUTPUT mode defaults to the OPEN mode specified in the operating environment option in the FILENAME statement or function. If no operating environment option is specified, it allows writing new records at the beginning of the file.
    S   Sequential input mode is used for pipes and other sequential devices such as hardware ports.
    U   UPDATE mode allows both reading and writing.
    W   Sequential update mode is used for pipes and other sequential devices such as ports.
    Default: I
record-length
    is a numeric variable, constant, or expression that specifies a new logical record length for the file. To use the existing record length for the file, specify a length of 0, or do not provide a value here.
record-format
    is a character constant, variable, or expression that specifies a new record format for the file. To use the existing record format, do not specify a value here. The following values are valid:
    B   specifies that data is to be interpreted as binary data.
    D   specifies the default record format.
    E   specifies the record format that you can edit.
    F   specifies that the file contains fixed-length records.
    P   specifies that the file contains printer carriage control in operating environment-dependent record format.
    V   specifies that the file contains variable-length records.

Note: If an argument is invalid, then MOPEN returns 0. You can obtain the text of the corresponding error message from the SYSMSG function. Invalid arguments do not produce a message in the SAS log and do not set the _ERROR_ automatic variable.
Details
MOPEN returns the identifier for the file, or 0 if the file could not be opened. You can use a file-id that is returned by the MOPEN function as you would use a file-id returned by the FOPEN function.

CAUTION: Use OUTPUT mode with care. Opening an existing file for output might overwrite the current contents of the file without warning.

The member is identified by directory-id and member-name instead of by a fileref. You can also open a directory member by using FILENAME to assign a fileref to the member, followed by a call to FOPEN. However, when you use MOPEN, you do not have to use a separate fileref for each member.

If the file already exists, the output and update modes default to the operating environment option (append or replace) specified with the FILENAME statement or function. For example,
    %let rc=%sysfunc(filename(file,physical-name,,mod));
    %let did=%sysfunc(dopen(&file));
    %let fid=%sysfunc(mopen(&did,member-name,o,0,d));
    %let rc=%sysfunc(fput(&fid,This is a test.));
    %let rc=%sysfunc(fwrite(&fid));
    %let rc=%sysfunc(fclose(&fid));

If 'file' already exists, FWRITE appends the new record instead of writing it at the beginning of the file. However, if no operating environment option is specified with the FILENAME function, the output mode implies that the record be replaced. If the open fails, use SYSMSG to retrieve the message text.

Operating Environment Information: The term directory in this description refers to an aggregate grouping of files that are managed by the operating environment. Different host operating environments identify such groupings with different names, such as directory, subdirectory, folder, MACLIB, or partitioned data set. For details, see the SAS documentation for your operating environment. Opening a directory member for output or append is not possible in some operating environments.
Examples
This example assigns the fileref MYDIR to a directory. Then it opens the directory, determines the number of members, retrieves the name of the first member, and opens that member. The last three arguments to MOPEN are the defaults. Note that in a macro statement you do not enclose character strings in quotation marks.
    %let filrf=mydir;
    %let rc=%sysfunc(filename(filrf,physical-name));
    %let did=%sysfunc(dopen(&filrf));
    %let frstname=' ';
    %let memcount=%sysfunc(dnum(&did));
    %if (&memcount > 0) %then
       %do;
          %let frstname = %sysfunc(dread(&did,1));
          %let fid = %sysfunc(mopen(&did,&frstname,i,0,d));
          /* macro statements to process the member */
          %let rc=%sysfunc(fclose(&fid));
       %end;
    %else
       %put %sysfunc(sysmsg());
    %let rc=%sysfunc(dclose(&did));
See Also Functions: “DCLOSE Function” on page 616 “DNUM Function” on page 637 “DOPEN Function” on page 638 “DREAD Function” on page 642 “FCLOSE Function” on page 656 “FILENAME Function” on page 666 “FOPEN Function” on page 736 “FPUT Function” on page 745 “FWRITE Function” on page 752 “SYSMSG Function” on page 1114
MORT Function Returns amortization parameters. Category: Financial
Syntax MORT(a,p,r,n)
Arguments
a
    is numeric, and specifies the initial amount.
p
    is numeric, and specifies the periodic payment.
r
    is numeric, and specifies the periodic interest rate that is expressed as a fraction.
n
    is an integer, and specifies the number of compounding periods.
    Range: n≥0
Details
Calculating Results
The MORT function returns the missing argument in the list of four arguments from an amortization calculation with a fixed interest rate that is compounded each period. The arguments are related by the following equation:

$$p = \frac{a\,r\,(1+r)^{n}}{(1+r)^{n} - 1}$$
One missing argument must be provided. The value is then calculated from the remaining three. No adjustment is made to convert the results to round numbers.
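As an illustration of how the missing argument follows from the other three, here is a Python sketch that rearranges the amortization equation for p, a, and n. It is a sketch under stated assumptions, not the SAS implementation: None stands in for the missing argument, the function name mort is reused only for readability, r is assumed to be nonzero, and solving for r (which needs a numeric root finder) is left out.

```python
import math

def mort(a=None, p=None, r=None, n=None):
    """Return the one argument passed as None, from p = a*r*(1+r)**n / ((1+r)**n - 1)."""
    if p is None:
        growth = (1 + r) ** n
        return a * r * growth / (growth - 1)
    if a is None:
        growth = (1 + r) ** n
        return p * (growth - 1) / (r * growth)
    if n is None:
        # From (1+r)**n * (p - a*r) = p
        return math.log(p / (p - a * r)) / math.log(1 + r)
    raise NotImplementedError("solving for r requires a numeric root finder")

# Hypothetical loan: 50,000 at 10 percent per year, paid monthly for 30 years.
payment = mort(a=50000, r=0.10 / 12, n=30 * 12)
print(round(payment, 2))
```

As the text states, no rounding is applied inside the calculation; the example rounds only for display.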
Restrictions in Calculating Results
The MORT function returns an invalid argument note to the SAS log and sets _ERROR_ to 1 if one of the following argument combinations is true:
rate < –1 or n < 0 principal 0 Exception: The case freq = 0 is a flag to allow continuous discounting. c0,c1,...,cn
are numeric cash flows that represent cash outlays (payments) or cash inflows (income) occurring at times 0, 1, ...n. These cash flows are assumed to be equally spaced, beginning-of-period values. Negative values represent payments, positive values represent income, and values of 0 represent no cash flow at a given time. The c0 argument and the c1 argument are required.
Details The NETPV function returns the net present value at time 0 for the set of cash payments c0,c1, ...,cn, with a rate r over a specified base period of time. The argument freq>0 describes the number of payments that occur over the specified base period of time. The net present value is given by
$$\mathrm{NETPV}(r, \mathit{freq}, c_0, c_1, \ldots, c_n) = \sum_{i=0}^{n} c_i\, x^{i}$$

where

$$x = \begin{cases} \dfrac{1}{(1+r)^{1/\mathit{freq}}} & \mathit{freq} > 0 \\[1ex] e^{-r} & \mathit{freq} = 0 \end{cases}$$
Missing values in the payments are treated as 0 values. When freq>0, the rate r is the effective rate over the specified base period. To compute with a quarterly rate (the base period is three months) of 4 percent with monthly cash payments, set freq to 3 and set r to .04. If freq is 0, continuous discounting is assumed. The base period is the time interval between two consecutive payments, and the rate r is a nominal rate. To compute with a nominal annual interest rate of 11 percent discounted continuously with monthly payments, set freq to 0 and set r to .11/12.
Examples
For an initial investment of $500 that returns payments of $200, $300, and $400 at two-year intervals over the succeeding 6 years, with an annual discount rate of 10 percent, the net present value of the investment can be expressed as follows:
    value=netpv(.10,.5,-500,200,300,400);
The value returned is 95.98.
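The result can be checked by evaluating the net present value formula directly. The Python sketch below follows the definition given in the Details section; it is an illustration rather than the SAS implementation, and None stands in for a missing cash flow.

```python
import math

def netpv(r, freq, *cash_flows):
    """Sum of c_i * x**i, with x chosen by freq as in the Details section."""
    if freq > 0:
        x = 1.0 / (1.0 + r) ** (1.0 / freq)
    elif freq == 0:
        x = math.exp(-r)  # continuous discounting
    else:
        raise ValueError("freq must be greater than or equal to 0")
    # Missing cash flows are treated as 0.
    return sum((0.0 if c is None else c) * x ** i
               for i, c in enumerate(cash_flows))

value = netpv(0.10, 0.5, -500, 200, 300, 400)
print(round(value, 2))
```

With freq set to .5, each payment interval spans two base periods, so the per-payment discount factor is 1/(1.10)**2.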
NLITERAL Function
Converts a character string that you specify to a SAS name literal.
Category: Character
Restriction: “I18N Level 2” on page 306
Syntax NLITERAL(string)
Arguments string
specifies a character constant, variable, or expression that is to be converted to a SAS name literal. Tip: Enclose a literal string of characters in quotation marks. Restriction: If the string is a valid SAS variable name, it is not changed.
Details Length of Returned Variable In a DATA step, if the NLITERAL function returns a value to a variable that has not previously been assigned a length, then the variable is given a length of 200 bytes.
The Basics
String is converted to a name literal unless it qualifies under the default rules for a SAS variable name. These default rules are in effect when the SAS system option VALIDVARNAME=V7 is specified:
- It begins with an English letter or an underscore.
- All subsequent characters are English letters, underscores, or digits.
- The length is 32 or fewer alphanumeric characters.

String qualifies as a SAS variable name when all of these rules are true. The NLITERAL function encloses the value of string in single or double quotation marks, based on the contents of string.

Value in string                                               Result
an ampersand (&)                                              enclosed in single quotation marks
a percent sign (%)                                            enclosed in single quotation marks
more double quotation marks than single quotation marks      enclosed in single quotation marks
none of the above                                             enclosed in double quotation marks
If insufficient space is available for the resulting n-literal, NLITERAL returns a blank string, prints an error message, and sets _ERROR_ to 1.
Examples
This example demonstrates multiple uses of NLITERAL.
    data test;
       input string $32.;
       length result $ 67;
       result = nliteral(string);
       datalines;
    abc_123
    This and That
    cats & dogs
    Company's profits (%)
    "Double Quotes"
    'Single Quotes'
    ;

    proc print;
       title 'Strings Converted to N-Literals or Returned Unchanged';
    run;
Output 4.61 Converting Strings to Name Literals with NLITERAL

    Strings Converted to N-Literals or Returned Unchanged

    Obs    string                   result
     1     abc_123                  abc_123
     2     This and That            "This and That"N
     3     cats & dogs              'cats & dogs'N
     4     Company's profits (%)    'Company''s profits (%)'N
     5     "Double Quotes"          '"Double Quotes"'N
     6     'Single Quotes'          "'Single Quotes'"N
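The quoting rules and the output above can be modelled in a few lines. The following Python sketch is an illustration only: it assumes that the input has no trailing blanks, and the doubling of an embedded quotation mark that matches the enclosing mark is inferred from observation 4 in the output rather than stated in the rules.

```python
import re

# Default VALIDVARNAME=V7 rules: letter or underscore first, then letters,
# digits, or underscores, with a total length of 32 or fewer.
_V7_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}\Z")

def nliteral(string):
    """Return string unchanged if it is a valid V7 name, else a name literal."""
    if _V7_NAME.match(string):
        return string
    use_single = ("&" in string or "%" in string
                  or string.count('"') > string.count("'"))
    quote = "'" if use_single else '"'
    # Assumption: embedded copies of the enclosing quote are doubled.
    return quote + string.replace(quote, quote * 2) + quote + "N"

for s in ["abc_123", "This and That", "cats & dogs",
          "Company's profits (%)", '"Double Quotes"', "'Single Quotes'"]:
    print(s, "->", nliteral(s))
```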
See Also Functions: “COMPARE Function” on page 571 “DEQUOTE Function” on page 624 “NVALID Function” on page 944 System Option: “VALIDVARNAME= System Option” on page 1989 “Rules for Words and Names in the SAS Language” in SAS Language Reference: Concepts
NMISS Function Returns the number of missing numeric values. Category: Descriptive Statistics
Syntax NMISS(argument-1)
Arguments argument
specifies a numeric constant, variable, or expression. At least one argument is required. The argument list can consist of a variable list, which is preceded by OF.
Comparisons The NMISS function returns the number of missing values, whereas the N function returns the number of nonmissing values. NMISS requires numeric values, whereas CMISS works with both numeric and character values. NMISS works with multiple numeric values, whereas MISSING works with only one value that can be either numeric or character.
Examples
SAS Statements               Results
x1=nmiss(1,0,.,2,5,.);       2
x2=nmiss(1,0);               0
x3=nmiss(of x1-x2);          0
NORMAL Function
Returns a random variate from a normal, or Gaussian, distribution.
Category: Random Number
Alias: RANNOR
See: “RANNOR Function” on page 1049
NOTALNUM Function
Searches a character string for a non-alphanumeric character, and returns the first position at which the character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTALNUM(string <,start>)

Arguments
string
    specifies a character constant, variable, or expression to search.
start
    is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details
The results of the NOTALNUM function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in SAS National Language Support (NLS): Reference Guide.

The NOTALNUM function searches a string for the first occurrence of any character that is not a digit or an uppercase or lowercase letter. If such a character is found, NOTALNUM returns the position in the string of that character. If no such character is found, NOTALNUM returns a value of 0.

If you use only one argument, NOTALNUM begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
- If the value of start is positive, the search proceeds to the right.
- If the value of start is negative, the search proceeds to the left.
- If the value of start is less than the negative length of the string, the search begins at the end of the string.

NOTALNUM returns a value of zero when one of the following is true:
- The character that you are searching for is not found.
- The value of start is greater than the length of the string.
- The value of start = 0.
Comparisons The NOTALNUM function searches a character string for a non-alphanumeric character. The ANYALNUM function searches a character string for an alphanumeric character.
Examples
The following example uses the NOTALNUM function to search a string from left to right for non-alphanumeric characters.
    data _null_;
       string='Next = Last + 1;';
       j=0;
       do until(j=0);
          j=notalnum(string,j+1);
          if j=0 then put +3 "That's all";
          else do;
             c=substr(string,j,1);
             put +3 j= c=;
          end;
       end;
    run;

The following lines are written to the SAS log:
    j=5 c=
    j=6 c==
    j=7 c=
    j=12 c=
    j=13 c=+
    j=14 c=
    j=16 c=;
    That's all
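The start and direction rules can be modelled compactly. The Python sketch below is an illustration only, written under the assumption that alphanumeric means ASCII letters and digits; actual SAS results depend on the translation table, encoding, and locale that are in effect.

```python
def notalnum(string, start=1):
    """1-based position of the first non-alphanumeric character, or 0."""
    n = len(string)
    if start == 0 or start > n:
        return 0
    if start > 0:
        positions = range(start, n + 1)      # search to the right
    else:
        begin = min(-start, n)               # start < -n begins at the end
        positions = range(begin, 0, -1)      # search to the left
    for pos in positions:
        ch = string[pos - 1]
        if not (ch.isascii() and ch.isalnum()):
            return pos
    return 0

# Reproduce the loop from the example above.
string = "Next = Last + 1;"
found, j = [], 0
while True:
    j = notalnum(string, j + 1)
    if j == 0:
        break
    found.append(j)
print(found)
```

The same start and direction handling applies to the other NOTxxx functions in this chapter; only the character test changes.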
See Also Function: “ANYALNUM Function” on page 367
NOTALPHA Function
Searches a character string for a nonalphabetic character, and returns the first position at which the character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTALPHA(string <,start>)

Arguments
string
    is the character constant, variable, or expression to search.
start
    is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details The results of the NOTALPHA function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in the SAS National Language Support (NLS): Reference Guide. The NOTALPHA function searches a string for the first occurrence of any character that is not an uppercase or lowercase letter. If such a character is found, NOTALPHA returns the position in the string of that character. If no such character is found, NOTALPHA returns a value of 0. If you use only one argument, NOTALPHA begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
- If the value of start is positive, the search proceeds to the right.
- If the value of start is negative, the search proceeds to the left.
- If the value of start is less than the negative length of the string, the search begins at the end of the string.

NOTALPHA returns a value of zero when one of the following is true:
- The character that you are searching for is not found.
- The value of start is greater than the length of the string.
- The value of start = 0.
Comparisons The NOTALPHA function searches a character string for a nonalphabetic character. The ANYALPHA function searches a character string for an alphabetic character.
Examples
Example 1: Searching a String for Nonalphabetic Characters
The following example uses the NOTALPHA function to search a string from left to right for nonalphabetic characters.
    data _null_;
       string='Next = _n_ + 12E3;';
       j=0;
       do until(j=0);
          j=notalpha(string,j+1);
          if j=0 then put +3 "That's all";
          else do;
             c=substr(string,j,1);
             put +3 j= c=;
          end;
       end;
    run;

The following lines are written to the SAS log:
    j=5 c=
    j=6 c==
    j=7 c=
    j=8 c=_
    j=10 c=_
    j=11 c=
    j=12 c=+
    j=13 c=
    j=14 c=1
    j=15 c=2
    j=17 c=3
    j=18 c=;
    That's all
Example 2: Identifying Control Characters by Using the NOTALPHA Function
You can execute the following program to show the control characters that are identified by the NOTALPHA function.
    data test;
       do dec=0 to 255;
          byte=byte(dec);
          hex=put(dec,hex2.);
          notalpha=notalpha(byte);
          output;
       end;

    proc print data=test;
    run;
See Also Function: “ANYALPHA Function” on page 369
NOTCNTRL Function
Searches a character string for a character that is not a control character, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTCNTRL(string <,start>)

Arguments
string
    is the character constant, variable, or expression to search.
start
    is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details The results of the NOTCNTRL function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in the SAS National Language Support (NLS): Reference Guide.
The NOTCNTRL function searches a string for the first occurrence of a character that is not a control character. If such a character is found, NOTCNTRL returns the position in the string of that character. If no such character is found, NOTCNTRL returns a value of 0. If you use only one argument, NOTCNTRL begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
- If the value of start is positive, the search proceeds to the right.
- If the value of start is negative, the search proceeds to the left.
- If the value of start is less than the negative length of the string, the search begins at the end of the string.

NOTCNTRL returns a value of zero when one of the following is true:
- The character that you are searching for is not found.
- The value of start is greater than the length of the string.
- The value of start = 0.
Comparisons The NOTCNTRL function searches a character string for a character that is not a control character. The ANYCNTRL function searches a character string for a control character.
Examples
You can execute the following program to show the control characters that are identified by the NOTCNTRL function.
    data test;
       do dec=0 to 255;
          byte=byte(dec);
          hex=put(dec,hex2.);
          notcntrl=notcntrl(byte);
          output;
       end;

    proc print data=test;
    run;
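To give an idea of what such a table looks like, here is a Python sketch that builds the same three columns for the codes 0 through 127. It is an illustration under a stated assumption: control characters are taken to be the ASCII codes 0 through 31 and 127. The classification of codes 128 through 255 depends on the encoding in effect, so those codes are omitted here.

```python
def notcntrl(text):
    """1-based position of the first non-control character, or 0 (forward search only)."""
    for pos, ch in enumerate(text, start=1):
        code = ord(ch)
        is_control = code < 32 or code == 127  # assumption: ASCII control codes
        if not is_control:
            return pos
    return 0

# One row per code: decimal value, two-digit hexadecimal value, NOTCNTRL result.
rows = [(dec, format(dec, "02X"), notcntrl(chr(dec))) for dec in range(128)]
for dec, hexval, result in rows[28:36]:
    print(dec, hexval, result)
```

A result of 0 in the last column marks a control character; a result of 1 marks any other character.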
See Also Function: “ANYCNTRL Function” on page 371
NOTDIGIT Function
Searches a character string for any character that is not a digit, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTDIGIT(string <,start>)

Arguments
string
    is the character constant, variable, or expression to search.
start
    is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details The results of the NOTDIGIT function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in the SAS National Language Support (NLS): Reference Guide. The NOTDIGIT function searches a string for the first occurrence of any character that is not a digit. If such a character is found, NOTDIGIT returns the position in the string of that character. If no such character is found, NOTDIGIT returns a value of 0. If you use only one argument, NOTDIGIT begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
- If the value of start is positive, the search proceeds to the right.
- If the value of start is negative, the search proceeds to the left.
- If the value of start is less than the negative length of the string, the search begins at the end of the string.

NOTDIGIT returns a value of zero when one of the following is true:
- The character that you are searching for is not found.
- The value of start is greater than the length of the string.
- The value of start = 0.
Comparisons The NOTDIGIT function searches a character string for any character that is not a digit. The ANYDIGIT function searches a character string for a digit.
Examples
The following example uses the NOTDIGIT function to search for a character that is not a digit.
    data _null_;
       string='Next = _n_ + 12E3;';
       j=0;
       do until(j=0);
          j=notdigit(string,j+1);
          if j=0 then put +3 "That's all";
          else do;
             c=substr(string,j,1);
             put +3 j= c=;
          end;
       end;
    run;

The following lines are written to the SAS log:
    j=1 c=N
    j=2 c=e
    j=3 c=x
    j=4 c=t
    j=5 c=
    j=6 c==
    j=7 c=
    j=8 c=_
    j=9 c=n
    j=10 c=_
    j=11 c=
    j=12 c=+
    j=13 c=
    j=16 c=E
    j=18 c=;
    That's all
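A common use of a search like this is validation: a result of 0 means that every character is a digit. The Python sketch below illustrates that idea and reproduces the positions from the log above. It models only forward searches (start of 1 or more, or 0), and it assumes that digits are the ASCII characters 0 through 9.

```python
def notdigit(string, start=1):
    """1-based position of the first non-digit at or after start, or 0."""
    if start == 0:
        return 0
    if start < 0:
        raise NotImplementedError("leftward search is not modelled in this sketch")
    for pos in range(start, len(string) + 1):
        if string[pos - 1] not in "0123456789":
            return pos
    return 0

def all_digits(string):
    """True when the string is nonempty and contains only digits."""
    return bool(string) and notdigit(string) == 0

string = "Next = _n_ + 12E3;"
positions, j = [], 0
while True:
    j = notdigit(string, j + 1)
    if j == 0:
        break
    positions.append(j)
print(positions)
print(all_digits("12345"), all_digits("12E3"))
```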
See Also Function: “ANYDIGIT Function” on page 373
NOTE Function Returns an observation ID for the current observation of a SAS data set. Category: SAS File I/O
Syntax NOTE(data-set-id)
Arguments
data-set-id
    is a numeric variable that specifies the data set identifier that the OPEN function returns.
Details You can use the observation ID value to return to the current observation by using POINT. Observations can be marked by using NOTE and then returned to later by using POINT. Each observation ID is a unique numeric value. To free the memory that is associated with an observation ID, use DROPNOTE.
Examples
This example calls CUROBS to display the observation number, calls NOTE to mark the observation, and calls POINT to point to the observation that corresponds to NOTEID.
    %let dsid=%sysfunc(open(sasuser.fitness,i));
       /* Go to observation 10 in data set */
    %let rc=%sysfunc(fetchobs(&dsid,10));
    %if %sysfunc(abs(&rc)) %then
       %put FETCHOBS FAILED;
    %else
       %do;
             /* Display observation number in the Log */
          %let cur=%sysfunc(curobs(&dsid));
          %put CUROBS=&cur;
             /* Mark observation 10 using NOTE */
          %let noteid=%sysfunc(note(&dsid));
             /* Rewind pointer to beginning of data set using REWIND */
          %let rc=%sysfunc(rewind(&dsid));
             /* FETCH first observation into DDV */
          %let rc=%sysfunc(fetch(&dsid));
             /* Display first observation number */
          %let cur=%sysfunc(curobs(&dsid));
          %put CUROBS=&cur;
             /* POINT to observation 10 marked earlier by NOTE */
          %let rc=%sysfunc(point(&dsid,&noteid));
             /* FETCH observation into DDV */
          %let rc=%sysfunc(fetch(&dsid));
             /* Display observation number 10 marked by NOTE */
          %let cur=%sysfunc(curobs(&dsid));
          %put CUROBS=&cur;
       %end;
    %if (&dsid > 0) %then
       %let rc=%sysfunc(close(&dsid));
The output produced by this program is:
    CUROBS=10
    CUROBS=1
    CUROBS=10
See Also Functions: “DROPNOTE Function” on page 643 “OPEN Function” on page 948 “POINT Function” on page 975 “REWIND Function” on page 1058
NOTFIRST Function
Searches a character string for an invalid first character in a SAS variable name under VALIDVARNAME=V7, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTFIRST(string <,start>)

Arguments
string
    is the character constant, variable, or expression to search.
start
    is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details
The NOTFIRST function does not depend on the TRANTAB, ENCODING, or LOCALE options.

The NOTFIRST function searches a string for the first occurrence of any character that is not valid as the first character in a SAS variable name under VALIDVARNAME=V7. These characters are any except the underscore (_) and uppercase or lowercase English letters. If such a character is found, NOTFIRST returns the position in the string of that character. If no such character is found, NOTFIRST returns a value of 0.

If you use only one argument, NOTFIRST begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
- If the value of start is positive, the search proceeds to the right.
- If the value of start is negative, the search proceeds to the left.
- If the value of start is less than the negative length of the string, the search begins at the end of the string.

NOTFIRST returns a value of zero when one of the following is true:
- The character that you are searching for is not found.
- The value of start is greater than the length of the string.
- The value of start = 0.
Comparisons The NOTFIRST function searches a string for the first occurrence of any character that is not valid as the first character in a SAS variable name under VALIDVARNAME=V7. The ANYFIRST function searches a string for the first occurrence of any character that is valid as the first character in a SAS variable name under VALIDVARNAME=V7.
Examples
The following example uses the NOTFIRST function to search a string for any character that is not valid as the first character in a SAS variable name under VALIDVARNAME=V7.
    data _null_;
       string='Next = _n_ + 12E3;';
       j=0;
       do until(j=0);
          j=notfirst(string,j+1);
          if j=0 then put +3 "That's all";
          else do;
             c=substr(string,j,1);
             put +3 j= c=;
          end;
       end;
    run;

The following lines are written to the SAS log:
    j=5 c=
    j=6 c==
    j=7 c=
    j=11 c=
    j=12 c=+
    j=13 c=
    j=14 c=1
    j=15 c=2
    j=17 c=3
    j=18 c=;
    That's all
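Because NOTFIRST does not depend on the translation table, encoding, or locale, its character test is easy to model. The Python sketch below is an illustration that treats the underscore and the English letters as the only valid first characters, as described above. It models only forward searches.

```python
import string as _string

VALID_FIRST = set(_string.ascii_letters + "_")

def notfirst(text, start=1):
    """1-based position of the first character that cannot begin a V7 name, or 0."""
    if start == 0:
        return 0
    if start < 0:
        raise NotImplementedError("leftward search is not modelled in this sketch")
    for pos in range(start, len(text) + 1):
        if text[pos - 1] not in VALID_FIRST:
            return pos
    return 0

text = "Next = _n_ + 12E3;"
positions, j = [], 0
while True:
    j = notfirst(text, j + 1)
    if j == 0:
        break
    positions.append(j)
print(positions)
```

Digits appear in the result (positions 14, 15, and 17) because a digit is valid inside a variable name but not as its first character.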
See Also Function: “ANYFIRST Function” on page 374
NOTGRAPH Function
Searches a character string for a non-graphical character, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTGRAPH(string <,start>)

Arguments
string
    is the character constant, variable, or expression to search.
start
    is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details The results of the NOTGRAPH function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in SAS National Language Support (NLS): Reference Guide. The NOTGRAPH function searches a string for the first occurrence of a non-graphical character. A graphical character is defined as any printable character other than white space. If such a character is found, NOTGRAPH returns the position in the string of that character. If no such character is found, NOTGRAPH returns a value of 0. If you use only one argument, NOTGRAPH begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
3 If the value of start is positive, the search proceeds to the right. 3 If the value of start is negative, the search proceeds to the left. 3 If the value of start is less than the negative length of the string, the search begins at the end of the string.
NOTGRAPH returns a value of zero when one of the following is true:
• The character that you are searching for is not found.
• The value of start is greater than the length of the string.
• The value of start = 0.

Comparisons
The NOTGRAPH function searches a character string for a non-graphical character. The ANYGRAPH function searches a character string for a graphical character.
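The effect of the start argument can be illustrated with a short sketch. The string 'a b c' and the variable names are chosen here only for illustration, and the values in the comments are what the rules in Details imply rather than a captured log.

```sas
data _null_;
   s='a b c';                    /* blanks at positions 2 and 4 */
   from_left =notgraph(s);       /* starts at position 1, moves right: 2 */
   from_three=notgraph(s,3);     /* starts at position 3, moves right: 4 */
   from_right=notgraph(s,-5);    /* starts at position 5, moves left: 4 */
   past_end  =notgraph(s,6);     /* start exceeds the length of s: 0 */
   zero_start=notgraph(s,0);     /* start = 0: 0 */
   put from_left= from_three= from_right= past_end= zero_start=;
run;
```

The same start conventions are described for the other NOT functions in this group, so the sketch should carry over by substituting the function name.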
Examples
Example 1: Searching a String for Non-Graphical Characters
The following example uses the NOTGRAPH function to search a string for a non-graphical character.

   data _null_;
      string='Next = _n_ + 12E3;';
      j=0;
      do until(j=0);
         j=notgraph(string,j+1);
         if j=0 then put +3 "That's all";
         else do;
            c=substr(string,j,1);
            put +3 j= c=;
         end;
      end;
   run;

The following lines are written to the SAS log:

   j=5 c=
   j=7 c=
   j=11 c=
   j=13 c=
   That's all
Example 2: Identifying Control Characters by Using the NOTGRAPH Function
You can execute the following program to show the control characters that are identified by the NOTGRAPH function.

   data test;
      do dec=0 to 255;
         byte=byte(dec);
         hex=put(dec,hex2.);
         notgraph=notgraph(byte);
         output;
      end;
   run;
   proc print data=test;
   run;
See Also Function: “ANYGRAPH Function” on page 376
NOTLOWER Function
Searches a character string for a character that is not a lowercase letter, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTLOWER(string <,start>)

Arguments
string
   is the character constant, variable, or expression to search.
start
   is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details
The results of the NOTLOWER function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in SAS National Language Support (NLS): Reference Guide.
The NOTLOWER function searches a string for the first occurrence of any character that is not a lowercase letter. If such a character is found, NOTLOWER returns the position in the string of that character. If no such character is found, NOTLOWER returns a value of 0.
If you use only one argument, NOTLOWER begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
• If the value of start is positive, the search proceeds to the right.
• If the value of start is negative, the search proceeds to the left.
• If the value of start is less than the negative length of the string, the search begins at the end of the string.
NOTLOWER returns a value of zero when one of the following is true:
• The character that you are searching for is not found.
• The value of start is greater than the length of the string.
• The value of start = 0.

Comparisons
The NOTLOWER function searches a character string for a character that is not a lowercase letter. The ANYLOWER function searches a character string for a lowercase letter.
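One practical consequence of this definition is that the blanks that pad a character variable are themselves not lowercase letters. The following sketch, with a hypothetical variable named code, shows one way to test whether a value is entirely lowercase by trimming it first. The values in the comments are what the definition implies and assume a session in which a, b, and c are classified as lowercase letters.

```sas
data _null_;
   length code $8;
   code='abc';                     /* stored as 'abc' followed by five blanks */
   padded =notlower(code);         /* the padding blank at position 4 is found: 4 */
   trimmed=notlower(trim(code));   /* no offending character remains: 0 */
   put padded= trimmed=;
run;
```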
Examples
The following example uses the NOTLOWER function to search a string for any character that is not a lowercase letter.

   data _null_;
      string='Next = _n_ + 12E3;';
      j=0;
      do until(j=0);
         j=notlower(string,j+1);
         if j=0 then put +3 "That's all";
         else do;
            c=substr(string,j,1);
            put +3 j= c=;
         end;
      end;
   run;

The following lines are written to the SAS log:

   j=1 c=N
   j=5 c=
   j=6 c==
   j=7 c=
   j=8 c=_
   j=10 c=_
   j=11 c=
   j=12 c=+
   j=13 c=
   j=14 c=1
   j=15 c=2
   j=16 c=E
   j=17 c=3
   j=18 c=;
   That's all
See Also Function: “ANYLOWER Function” on page 378
NOTNAME Function
Searches a character string for an invalid character in a SAS variable name under VALIDVARNAME=V7, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTNAME(string <,start>)
Arguments
string
   is the character constant, variable, or expression to search.
start
   is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.

Details
The NOTNAME function does not depend on the TRANTAB, ENCODING, or LOCALE options.
The NOTNAME function searches a string for the first occurrence of any character that is not valid in a SAS variable name under VALIDVARNAME=V7. These characters are any except underscore (_), digits, and uppercase or lowercase English letters. If such a character is found, NOTNAME returns the position in the string of that character. If no such character is found, NOTNAME returns a value of 0.
If you use only one argument, NOTNAME begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
• If the value of start is positive, the search proceeds to the right.
• If the value of start is negative, the search proceeds to the left.
• If the value of start is less than the negative length of the string, the search begins at the end of the string.
NOTNAME returns a value of zero when one of the following is true:
• The character that you are searching for is not found.
• The value of start is greater than the length of the string.
• The value of start = 0.

Comparisons
The NOTNAME function searches a string for the first occurrence of any character that is not valid in a SAS variable name under VALIDVARNAME=V7. The ANYNAME function searches a string for the first occurrence of any character that is valid in a SAS variable name under VALIDVARNAME=V7.
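As a rough illustration of how the return value might be used, the following sketch screens a few hypothetical candidate names. The candidate values and variable names are invented for this example. Note that NOTNAME checks individual characters only, so a value that begins with a digit still returns 0 here; the NOTFIRST or NVALID function would be needed to catch that case.

```sas
data _null_;
   length candidate $32;
   do candidate='total_2009','net sales','rate%';
      /* TRIM removes the padding blanks, which would otherwise be reported */
      bad=notname(trim(candidate));
      if bad=0 then put candidate= 'contains only name characters';
      else put candidate= 'has an invalid character at position ' bad;
   end;
run;
```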
Examples
The following example uses the NOTNAME function to search a string for any character that is not valid in a SAS variable name under VALIDVARNAME=V7.

   data _null_;
      string='Next = _n_ + 12E3;';
      j=0;
      do until(j=0);
         j=notname(string,j+1);
         if j=0 then put +3 "That's all";
         else do;
            c=substr(string,j,1);
            put +3 j= c=;
         end;
      end;
   run;

The following lines are written to the SAS log:

   j=5 c=
   j=6 c==
   j=7 c=
   j=11 c=
   j=12 c=+
   j=13 c=
   j=18 c=;
   That's all
See Also Function: “ANYNAME Function” on page 379
NOTPRINT Function
Searches a character string for a nonprintable character, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTPRINT(string <,start>)

Arguments
string
   is the character constant, variable, or expression to search.
start
   is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details
The results of the NOTPRINT function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in SAS National Language Support (NLS): Reference Guide.
The NOTPRINT function searches a string for the first occurrence of a nonprintable character. If such a character is found, NOTPRINT returns the position in the string of that character. If no such character is found, NOTPRINT returns a value of 0.
If you use only one argument, NOTPRINT begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
• If the value of start is positive, the search proceeds to the right.
• If the value of start is negative, the search proceeds to the left.
• If the value of start is less than the negative length of the string, the search begins at the end of the string.
NOTPRINT returns a value of zero when one of the following is true:
• The character that you are searching for is not found.
• The value of start is greater than the length of the string.
• The value of start = 0.

Comparisons
The NOTPRINT function searches a character string for a nonprintable character. The ANYPRINT function searches a character string for a printable character.
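A common use is to locate an embedded control character. The following sketch assumes an ASCII-based session encoding, in which the hexadecimal constant '09'x is a horizontal tab; under other encodings or translation tables the result can differ. The variable names are illustrative only.

```sas
data _null_;
   /* Assumption: '09'x is a horizontal tab (ASCII-based encoding) */
   line='abc' || '09'x || 'def';
   pos=notprint(line);
   if pos>0 then put 'First nonprintable character at position ' pos;
   else put 'No nonprintable characters found';
run;
```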
Examples
You can execute the following program to show the control characters that are identified by the NOTPRINT function.

   data test;
      do dec=0 to 255;
         byte=byte(dec);
         hex=put(dec,hex2.);
         notprint=notprint(byte);
         output;
      end;
   run;
   proc print data=test;
   run;
See Also Function: “ANYPRINT Function” on page 381
NOTPUNCT Function
Searches a character string for a character that is not a punctuation character, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306
Syntax
NOTPUNCT(string <,start>)

Arguments
string
   is the character constant, variable, or expression to search.
start
   is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details
The results of the NOTPUNCT function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in SAS National Language Support (NLS): Reference Guide.
The NOTPUNCT function searches a string for the first occurrence of a character that is not a punctuation character. If such a character is found, NOTPUNCT returns the position in the string of that character. If no such character is found, NOTPUNCT returns a value of 0.
If you use only one argument, NOTPUNCT begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
• If the value of start is positive, the search proceeds to the right.
• If the value of start is negative, the search proceeds to the left.
• If the value of start is less than the negative length of the string, the search begins at the end of the string.
NOTPUNCT returns a value of zero when one of the following is true:
• The character that you are searching for is not found.
• The value of start is greater than the length of the string.
• The value of start = 0.

Comparisons
The NOTPUNCT function searches a character string for a character that is not a punctuation character. The ANYPUNCT function searches a character string for a punctuation character.
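Because the function returns a position, it can be combined with SUBSTR to skip leading punctuation. The following is a minimal sketch; the sample text and variable names are invented for illustration, and the values in the comments assume that the asterisk is classified as punctuation in the session.

```sas
data _null_;
   text='***Note';
   first=notpunct(text);                     /* first character that is not punctuation: 4 */
   if first>0 then rest=substr(text,first);  /* Note */
   put first= rest=;
run;
```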
Examples
Example 1: Searching a String for Characters That Are Not Punctuation Characters
The following example uses the NOTPUNCT function to search a string for characters that are not punctuation characters.

   data _null_;
      string='Next = _n_ + 12E3;';
      j=0;
      do until(j=0);
         j=notpunct(string,j+1);
         if j=0 then put +3 "That's all";
         else do;
            c=substr(string,j,1);
            put +3 j= c=;
         end;
      end;
   run;

The following lines are written to the SAS log:

   j=1 c=N
   j=2 c=e
   j=3 c=x
   j=4 c=t
   j=5 c=
   j=7 c=
   j=9 c=n
   j=11 c=
   j=13 c=
   j=14 c=1
   j=15 c=2
   j=16 c=E
   j=17 c=3
   That's all
Example 2: Identifying Control Characters by Using the NOTPUNCT Function
You can execute the following program to show the control characters that are identified by the NOTPUNCT function.

   data test;
      do dec=0 to 255;
         byte=byte(dec);
         hex=put(dec,hex2.);
         notpunct=notpunct(byte);
         output;
      end;
   run;
   proc print data=test;
   run;
See Also Function: “ANYPUNCT Function” on page 383
NOTSPACE Function
Searches a character string for a character that is not a white-space character (blank, horizontal and vertical tab, carriage return, line feed, and form feed), and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTSPACE(string <,start>)

Arguments
string
   is the character constant, variable, or expression to search.
start
   is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details
The results of the NOTSPACE function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in SAS National Language Support (NLS): Reference Guide.
The NOTSPACE function searches a string for the first occurrence of a character that is not a blank, horizontal tab, vertical tab, carriage return, line feed, or form feed. If such a character is found, NOTSPACE returns the position in the string of that character. If no such character is found, NOTSPACE returns a value of 0.
If you use only one argument, NOTSPACE begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
• If the value of start is positive, the search proceeds to the right.
• If the value of start is negative, the search proceeds to the left.
• If the value of start is less than the negative length of the string, the search begins at the end of the string.
NOTSPACE returns a value of zero when one of the following is true:
• The character that you are searching for is not found.
• The value of start is greater than the length of the string.
• The value of start = 0.

Comparisons
The NOTSPACE function searches a character string for the first occurrence of a character that is not a blank, horizontal tab, vertical tab, carriage return, line feed, or form feed. The ANYSPACE function searches a character string for the first occurrence of a character that is a blank, horizontal tab, vertical tab, carriage return, line feed, or form feed.
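Searching from both ends gives the positions of the first and last characters that are not white space. The following sketch uses an invented eight-character string and illustrative variable names. LENGTHC is used for the starting position because it counts trailing blanks, whereas LENGTH does not. The values in the comments follow from the rules in Details.

```sas
data _null_;
   s='   abc  ';                        /* three blanks, abc, two blanks */
   first_pos=notspace(s);               /* search from the left: 4 */
   last_pos =notspace(s,-lengthc(s));   /* start at position 8, search to the left: 6 */
   put first_pos= last_pos=;
run;
```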
Examples
Example 1: Searching a String for a Character That Is Not a White-Space Character
The following example uses the NOTSPACE function to search a string for a character that is not a white-space character.

   data _null_;
      string='Next = _n_ + 12E3;';
      j=0;
      do until(j=0);
         j=notspace(string,j+1);
         if j=0 then put +3 "That's all";
         else do;
            c=substr(string,j,1);
            put +3 j= c=;
         end;
      end;
   run;

The following lines are written to the SAS log:

   j=1 c=N
   j=2 c=e
   j=3 c=x
   j=4 c=t
   j=6 c==
   j=8 c=_
   j=9 c=n
   j=10 c=_
   j=12 c=+
   j=14 c=1
   j=15 c=2
   j=16 c=E
   j=17 c=3
   j=18 c=;
   That's all
Example 2: Identifying Control Characters by Using the NOTSPACE Function
You can execute the following program to show the control characters that are identified by the NOTSPACE function.

   data test;
      do dec=0 to 255;
         byte=byte(dec);
         hex=put(dec,hex2.);
         notspace=notspace(byte);
         output;
      end;
   run;
   proc print data=test;
   run;
See Also Function: “ANYSPACE Function” on page 385
NOTUPPER Function
Searches a character string for a character that is not an uppercase letter, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306

Syntax
NOTUPPER(string <,start>)

Arguments
string
   is the character constant, variable, or expression to search.
start
   is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details
The results of the NOTUPPER function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the “ENCODING System Option” and the “LOCALE System Option” in SAS National Language Support (NLS): Reference Guide.
The NOTUPPER function searches a string for the first occurrence of a character that is not an uppercase letter. If such a character is found, NOTUPPER returns the position in the string of that character. If no such character is found, NOTUPPER returns a value of 0.
If you use only one argument, NOTUPPER begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
• If the value of start is positive, the search proceeds to the right.
• If the value of start is negative, the search proceeds to the left.
• If the value of start is less than the negative length of the string, the search begins at the end of the string.
NOTUPPER returns a value of zero when one of the following is true:
• The character that you are searching for is not found.
• The value of start is greater than the length of the string.
• The value of start = 0.

Comparisons
The NOTUPPER function searches a character string for a character that is not an uppercase letter. The ANYUPPER function searches a character string for an uppercase letter.
Examples
The following example uses the NOTUPPER function to search a string for any character that is not an uppercase letter.

   data _null_;
      string='Next = _n_ + 12E3;';
      j=0;
      do until(j=0);
         j=notupper(string,j+1);
         if j=0 then put +3 "That's all";
         else do;
            c=substr(string,j,1);
            put +3 j= c=;
         end;
      end;
   run;

The following lines are written to the SAS log:

   j=2 c=e
   j=3 c=x
   j=4 c=t
   j=5 c=
   j=6 c==
   j=7 c=
   j=8 c=_
   j=9 c=n
   j=10 c=_
   j=11 c=
   j=12 c=+
   j=13 c=
   j=14 c=1
   j=15 c=2
   j=17 c=3
   j=18 c=;
   That's all
See Also Function: “ANYUPPER Function” on page 387
NOTXDIGIT Function
Searches a character string for a character that is not a hexadecimal character, and returns the first position at which that character is found.
Category: Character
Restriction: “I18N Level 2” on page 306
Syntax
NOTXDIGIT(string <,start>)

Arguments
string
   is the character constant, variable, or expression to search.
start
   is an optional numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction in which to search.
Details
The NOTXDIGIT function searches a string for the first occurrence of any character that is not a digit or an uppercase or lowercase A, B, C, D, E, or F. If such a character is found, NOTXDIGIT returns the position in the string of that character. If no such character is found, NOTXDIGIT returns a value of 0.
If you use only one argument, NOTXDIGIT begins the search at the beginning of the string. If you use two arguments, the absolute value of the second argument, start, specifies the position at which to begin the search. The direction in which to search is determined in the following way:
• If the value of start is positive, the search proceeds to the right.
• If the value of start is negative, the search proceeds to the left.
• If the value of start is less than the negative length of the string, the search begins at the end of the string.
NOTXDIGIT returns a value of zero when one of the following is true:
• The character that you are searching for is not found.
• The value of start is greater than the length of the string.
• The value of start = 0.

Comparisons
The NOTXDIGIT function searches a character string for a character that is not a hexadecimal character. The ANYXDIGIT function searches a character string for a character that is a hexadecimal character.
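A return value of 0 can serve as a simple validity check before a value is treated as hexadecimal. The following sketch uses invented sample values and variable names; TRIM is applied because the padding blanks of a character variable are not hexadecimal characters.

```sas
data _null_;
   length hexstr $8;
   do hexstr='1A2F','00ff','12G4';
      bad=notxdigit(trim(hexstr));
      if bad=0 then put hexstr= 'contains only hexadecimal characters';
      else put hexstr= 'has a non-hexadecimal character at position ' bad;
   end;
run;
```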
Examples
The following example uses the NOTXDIGIT function to search a string for a character that is not a hexadecimal character.

   data _null_;
      string='Next = _n_ + 12E3;';
      j=0;
      do until(j=0);
         j=notxdigit(string,j+1);
         if j=0 then put +3 "That's all";
         else do;
            c=substr(string,j,1);
            put +3 j= c=;
         end;
      end;
   run;

The following lines are written to the SAS log:

   j=1 c=N
   j=3 c=x
   j=4 c=t
   j=5 c=
   j=6 c==
   j=7 c=
   j=8 c=_
   j=9 c=n
   j=10 c=_
   j=11 c=
   j=12 c=+
   j=13 c=
   j=18 c=;
   That's all
See Also Function: “ANYXDIGIT Function” on page 388
NPV Function
Returns the net present value with the rate expressed as a percentage.
Category: Financial

Syntax
NPV(r,freq,c0,c1,…,cn)

Arguments
r
   is numeric, the interest rate over a specified base period of time expressed as a percentage.
freq
   is numeric, the number of payments during the base period of time specified with the rate r.
   Range: freq > 0
   Exception: The case freq = 0 is a flag to allow continuous discounting.
c0,c1,…,cn
   are numeric cash flows that represent cash outlays (payments) or cash inflows (income) occurring at times 0, 1, …, n. These cash flows are assumed to be equally spaced, beginning-of-period values. Negative values represent payments, positive values represent income, and values of 0 represent no cash flow at a given time. The c0 argument and the c1 argument are required.

Comparisons
The NPV function is identical to NETPV, except that the r argument is provided as a percentage.
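The relationship to NETPV can be checked with a short sketch. The cash flows below are hypothetical, and the hand computation in the last assignment assumes that, with freq=1, each cash flow at time i is discounted by (1+r)**i; that formula is an assumption made for this illustration and is not stated in this entry.

```sas
data _null_;
   /* Hypothetical cash flows: pay 1000 now, receive 600 after each of two periods */
   pct   =npv(10,1,-1000,600,600);        /* rate supplied as a percentage */
   frac  =netpv(0.10,1,-1000,600,600);    /* same rate supplied as a fraction */
   manual=-1000 + 600/1.10 + 600/1.10**2; /* assumed discrete discounting, freq=1 */
   put pct= frac= manual=;
run;
```

If the assumption holds, the three values written to the log should agree.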
NVALID Function
Checks the validity of a character string for use as a SAS variable name.
Category: Character
Restriction: “I18N Level 0” on page 305

Syntax
NVALID(string<,validvarname>)

Arguments
string
   specifies a character constant, variable, or expression that is checked to determine whether its value can be used as a SAS variable name.
   Note: Trailing blanks are ignored.
   Tip: Enclose a literal string of characters in quotation marks.
validvarname
   is a character constant, variable, or expression that specifies one of the following values:
   V7
      determines that string is a valid SAS variable name when all three of the following are true:
      • It begins with an English letter or an underscore.
      • All subsequent characters are English letters, underscores, or digits.
      • The length is 32 or fewer alphanumeric characters.
   ANY
      determines that string is a valid SAS variable name if it contains 32 or fewer characters of any type, including blanks.
   NLITERAL
      determines that string is a valid SAS variable name if it is in the form of a SAS name literal ('name'N) or if it is a valid SAS variable name when VALIDVARNAME=V7.
      See: V7 above in this same list.
   Default: If no value is specified, the NVALID function determines that string is a valid SAS variable name based on the value of the SAS system option VALIDVARNAME=.

Details
The NVALID function checks the value of string to determine whether it can be used as a SAS variable name. The NVALID function returns a value of 1 or 0.

   Condition                                        Returned Value
   string can be used as a SAS variable name        1
   string cannot be used as a SAS variable name     0
Examples
This example determines the validity of specified strings as SAS variable names. The value that is returned by the NVALID function varies with the validvarname argument. The value of 1 is returned when the string is determined to be a valid SAS variable name under the rules for the specified validvarname argument. Otherwise, the value of 0 is returned. The second data line is blank.

   options validvarname=v7 ls=64;
   data string;
      input string $char40.;
      v7=nvalid(string,'v7');
      any=nvalid(string,'any');
      nliteral=nvalid(string,'nliteral');
      default=nvalid(string);
      datalines;
   Tooooooooooooooooooooooooooo Long

   OK
   Very_Long_But_Still_OK_for_V7
   1st_char_is_a_digit
   Embedded blank
   !@#$%^&*
   "Very Loooong N-Literal with """N
   'No closing quotation mark
   ;
   proc print noobs;
      title1 'NLITERAL and Validvarname Arguments Determine';
      title2 'Invalid (0) and Valid (1) SAS Variable Names';
   run;
Output 4.62 Determining the Validity of SAS Variable Names with NLITERAL

        NLITERAL and Validvarname Arguments Determine
         Invalid (0) and Valid (1) SAS Variable Names

   string                               v7    any    nliteral    default

   Tooooooooooooooooooooooooooo Long     0     0        0           0
                                         0     0        0           0
   OK                                    1     1        1           1
   Very_Long_But_Still_OK_for_V7         1     1        1           1
   1st_char_is_a_digit                   0     1        1           0
   Embedded blank                        0     1        1           0
   !@#$%^&*                              0     1        1           0
   "Very Loooong N-Literal with """N     0     0        1           0
   'No closing quotation mark            0     1        0           0
See Also
Functions:
   “COMPARE Function” on page 571
   “NLITERAL Function” on page 915
System Option:
   “VALIDVARNAME= System Option” on page 1989
“Rules for Words and Names in the SAS Language” in SAS Language Reference: Concepts
NWKDOM Function
Returns the date for the nth occurrence of a weekday for the specified month and year.
Category: Date and Time

Syntax
NWKDOM(n, weekday, month, year)

Arguments
n
   specifies the numeric week of the month that contains the specified day.
   Range: 1–5
   Tip: N=5 indicates that the specified day occurs in the last week of that month. Sometimes n=4 and n=5 produce the same results.
weekday
   specifies the number that corresponds to the day of the week.
   Range: 1–7
   Tip: Sunday is considered the first day of the week and has a weekday value of 1.
month
   specifies the number that corresponds to the month of the year.
   Range: 1–12
year
   specifies a four-digit calendar year.
Details
The NWKDOM function returns a SAS date value for the nth weekday of the month and year that you specify. Use any valid SAS date format, such as the DATE9. format, to display a calendar date.
You can specify n=5 for the last occurrence of a particular weekday in the month. Sometimes n=5 and n=4 produce the same result. These results occur when there are only four occurrences of the requested weekday in the month. For example, if the month of January begins on a Sunday, there will be five occurrences of Sunday, Monday, and Tuesday, but only four occurrences of Wednesday, Thursday, Friday, and Saturday. In this case, specifying n=5 or n=4 for Wednesday, Thursday, Friday, or Saturday will produce the same result.
If the year is not a leap year, February has 28 days and there are four occurrences of each day of the week. In this case, n=5 and n=4 produce the same results for every day.
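The February case can be tried directly. The following sketch uses February 2007, a 28-day month, and illustrative variable names; per the rule above, the two calls are expected to return the same date.

```sas
data _null_;
   /* February 2007 has 28 days, so each weekday occurs exactly four times */
   fourth=nwkdom(4,2,2,2007);    /* fourth Monday */
   last  =nwkdom(5,2,2,2007);    /* last Monday */
   same  =(fourth=last);         /* 1 when both calls return the same date */
   put fourth= weekdatx. last= weekdatx. same=;
run;
```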
Comparisons
In the NWKDOM function, the value for weekday corresponds to the numeric day of the week beginning on Sunday. This value is the same value that is used in the WEEKDAY function, where Sunday=1, and so on. The value for month corresponds to the numeric month of the year beginning in January. This value is the same value that is used in the MONTH function, where January=1, and so on.
You can use the NWKDOM function to calculate events that are not defined by the HOLIDAY function. For example, if a university always schedules graduation on the first Saturday in June, then you can use the following statement to calculate the date:

   UnivGrad = nwkdom(1, 7, 6, year);
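The correspondence with the WEEKDAY and MONTH functions can be checked by feeding the NWKDOM result back into them. The following is a sketch only; the range of years and the variable names wd and mo are chosen for illustration.

```sas
data _null_;
   do year=2007 to 2010;
      UnivGrad=nwkdom(1,7,6,year);   /* first Saturday in June */
      wd=weekday(UnivGrad);          /* expected to be 7 (Saturday) */
      mo=month(UnivGrad);            /* expected to be 6 (June) */
      put year= UnivGrad= date9. wd= mo=;
   end;
run;
```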
Examples
Example 1: Returning Date Values
The following example uses the NWKDOM function and returns the date for specific occurrences of a weekday for a specified month and year.

   data _null_;
         /* Return the date of the third Monday in May 2000. */
      a=nwkdom(3, 2, 5, 2000);
         /* Return the date of the fourth Wednesday in November 2007. */
      b=nwkdom(4, 4, 11, 2007);
         /* Return the date of the fourth Saturday in November 2007. */
      c=nwkdom(4, 7, 11, 2007);
         /* Return the date of the first Sunday in January 2007. */
      d=nwkdom(1, 1, 1, 2007);
         /* Return the date of the second Tuesday in September 2007. */
      e=nwkdom(2, 3, 9, 2007);
         /* Return the date of the fifth Thursday in December 2007. */
      f=nwkdom(5, 5, 12, 2007);
      put a= weekdatx.;
      put b= weekdatx.;
      put c= weekdatx.;
      put d= weekdatx.;
      put e= weekdatx.;
      put f= weekdatx.;
   run;

Output 4.63 Output from Returning Date Values

   a=Monday, 15 May 2000
   b=Wednesday, 28 November 2007
   c=Saturday, 24 November 2007
   d=Sunday, 7 January 2007
   e=Tuesday, 11 September 2007
   f=Thursday, 27 December 2007
Example 2: Returning the Date of the Last Monday in May
The following example returns the date that corresponds to the last Monday in the month of May in the year 2007.

   data _null_;
         /* The last Monday in May. */
      x=nwkdom(5,2,5,2007);
      put x date9.;
   run;

Output 4.64 Output from Calculating the Date of the Last Monday in May

   28MAY2007
See Also Functions: “HOLIDAY Function” on page 778 “INTNX Function” on page 822 “MONTH Function” on page 906 “WEEKDAY Function” on page 1188
OPEN Function
Opens a SAS data set.
Category: SAS File I/O
Syntax
OPEN(<data-set-name<,mode<,generation-number<,type>>>>)

Arguments
data-set-name
   is a character constant, variable, or expression that specifies the name of the SAS data set or SAS SQL view to be opened. The value of this character string should be of the form <libref.>member-name<(data-set-options)>
   Default: The default value for data-set-name is _LAST_.
   Restriction: If you specify the FIRSTOBS= and OBS= data set options, they are ignored. All other data set options are valid.
mode
   is a character constant, variable, or expression that specifies the type of access to the data set:
   I
      opens the data set in INPUT mode (default). Values can be read but not modified. 'I' uses the strongest access mode available in the engine. That is, if the engine supports random access, OPEN defaults to random access. Otherwise, the file is opened in 'IN' mode automatically. Files are opened with sequential access and a system-level warning is set.
   IN
      opens the data set in INPUT mode. Observations are read sequentially, and you are allowed to revisit an observation.
   IS
      opens the data set in INPUT mode. Observations are read sequentially, but you are not allowed to revisit an observation.
   Default: I
generation-number
   specifies a consistently increasing number that identifies one of the historical versions in a generation group.
   Tip: The generation-number argument is ignored if type = F.
type
   is a character constant and can be one of the following values:
   D
      specifies that the first argument, data-set-name, is a one-level or two-level data set name. The following example shows how the D type value can be used:

         rc = open('lib.mydata', , , 'D');

      Tip: D is the default if there is no fourth argument.
   F
      specifies that the first argument, data-set-name, is a filename, a physical path to a file. The following examples show how the F type value can be used:

         rc = open('c:\data\mydata.sas7bdat', , , 'F');
         rc = open('c:\data\mydata', , , 'F');
      Tip: If you use the F value, then the third argument, generation-number, is ignored.
Note: If an argument is invalid, OPEN returns 0. You can obtain the text of the corresponding error message from the SYSMSG function. Invalid arguments do not produce a message in the SAS log and do not set the _ERROR_ automatic variable.
Details
The OPEN function opens a SAS data set, a DATA step view, or a SAS SQL view and returns a unique numeric data set identifier, which is used in most other data set access functions. OPEN returns 0 if the data set could not be opened.
If you call the OPEN function from a macro, then the result of the call is valid only when the result is passed to functions in a macro. If you call the OPEN function from the DATA step, then the result is valid only when the result is passed to functions in the same DATA step.
By default, a SAS data set is opened with a control level of RECORD. For details, see the “CNTLLEV= Data Set Option” on page 18. An open SAS data set should be closed when it is no longer needed. If you open a data set within a DATA step, it will be closed automatically when the DATA step ends.
OPEN defaults to the strongest access mode available in the engine. That is, if the engine supports random access, OPEN defaults to random access. Otherwise, data sets are opened with sequential access, and a system-level warning is set.
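The open, use, and close cycle within a single DATA step might look like the following sketch. It assumes that the SASHELP.CLASS sample data set is available in the session, and it uses the ATTRN function with the NOBS attribute to read the number of observations through the identifier; the variable names are illustrative.

```sas
data _null_;
   /* Assumption: the SASHELP.CLASS sample data set is available */
   dsid=open('sashelp.class','i');
   if dsid=0 then do;
      msg=sysmsg();               /* text of the error that prevented the open */
      put msg;
   end;
   else do;
      nobs=attrn(dsid,'nobs');    /* use the identifier with another access function */
      put 'Observations: ' nobs;
      rc=close(dsid);             /* close the data set when it is no longer needed */
   end;
run;
```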
Examples
This example opens the data set PRICES in the library MASTER using INPUT mode. Note that in a macro statement you do not enclose character strings in quotation marks.
   %let dsid=%sysfunc(open(master.prices,i));
   %if (&dsid = 0) %then
      %put %sysfunc(sysmsg());
   %else
      %put PRICES data set has been opened;

This example passes values from macro or DATA step variables to be used on data set options. It opens the data set SASUSER.HOUSES, and uses the WHERE= data set option to apply a permanent WHERE clause. Note that in a macro statement you do not enclose character strings in quotation marks.
   %let choice = style="RANCH";
   %let dsid=%sysfunc(open(sasuser.houses (where=(&choice)),i));

This example shows how to check the returned value for errors and to write an error message from the SYSMSG function.
   data _null_;
      d=open('bad','?');
      if not d then
         do;
            m=sysmsg();
            put m;
            abort;
         end;
      ... more SAS statements ...;
   run;
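The open, check, and close cycle described in the Details section can be sketched in a single DATA step. This is an illustrative sketch only: it assumes that the sample data set SASHELP.CLASS is available (as in a default installation), and the variable names DSID, MSG, NOBS, and RC are chosen here for illustration.

```sas
data _null_;
   /* Assumption: SASHELP.CLASS exists in this SAS session */
   dsid=open('sashelp.class','i');
   if dsid=0 then
      do;
         /* OPEN failed: retrieve the message text from SYSMSG */
         msg=sysmsg();
         put msg;
      end;
   else
      do;
         /* OPEN succeeded: query the observation count, then close */
         nobs=attrn(dsid,'NOBS');
         put 'Data set opened. ' nobs=;
         rc=close(dsid);
      end;
run;
```

Closing the data set explicitly is optional inside a DATA step, because it is closed automatically when the step ends, but it keeps the identifier from being used after it is no longer needed.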
See Also Functions: “CLOSE Function” on page 563 “SYSMSG Function” on page 1114
ORDINAL Function Returns the kth smallest of the missing and nonmissing values. Category: Descriptive Statistics
Syntax ORDINAL(k,argument-1,argument-2< ,…argument-n>)
Arguments
k
   is a numeric constant, variable, or expression with an integer value that is less than or equal to the number of subsequent elements in the list of arguments.
argument
   specifies a numeric constant, variable, or expression. At least two arguments are required. An argument can consist of a variable list, preceded by OF.
Details The ORDINAL function returns the kth smallest value, either missing or nonmissing, among the second through the last arguments.
Comparisons The ORDINAL function counts both missing and nonmissing values, whereas the SMALLEST function counts only nonmissing values.
Examples
SAS Statements                        Results
x1=ordinal(4,1,2,3,-4,5,6,7);         3
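The difference between ORDINAL and SMALLEST that is described in the Comparisons section shows up as soon as the argument list contains a missing value. The following sketch uses variable names (K_ORD, K_SMALL) that are chosen only for illustration.

```sas
data _null_;
   /* ORDINAL counts the missing value, which sorts below all numbers */
   k_ord=ordinal(1,.,3,1);
   /* SMALLEST ignores the missing value */
   k_small=smallest(1,.,3,1);
   put k_ord= k_small=;
run;
```

Because a missing value sorts lower than any nonmissing number, K_ORD is expected to be missing, whereas K_SMALL is expected to be 1.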
PATHNAME Function
Returns the physical name of an external file or a SAS library, or returns a blank.
Category: SAS File I/O
Category: External Files
See: PATHNAME Function in the documentation for your operating environment.
Syntax
PATHNAME((fileref | libref) <,search-ref>)
Arguments
fileref
   is a character constant, variable, or expression that specifies the fileref that is assigned to an external file.
libref
   is a character constant, variable, or expression that specifies the libref that is assigned to a SAS library.
search-ref
   is a character constant, variable, or expression that specifies whether to search for a fileref or a libref.
   F
      specifies a search for a fileref.
   L
      specifies a search for a libref.
Details
PATHNAME returns the physical name of an external file or SAS library, or blank if fileref or libref is invalid.

If the name of a fileref is identical to the name of a libref, you can use the search-ref argument to choose which reference you want to search. If you specify a value of F, SAS searches for a fileref. If you specify a value of L, SAS searches for a libref. If you do not specify a search-ref argument, and the name of a fileref is identical to the name of a libref, PATHNAME searches first for a fileref. If a fileref does not exist, PATHNAME then searches for a libref.

The default length of the target variable in the DATA step is 200 characters.

You can assign a fileref to an external file by using the FILENAME statement or the FILENAME function. You can assign a libref to a SAS library using the LIBNAME statement or the LIBNAME function. Some operating environments allow you to assign a libref using system commands.

Operating Environment Information: Under some operating environments, filerefs can also be assigned by using system commands. For details, see the SAS documentation for your operating environment.
Examples
This example uses the FILEREF function to verify that the fileref MYFILE is associated with an external file. Then it uses PATHNAME to retrieve the actual name of the external file:
   data _null_;
      length fname $ 100;
      rc=fileref('myfile');
      if (rc=0) then
         do;
            fname=pathname('myfile');
            put fname=;
         end;
   run;
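The search-ref argument can be sketched in the same way. The following step is a minimal illustration that assumes the WORK library is assigned, as it normally is in a SAS session; the variable name LIBPATH is chosen here for illustration. Specifying L restricts the search to librefs, so a fileref that happens to be named WORK would not be returned.

```sas
data _null_;
   /* 200 matches the default length of the target variable */
   length libpath $ 200;
   /* 'L' requests a libref search only */
   libpath=pathname('work','L');
   put libpath=;
run;
```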
See Also Functions: “FEXIST Function” on page 663 “FILEEXIST Function” on page 665 “FILENAME Function” on page 666 “FILEREF Function” on page 669 Statements: “LIBNAME Statement” on page 1606 “FILENAME Statement” on page 1470
PCTL Function Returns the percentile that corresponds to the percentage. Category: Descriptive Statistics
Syntax
PCTL<n>(percentage, value1<, value2, ...>)

Arguments
n
   is a digit from 1 to 5 which specifies the definition of the percentile to be computed. Default: definition 5
percentage
   is a numeric constant, variable, or expression that specifies the percentile to be computed. Requirement: is numeric where 0 ≤ percentage ≤ 100.
value
   is a numeric variable, constant, or expression.
Details
The PCTL function returns the percentile of the nonmissing values corresponding to the percentage. If percentage is missing, less than zero, or greater than 100, the PCTL function generates an error message.
Note: The formula that is used in the PCTL function is the same formula that is used in PROC UNIVARIATE. For more information, see “SAS Elementary Statistics Procedures” in Base SAS Procedures Guide.
Examples
SAS Statements                                                   Results
lower_quartile=PCTL(25,2,4,1,3); put lower_quartile;             1.5
percentile_def2=PCTL2(25,2,4,1,3); put percentile_def2;          1
lower_tertile=PCTL(100/3,2,4,1,3); put lower_tertile;            2
percentile_def3=PCTL3(100/3,2,4,1,3); put percentile_def3;       2
median=PCTL(50,2,4,1,3); put median;                             2.5
upper_tertile=PCTL(200/3,2,4,1,3); put upper_tertile;            3
upper_quartile=PCTL(75,2,4,1,3); put upper_quartile;             3.5
PDF Function
Returns a value from a probability density (mass) distribution.
Category: Probability
Alias: PMF

Syntax
PDF(dist,quantile<,parm-1, ... ,parm-k>)
Arguments
dist
   is a character constant, variable, or expression that identifies the distribution. Valid distributions are as follows:

   Distribution                 Argument
   Bernoulli                    BERNOULLI
   Beta                         BETA
   Binomial                     BINOMIAL
   Cauchy                       CAUCHY
   Chi-Square                   CHISQUARE
   Exponential                  EXPONENTIAL
   F                            F
   Gamma                        GAMMA
   Geometric                    GEOMETRIC
   Hypergeometric               HYPERGEOMETRIC
   Laplace                      LAPLACE
   Logistic                     LOGISTIC
   Lognormal                    LOGNORMAL
   Negative binomial            NEGBINOMIAL
   Normal                       NORMAL|GAUSS
   Normal mixture               NORMALMIX
   Pareto                       PARETO
   Poisson                      POISSON
   T                            T
   Uniform                      UNIFORM
   Wald (inverse Gaussian)      WALD|IGAUSS
   Weibull                      WEIBULL

   Note: Except for T, F, and NORMALMIX, you can minimally identify any distribution by its first four characters.
quantile
   is a numeric constant, variable, or expression that specifies the value of the random variable.
parm-1,...,parm-k
   are optional numeric constants, variables, or expressions that specify the values of shape, location, or scale parameters that are appropriate for the specific distribution.
   See: “Details” on page 956 for complete information about these parameters
Details
Bernoulli Distribution
PDF('BERNOULLI',x,p)
where
x is a numeric random variable.
p is a numeric probability of success. Range: 0 ≤ p ≤ 1

The PDF function for the Bernoulli distribution returns the probability density function of a Bernoulli distribution, with probability of success equal to p. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'BERN'}, x, p) = \begin{cases} 0 & x < 0 \\ 1 - p & x = 0 \\ 0 & 0 < x < 1 \\ p & x = 1 \\ 0 & x > 1 \end{cases}$$

Note: There are no location or scale parameters for the Bernoulli distribution.

Beta Distribution
PDF('BETA',x,a,b<,l,r>)
where
x is a numeric random variable.
a is a numeric shape parameter. Range: a > 0
b is a numeric shape parameter. Range: b > 0
l is the numeric left location parameter. Default: 0
r is the right location parameter. Default: 1 Range: r > l

The PDF function for the beta distribution returns the probability density function of a beta distribution, with shape parameters a and b. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'BETA'}, x, a, b, l, r) = \begin{cases} 0 & x < l \\ \dfrac{1}{\beta(a,b)} \dfrac{(x-l)^{a-1}(r-x)^{b-1}}{(r-l)^{a+b-1}} & l \le x \le r \\ 0 & x > r \end{cases}$$

Note: The quantity $\frac{x-l}{r-l}$ is forced to be $\epsilon \le \frac{x-l}{r-l} \le 1 - 2\epsilon$.
Binomial Distribution
PDF('BINOMIAL',m,p,n)
where
m is an integer random variable that counts the number of successes. Range: m = 0, 1, ...
p is a numeric probability of success. Range: 0 ≤ p ≤ 1
n is an integer parameter that counts the number of independent Bernoulli trials. Range: n = 0, 1, ...

The PDF function for the binomial distribution returns the probability density function of a binomial distribution, with parameters p and n, which is evaluated at the value m. The equation follows:

$$\mathrm{PDF}(\text{'BINOM'}, m, p, n) = \begin{cases} 0 & m < 0 \\ \dbinom{n}{m} p^m (1-p)^{n-m} & 0 \le m \le n \\ 0 & m > n \end{cases}$$

Note: There are no location or scale parameters for the binomial distribution.
Cauchy Distribution
PDF('CAUCHY',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0

The PDF function for the Cauchy distribution returns the probability density function of a Cauchy distribution, with the location parameter θ and the scale parameter λ. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'CAUCHY'}, x, \theta, \lambda) = \frac{1}{\pi} \left( \frac{\lambda}{\lambda^2 + (x - \theta)^2} \right)$$

Chi-Square Distribution
PDF('CHISQUARE',x,df<,nc>)
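where
x is a numeric random variable.
df is a numeric degrees of freedom parameter. Range: df > 0
nc is an optional numeric non-centrality parameter. Range: nc ≥ 0

The PDF function for the chi-square distribution returns the probability density function of a chi-square distribution, with df degrees of freedom and non-centrality parameter nc. The PDF function is evaluated at the value x. This function accepts non-integer degrees of freedom. If nc is omitted or equal to zero, the value returned is from the central chi-square distribution. The following equation describes the PDF function of the chi-square distribution, where $\nu$ = df, $\lambda$ = nc, and $p_c(x, a)$ denotes the density of the central chi-square distribution with $a$ degrees of freedom:

$$\mathrm{PDF}(\text{'CHISQ'}, x, \nu, \lambda) = \sum_{j=0}^{\infty} e^{-\lambda/2} \frac{(\lambda/2)^j}{j!} \, p_c(x, \nu + 2j) \qquad x \ge 0$$

Exponential Distribution
PDF('EXPONENTIAL',x<,λ>)
where
x is a numeric random variable.
λ is a scale parameter. Default: 1 Range: λ > 0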
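The PDF function for the exponential distribution returns the probability density function of an exponential distribution, with the scale parameter λ. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'EXPO'}, x, \lambda) = \begin{cases} 0 & x < 0 \\ \dfrac{1}{\lambda} \exp\left(-\dfrac{x}{\lambda}\right) & x \ge 0 \end{cases}$$

F Distribution
PDF('F',x,ndf,ddf<,nc>)
where
x is a numeric random variable.
ndf is a numeric numerator degrees of freedom parameter. Range: ndf > 0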
ddf is a numeric denominator degrees of freedom parameter. Range: ddf > 0
nc is a numeric non-centrality parameter. Range: nc ≥ 0
The PDF function for the F distribution returns the probability density function of an F distribution, with ndf numerator degrees of freedom, ddf denominator degrees of freedom, and non-centrality parameter nc. The PDF function is evaluated at the value x. This PDF function accepts non-integer degrees of freedom for ndf and ddf. If nc is omitted or equal to zero, the value returned is from a central F distribution. In the following equation, let $\nu_1$ = ndf, let $\nu_2$ = ddf, and let $\lambda$ = nc. The following equation describes the PDF function of the F distribution.
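$$\mathrm{PDF}(\text{'F'}, x, \nu_1, \nu_2, \lambda) = \sum_{j=0}^{\infty} e^{-\lambda/2} \frac{(\lambda/2)^j}{j!} \, p_f(f, \nu_1 + 2j, \nu_2)$$

Here $f$ denotes the value x at which the density is evaluated, and $p_f$ is the central F density term of the mixture.

Gamma Distribution
PDF('GAMMA',x,a<,λ>)
where
x is a numeric random variable.
a is a numeric shape parameter. Range: a > 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0

The PDF function for the gamma distribution returns the probability density function of a gamma distribution, with the shape parameter a and the scale parameter λ. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'GAMMA'}, x, a, \lambda) = \begin{cases} 0 & x < 0 \\ \dfrac{1}{\lambda^a \, \Gamma(a)} \, x^{a-1} \exp\left(-\dfrac{x}{\lambda}\right) & x \ge 0 \end{cases}$$

Hypergeometric Distribution
PDF('HYPER',x,N,R,n<,o>)
where
x is an integer random variable.
N is an integer population size parameter.
R is an integer number of items in the category of interest.
n is an integer sample size parameter.
o is an optional numeric odds ratio parameter.

The PDF function for the hypergeometric distribution is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'HYPER'}, x, N, R, n, o) = \begin{cases} 0 & x < \max(0, R+n-N) \\ \dfrac{\dbinom{R}{x} \dbinom{N-R}{n-x} o^x}{\displaystyle\sum_{j=\max(0,R+n-N)}^{\min(R,n)} \dbinom{R}{j} \dbinom{N-R}{n-j} o^j} & \max(0, R+n-N) \le x \le \min(R, n) \\ 0 & x > \min(R, n) \end{cases}$$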
Laplace Distribution
PDF('LAPLACE',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0

The PDF function for the Laplace distribution returns the probability density function of the Laplace distribution, with the location parameter θ and the scale parameter λ. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'LAPLACE'}, x, \theta, \lambda) = \frac{1}{2\lambda} \exp\left(-\frac{|x - \theta|}{\lambda}\right)$$

Logistic Distribution
PDF('LOGISTIC',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0

The PDF function for the logistic distribution returns the probability density function of a logistic distribution, with the location parameter θ and the scale parameter λ. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'LOGISTIC'}, x, \theta, \lambda) = \frac{\exp\left(-\frac{x-\theta}{\lambda}\right)}{\lambda \left(1 + \exp\left(-\frac{x-\theta}{\lambda}\right)\right)^2}$$
Lognormal Distribution
PDF('LOGNORMAL',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0

The PDF function for the lognormal distribution returns the probability density function of a lognormal distribution, with the location parameter θ and the scale parameter λ. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'LOGN'}, x, \theta, \lambda) = \begin{cases} 0 & x \le 0 \\ \dfrac{1}{x \lambda \sqrt{2\pi}} \exp\left(-\dfrac{(\log(x) - \theta)^2}{2\lambda^2}\right) & x > 0 \end{cases}$$
Negative Binomial Distribution
PDF('NEGBINOMIAL',m,p,n)
where
m is a nonnegative integer random variable that counts the number of failures. Range: m = 0, 1, ...
p is a numeric probability of success. Range: 0 ≤ p ≤ 1
n is a numeric value that counts the number of successes. Range: n > 0

The PDF function for the negative binomial distribution returns the probability density function of a negative binomial distribution, with probability of success p and number of successes n. The PDF function is evaluated at the value m. The equation follows:

$$\mathrm{PDF}(\text{'NEGB'}, m, p, n) = \begin{cases} 0 & m < 0 \\ p^n \dbinom{n+m-1}{n-1} (1-p)^m & m \ge 0 \end{cases}$$

Normal Distribution
PDF('NORMAL',x<,θ,λ>)
where
x is a numeric random variable.
θ is a numeric location parameter. Default: 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0

The PDF function for the normal distribution returns the probability density function of a normal distribution, with the location parameter θ and the scale parameter λ. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'NORMAL'}, x, \theta, \lambda) = \frac{1}{\lambda\sqrt{2\pi}} \exp\left(-\frac{(x - \theta)^2}{2\lambda^2}\right)$$
Normal Mixture Distribution
PDF('NORMALMIX',x,n,p,m,s)
where
x is a numeric random variable.
n is the integer number of mixtures. Range: n = 1, 2, ...
p is the n proportions, $p_1, p_2, \ldots, p_n$, where $\sum_{i=1}^{n} p_i = 1$. Range: 0 ≤ $p_i$ ≤ 1
m is the n means $m_1, m_2, \ldots, m_n$.
s is the n standard deviations $s_1, s_2, \ldots, s_n$. Range: s > 0

The PDF function for the normal mixture distribution returns the probability density function of a mixture of normal distributions, evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'NORMALMIX'}, x, n, p, m, s) = \sum_{i=1}^{n} p_i \, \mathrm{PDF}(\text{'NORMAL'}, x, m_i, s_i)$$

Note: There are no location or scale parameters for the normal mixture distribution.
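Pareto Distribution
PDF('PARETO',x,a<,k>)
where
x is a numeric random variable.
a is a numeric shape parameter. Range: a > 0
k is a numeric scale parameter. Default: 1 Range: k > 0

The PDF function for the Pareto distribution returns the probability density function of a Pareto distribution, with the shape parameter a and the scale parameter k. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'PARETO'}, x, a, k) = \begin{cases} 0 & x < k \\ \dfrac{a}{k} \left(\dfrac{k}{x}\right)^{a+1} & x \ge k \end{cases}$$

Poisson Distribution
PDF('POISSON',n,m)
where
n is an integer random variable. Range: n = 0, 1, ...
m is a numeric mean parameter. Range: m > 0

The PDF function for the Poisson distribution returns the probability density function of a Poisson distribution, with mean m. The PDF function is evaluated at the value n. The equation follows:

$$\mathrm{PDF}(\text{'POISSON'}, n, m) = \begin{cases} 0 & n < 0 \\ e^{-m} \dfrac{m^n}{n!} & n \ge 0 \end{cases}$$

T Distribution
PDF('T',t,df<,nc>)
where
t is a numeric random variable.
df is a numeric degrees of freedom parameter. Range: df > 0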
nc is an optional numeric non-centrality parameter.

The PDF function for the T distribution returns the probability density function of a T distribution, with degrees of freedom df and non-centrality parameter nc. The PDF function is evaluated at the value t. This PDF function accepts non-integer degrees of freedom. If nc is omitted or equal to zero, the value returned is from the central T distribution. In the following equation, let $\nu$ = df and let $\delta$ = nc.

$$\mathrm{PDF}(\text{'T'}, t, \nu, \delta) = \frac{1}{2^{\nu/2 - 1} \, \Gamma(\nu/2)} \int_0^{\infty} x^{\nu-1} e^{-\frac{1}{2}x^2} \, \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{t x}{\sqrt{\nu}} - \delta\right)^2} \frac{x}{\sqrt{\nu}} \, dx$$

Note: There are no location or scale parameters for the T distribution.
Uniform Distribution
PDF('UNIFORM',x<,l,r>)
where
x is a numeric random variable.
l is the numeric left location parameter. Default: 0
r is the numeric right location parameter. Default: 1 Range: r > l

The PDF function for the uniform distribution returns the probability density function of a uniform distribution, with the left location parameter l and the right location parameter r. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'UNIFORM'}, x, l, r) = \begin{cases} 0 & x < l \\ \dfrac{1}{r - l} & l \le x \le r \\ 0 & x > r \end{cases}$$
Wald (Inverse Gaussian) Distribution
PDF('WALD',x,d)
PDF('IGAUSS',x,d)
where
x is a numeric random variable.
d is a numeric shape parameter. Range: d > 0

The PDF function for the Wald distribution returns the probability density function of a Wald distribution, with shape parameter d, which is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'WALD'}, x, d) = \begin{cases} 0 & x \le 0 \\ \sqrt{\dfrac{d}{2\pi x^3}} \exp\left(-\dfrac{d}{2}x + d - \dfrac{d}{2x}\right) & x > 0 \end{cases}$$

Note: There are no location or scale parameters for the Wald distribution.
Weibull Distribution
PDF('WEIBULL',x,a<,λ>)
where
x is a numeric random variable.
a is a numeric shape parameter. Range: a > 0
λ is a numeric scale parameter. Default: 1 Range: λ > 0

The PDF function for the Weibull distribution returns the probability density function of a Weibull distribution, with the shape parameter a and the scale parameter λ. The PDF function is evaluated at the value x. The equation follows:

$$\mathrm{PDF}(\text{'WEIBULL'}, x, a, \lambda) = \begin{cases} 0 & x < 0 \\ \exp\left(-\left(\dfrac{x}{\lambda}\right)^a\right) \dfrac{a}{\lambda} \left(\dfrac{x}{\lambda}\right)^{a-1} & x \ge 0 \end{cases}$$
Examples
SAS Statements                                  Results
y=pdf('BERN',0,.25);                            0.75
y=pdf('BERN',1,.25);                            0.25
y=pdf('BETA',0.2,3,4);                          1.2288
y=pdf('BINOM',4,.5,10);                         0.20508
y=pdf('CAUCHY',2);                              0.063662
y=pdf('CHISQ',11.264,11);                       0.081686
y=pdf('EXPO',1);                                0.36788
y=pdf('F',3.32,2,3);                            0.054027
y=pdf('GAMMA',1,3);                             0.18394
y=pdf('HYPER',2,200,50,10);                     0.28685
y=pdf('LAPLACE',1);                             0.18394
y=pdf('LOGISTIC',1);                            0.19661
y=pdf('LOGNORMAL',1);                           0.39894
y=pdf('NEGB',1,.5,2);                           0.25
y=pdf('NORMAL',1.96);                           0.058441
y=pdf('NORMALMIX',2.3,3,.33,.33,.34,
      .5,1.5,2.5,.79,1.6,4.3);                  0.1166
y=pdf('PARETO',1,1);                            1
y=pdf('POISSON',2,1);                           0.18394
y=pdf('T',.9,5);                                0.24194
y=pdf('UNIFORM',0.25);                          1
y=pdf('WALD',1,2);                              0.56419
y=pdf('WEIBULL',1,2);                           0.73576
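As a cross-check of one of the closed-form densities, the normal density can be computed directly and compared with the value that the PDF function returns. The following is a sketch only: the variable names X, THETA, LAMBDA, BY_FORMULA, and BY_FUNCTION are chosen here for illustration, and the formula used is the standard normal density with location THETA and scale LAMBDA.

```sas
data _null_;
   x=1.96;
   theta=0;    /* location parameter (the default) */
   lambda=1;   /* scale parameter (the default)    */
   /* Normal density written out explicitly */
   by_formula=exp(-(x-theta)**2/(2*lambda**2))
              /(lambda*sqrt(2*constant('PI')));
   /* Same density from the PDF function */
   by_function=pdf('NORMAL',x,theta,lambda);
   put by_formula= by_function=;
run;
```

Both values should agree with the 0.058441 entry for pdf('NORMAL',1.96) in the table, up to rounding.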
See Also Functions: “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “CDF Function” on page 540 “SDF Function” on page 1081 “QUANTILE Function” on page 1028
PEEK Function
Stores the contents of a memory address in a numeric variable on a 32-bit platform.
Category: Special
Restriction: Use on 32-bit platforms only.

Syntax
PEEK(address<,length>)

Arguments
address
   is a numeric constant, variable, or expression that specifies the memory address.
length
   is a numeric constant, variable, or expression that specifies the data length. Default: a 4-byte address pointer Range: 2 to 8
Details If you do not have access to the memory storage location that you are requesting, the PEEK function returns an "Invalid argument" error. You cannot use the PEEK function on 64-bit platforms. If you attempt to use it, SAS writes a message to the log stating that this restriction applies. If you have legacy applications that use PEEK, change the applications and use PEEKLONG instead. You can use PEEKLONG on both 32–bit and 64–bit platforms.
Comparisons
The PEEK function stores the contents of a memory address into a numeric variable. The PEEKC function stores the contents of a memory address into a character variable.
Note: SAS recommends that you use PEEKLONG instead of PEEK because PEEKLONG can be used on both 32-bit and 64-bit platforms.
Examples
The following example, specific to the z/OS operating environment, returns a numeric value that represents the address of the Communication Vector Table (CVT).
   data _null_;
         /* 16 is the location of the CVT address */
      y=16;
      x=peek(y);
      put 'x= ' x hex8.;
   run;
See Also Functions: “ADDR Function” on page 360 “PEEKC Function” on page 969 CALL Routine: “CALL POKE Routine” on page 462
PEEKC Function
Stores the contents of a memory address in a character variable on a 32-bit platform.
Category: Special
Restriction: Use on 32-bit platforms only.

Syntax
PEEKC(address<,length>)

Arguments
address
   is a numeric constant, variable, or expression that specifies the memory address.
length
   is a numeric constant, variable, or expression that specifies the data length. Default: 8, unless the variable length has already been set (by the LENGTH statement, for example) Range: 1 to 32,767
Details If you do not have access to the memory storage location that you are requesting, the PEEKC function returns an "Invalid argument" error. You cannot use the PEEKC function on 64-bit platforms. If you attempt to use it, SAS writes a message to the log stating that this restriction applies. If you have legacy applications that use PEEKC, change the applications and use PEEKCLONG instead. You can use PEEKCLONG on both 32–bit and 64–bit platforms.
Comparisons
The PEEKC function stores the contents of a memory address into a character variable. The PEEK function stores the contents of a memory address into a numeric variable.
Note: SAS recommends that you use PEEKCLONG instead of PEEKC because PEEKCLONG can be used on both 32-bit and 64-bit platforms.
Examples
Example 1: Listing ASCB Bytes
The following example, specific to the z/OS operating environment, uses both PEEK and PEEKC, and prints the first four bytes of the Address Space Control Block (ASCB).
   data _null_;
      length y $4;
         /* 220x is the location of the ASCB pointer */
      x=220x;
      y=peekc(peek(x));
      put 'y= ' y;
   run;
Example 2: Creating a DATA Step View
This example, specific to the z/OS operating environment, also uses both the PEEK and PEEKC functions. It creates a DATA step view that accesses the entries in the Task Input Output Table (TIOT). The PRINT procedure is then used to print the entries. Entries in the TIOT include the three components outlined in the following list. In this example, TIOT represents the starting address of the TIOT entry.
TIOT+4
   is the ddname. This component takes up 8 bytes.
TIOT+12
   is a 3-byte pointer to the Job File Control Block (JFCB).
TIOT+134
   is the volume serial number (volser) of the data set. This component takes up 6 bytes.
Here is the program:
      /* Create a DATA step view of the contents  */
      /* of the TIOT. The code steps through each */
      /* TIOT entry to extract the ddname, JFCB,  */
      /* and volser of each ddname that has been  */
      /* allocated for the current task. The data */
      /* set name is also extracted from the JFCB. */
   data save.tiot/view=save.tiot;
      length ddname $8 volser $6 dsname $44;
         /* Get the TCB (Task Control Block) address */
         /* from the PSATOLD variable in the PSA     */
         /* (Prefixed Save Area). The address of     */
         /* the PSA is 21CX. Add 12 to the address   */
         /* of the TCB to get the address of the     */
         /* TIOT. Add 24 to bypass the 24-byte       */
         /* header, so that TIOTVAR represents the   */
         /* start of the TIOT entries.               */
      tiotvar=peek(peek(021CX)+12)+24;
         /* Loop through all TIOT entries until the */
         /* TIOT entry length (indicated by the     */
         /* value of the first byte) is 0.          */
      do while(peek(tiotvar,1));
            /* Check to see whether the current TIOT */
            /* entry is a freed TIOT entry           */
            /* (indicated by the high order bit of   */
            /* the second byte of the TIOT entry).   */
            /* If it is not freed, then proceed.     */
         if peek(tiotvar+1,1) NE '1.......'B then
            do;
               ddname=peekc(tiotvar+4);
               jfcb=peek(tiotvar+12,3);
               volser=peekc(jfcb+134);
                  /* Add 16 to the JFCB value to get */
                  /* the data set name. The data set */
                  /* name is 44 bytes.               */
               dsname=peekc(jfcb+16);
               output;
            end;
            /* Increment the TIOTVAR value to point  */
            /* to the next TIOT entry. This is done  */
            /* by adding the length of the current   */
            /* TIOT entry (indicated by first byte   */
            /* of the entry) to the current value    */
            /* of TIOTVAR.                           */
         tiotvar+peek(tiotvar,1);
      end;
         /* The final DATA step view does not       */
         /* contain the TIOTVAR and JFCB variables. */
      keep ddname volser dsname;
   run;

      /* Print the TIOT entries. */
   proc print data=save.tiot uniform width=minimum;
   run;
In the PROC PRINT statement, the UNIFORM option ensures that each page of the output is formatted exactly the same way. WIDTH=MINIMUM causes the PRINT procedure to use the minimum column width for each variable on the page. The column width is defined by the longest data value in that column.
See Also CALL Routine: “CALL POKE Routine” on page 462 Functions: “ADDR Function” on page 360 “PEEK Function” on page 967
PEEKCLONG Function
Stores the contents of a memory address in a character variable on 32-bit and 64-bit platforms.
Category: Special
See: PEEKCLONG Function in the documentation for your operating environment.

Syntax
PEEKCLONG(address<,length>)

Arguments
address
   specifies a character constant, variable, or expression that contains the binary pointer address.
length
   is a numeric constant, variable, or expression that specifies the length of the character data. Default: 8 Range: 1 to 32,767
Details If you do not have access to the memory storage location that you are requesting, the PEEKCLONG function returns an “Invalid argument” error.
Comparisons The PEEKCLONG function stores the contents of a memory address in a character variable. The PEEKLONG function stores the contents of a memory address in a numeric variable. It assumes that the input address refers to an integer in memory.
Examples
Example 1: Example for a 32-bit Platform
The following example obtains the pointer address of the character variable X and stores the first two characters found at that address in the character variable Z.
   data _null_;
      x='ABCDE';
      y=addrlong(x);
      z=peekclong(y,2);
      put z=;
   run;
The output from the SAS log is: z=AB

Example 2: Example for a 64-bit Platform
The following example, specific to the z/OS operating environment, returns the pointer address for the character variable Y.
   data _null_;
      length y $4;
      x220addr=put(220x,pib4.);
      ascb=peeklong(x220addr);
      ascbaddr=put(ascb,pib4.);
      y=peekclong(ascbaddr);
   run;
The output from the SAS log is: y='ASCB'
See Also Function: “PEEKLONG Function” on page 973
PEEKLONG Function
Stores the contents of a memory address in a numeric variable on 32-bit and 64-bit platforms.
Category: Special
See: PEEKLONG Function in the documentation for your operating environment

Syntax
PEEKLONG(address<,length>)

Arguments
address
   specifies a character constant, variable, or expression that contains the binary pointer address.
length
   is a numeric constant, variable, or expression that specifies the length of the character data. Default: 4 on 32-bit computers; 8 on 64-bit computers. Range: 1-4 on 32-bit computers; 1-8 on 64-bit computers.
Details
If you do not have access to the memory storage location that you are requesting, the PEEKLONG function returns an “Invalid argument” error.
Comparisons The PEEKLONG function stores the contents of a memory address in a numeric variable. It assumes that the input address refers to an integer in memory. The PEEKCLONG function stores the contents of a memory address in a character variable. It assumes that the input address refers to character data.
Examples
Example 1: Example for a 32-bit Platform
The following example returns the pointer address for the numeric variable Z.
   data _null_;
      length y $4;
      y=put(1,IB4.);
      addry=addrlong(y);
      z=peeklong(addry,4);
      put z=;
   run;
The output from the SAS log is: z=1

Example 2: Example for a 64-bit Platform
The following example, specific to the z/OS operating environment, returns the pointer address for the numeric variable X.
   data _null_;
      x=peeklong(put(16,pib4.));
      put x=hex8.;
   run;
The output from the SAS log is: x=00FCFCB0
See Also Function: “PEEKCLONG Function” on page 971
PERM Function
Computes the number of permutations of n items that are taken r at a time.
Category: Combinatorial

Syntax
PERM(n<,r>)

Arguments
n
   is an integer that represents the total number of elements from which the sample is chosen.
r
   is an integer value that represents the number of chosen elements. If r is omitted, the function returns the factorial of n. Restriction: r ≤ n
Details
The mathematical representation of the PERM function is given by the following equation:

$$\mathrm{PERM}(n, r) = \frac{n!}{(n - r)!}$$

with n ≥ 0, r ≥ 0, and n ≥ r. If the expression cannot be computed, a missing value is returned. For moderately large values, it is sometimes not possible to compute the PERM function.
Examples
SAS Statements          Results
x=perm(5,1);            5
x=perm(5);              120
x=perm(5,2);            20
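The equation in the Details section can be checked against the FACT function, and the LPERM function (listed under See Also) offers a way around the overflow that occurs for larger arguments. The following sketch uses illustrative variable names (BY_PERM, BY_FACT, LOG_BIG); the choice of 200 and 100 for the larger case is arbitrary.

```sas
data _null_;
   n=5;
   r=2;
   /* PERM(n,r) should equal n!/(n-r)! */
   by_perm=perm(n,r);
   by_fact=fact(n)/fact(n-r);
   /* For larger arguments, work with the logarithm of the count */
   log_big=lperm(200,100);
   put by_perm= by_fact= log_big=;
run;
```

Both BY_PERM and BY_FACT are expected to be 20, in agreement with the last row of the table.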
See Also Functions: “COMB Function” on page 570 “FACT Function” on page 654 “LPERM Function” on page 885
POINT Function Locates an observation that is identified by the NOTE function. Category: SAS File I/O
Syntax POINT(data-set-id,note-id)
Arguments
data-set-id
   is a numeric variable that specifies the data set identifier that the OPEN function returns.
note-id
   is a numeric variable that specifies the identifier assigned to the observation by the NOTE function.
Details POINT returns 0 if the operation was successful, ≠0 if it was not successful. POINT prepares the program to read from the SAS data set. The Data Set Data Vector is not updated until a read is done using FETCH or FETCHOBS.
Examples
This example calls NOTE to obtain an observation ID for the most recently read observation of the SAS data set MYDATA. It calls POINT to point to that observation, and calls FETCH to return the observation marked by the pointer.
   %let dsid=%sysfunc(open(mydata,i));
   %let rc=%sysfunc(fetch(&dsid));
   %let noteid=%sysfunc(note(&dsid));
   ...more macro statements...
   %let rc=%sysfunc(point(&dsid,&noteid));
   %let rc=%sysfunc(fetch(&dsid));
   ...more macro statements...
   %let rc=%sysfunc(close(&dsid));
See Also Functions: “DROPNOTE Function” on page 643 “NOTE Function” on page 925 “OPEN Function” on page 948
POISSON Function
Returns the probability from a Poisson distribution.
Category: Probability
See: “CDF Function” on page 540

Syntax
POISSON(m,n)

Arguments
m
   is a numeric mean parameter. Range: m ≥ 0
n
   is an integer random variable. Range: n ≥ 0
Details The POISSON function returns the probability that an observation from a Poisson distribution, with mean m, is less than or equal to n. To compute the probability that an observation is equal to a given value, n, compute the difference of two probabilities from the Poisson distribution for n and n−1.
Examples
SAS Statements          Results
x=poisson(1,2);         0.9196986029
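The difference of two cumulative probabilities that is described in the Details section can be sketched as follows and compared with the point probability from the PDF function. The variable names (P_DIFF, P_PDF) are chosen here for illustration. Note that the argument order differs: POISSON takes the mean first, whereas PDF('POISSON',...) takes the value n first.

```sas
data _null_;
   m=1;   /* mean parameter */
   n=2;   /* value at which the point probability is wanted */
   /* P(X=n) as the difference of two cumulative probabilities */
   p_diff=poisson(m,n)-poisson(m,n-1);
   /* Same probability from the PDF function */
   p_pdf=pdf('POISSON',n,m);
   put p_diff= p_pdf=;
run;
```

Both values should be approximately 0.18394, the result that the PDF function examples list for pdf('POISSON',2,1).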
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBBETA Function
Returns the probability from a beta distribution.
Category: Probability
See: “CDF Function” on page 540

Syntax
PROBBETA(x,a,b)

Arguments
x
   is a numeric random variable. Range: 0 ≤ x ≤ 1
a
   is a numeric shape parameter. Range: a > 0
b
   is a numeric shape parameter. Range: b > 0
Details The PROBBETA function returns the probability that an observation from a beta distribution, with shape parameters a and b, is less than or equal to x.
Examples
SAS Statements            Results
x=probbeta(.2,3,4);       0.09888
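Because PROBBETA returns a cumulative probability, the same value can be requested through the CDF function that the See entry points to. The following sketch uses illustrative variable names (P_PROBBETA, P_CDF) and the default location parameters of the beta distribution (0 and 1).

```sas
data _null_;
   /* Cumulative probability from PROBBETA */
   p_probbeta=probbeta(.2,3,4);
   /* Same probability from the CDF function */
   p_cdf=cdf('BETA',.2,3,4);
   put p_probbeta= p_cdf=;
run;
```

Both values are expected to match the 0.09888 result in the table.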
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBBNML Function
Returns the probability from a binomial distribution.
Category: Probability
See: “CDF Function” on page 540, “PDF Function” on page 954

Syntax
PROBBNML(p,n,m)

Arguments
p
   is a numeric probability of success parameter. Range: 0 ≤ p ≤ 1
n
   is an integer number of independent Bernoulli trials parameter. Range: n > 0
m
   is an integer number of successes random variable. Range: 0 ≤ m ≤ n
Details
The PROBBNML function returns the probability that an observation from a binomial distribution, with probability of success p and number of trials n, is less than or equal to m, the number of successes. To compute the probability that an observation is equal to a given value m, compute the difference of two probabilities from the binomial distribution for m and m−1 successes.
Examples
SAS Statements              Results
x=probbnml(0.5,10,4);       0.376953125
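The point probability that the Details section describes, computed as the difference of two cumulative probabilities, can be sketched as follows. The variable names (P_DIFF, P_PDF) are chosen here for illustration. Note the different argument order: PROBBNML takes (p,n,m), whereas PDF('BINOMIAL',...) takes (m,p,n).

```sas
data _null_;
   p=0.5;   /* probability of success */
   n=10;    /* number of trials       */
   m=4;     /* number of successes    */
   /* P(X=m) as the difference of two cumulative probabilities */
   p_diff=probbnml(p,n,m)-probbnml(p,n,m-1);
   /* Same probability from the PDF function */
   p_pdf=pdf('BINOMIAL',m,p,n);
   put p_diff= p_pdf=;
run;
```

Both values should be approximately 0.20508, the result that the PDF function examples list for pdf('BINOM',4,.5,10).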
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBBNRM Function
Returns a probability from a bivariate normal distribution.
Category: Probability

Syntax
PROBBNRM(x, y, r)

Arguments
x
   specifies a numeric constant, variable, or expression.
y
   specifies a numeric constant, variable, or expression.
r
   is a numeric correlation coefficient. Range: -1 ≤ r ≤ 1
Details The PROBBNRM function returns the probability that an observation (X, Y) from a standardized bivariate normal distribution with mean 0, variance 1, and a correlation coefficient r, is less than or equal to (x, y). That is, it returns the probability that X≤x and Y≤y. The following equation describes the PROBBNRM function, where u and v represent the random variables x and y, respectively:
Zx Zy
1 u 2 0 2ruv + v 2 PROBBNRM (x; y; r ) = p exp 0 2 (1 0 r 2 ) 2 1 0 r2 0101 Examples SAS Statements
Result
p=probbnrm(.4, -.3, .2); put p;
0.2783183345
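One way to sanity-check PROBBNRM is the independent case: with r=0 the integrand in the equation above factors, so the joint probability should equal the product of two PROBNORM values. The sketch below assumes nothing beyond that identity; the variable names are illustrative.

```sas
data _null_;
   x = .4; y = -.3;
   /* With r=0 the components are independent, so the joint */
   /* probability factors into two univariate normal CDFs.  */
   joint   = probbnrm(x, y, 0);
   product = probnorm(x) * probnorm(y);
   /* The documented example, with correlation r=.2 */
   p       = probbnrm(x, y, .2);
   put joint= product= p=;
run;
```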
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBCHI Function
Returns the probability from a chi-square distribution.
Category: Probability
See: “CDF Function” on page 540
Syntax
PROBCHI(x,df< ,nc>)
Arguments
x
is a numeric random variable.
Range: x ≥ 0
df
is a numeric degrees of freedom parameter.
Range: df > 0
nc
is an optional numeric noncentrality parameter.
Range: nc ≥ 0
Details
The PROBCHI function returns the probability that an observation from a chi-square distribution, with degrees of freedom df and noncentrality parameter nc, is less than or equal to x. This function accepts a noninteger degrees of freedom parameter df. If the optional parameter nc is not specified or has the value 0, the value returned is from the central chi-square distribution.
Examples
SAS Statements            Results
x=probchi(11.264,11);     0.5785813293
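Because PROBCHI returns a lower-tail probability, an upper-tail probability (for example, the p-value of a chi-square test statistic) is obtained by subtracting from 1. The following sketch is an illustration rather than part of the original entry; it assumes the test statistic and degrees of freedom from the example above and compares the result with the SDF function.

```sas
data _null_;
   chisq = 11.264; df = 11;
   lower = probchi(chisq, df);            /* P(X <= chisq)              */
   upper = 1 - lower;                     /* P(X >  chisq), the p-value */
   upper_sdf = sdf('CHISQUARE', chisq, df);  /* same tail via SDF       */
   put lower= upper= upper_sdf=;
run;
```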
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBF Function
Returns the probability from an F distribution.
Category: Probability
See: “CDF Function” on page 540
Syntax
PROBF(x,ndf,ddf< ,nc>)
Arguments
x
is a numeric random variable.
Range: x ≥ 0
ndf
is a numeric numerator degrees of freedom parameter.
Range: ndf > 0
ddf
is a numeric denominator degrees of freedom parameter.
Range: ddf > 0
nc
is an optional numeric noncentrality parameter.
Range: nc ≥ 0
Details
The PROBF function returns the probability that an observation from an F distribution, with numerator degrees of freedom ndf, denominator degrees of freedom ddf, and noncentrality parameter nc, is less than or equal to x. The PROBF function accepts noninteger degrees of freedom parameters ndf and ddf. If the optional parameter nc is not specified or has the value 0, the value returned is from the central F distribution. The significance level for an F test statistic is given by
p=1-probf(x,ndf,ddf);
Examples
SAS Statements         Results
x=probf(3.32,2,3);     0.8263933602
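The significance-level formula from the Details can be applied to the example values as in the sketch below. The comparison with SDF('F',...) is an added sanity check, and the variable names are illustrative.

```sas
data _null_;
   f = 3.32; ndf = 2; ddf = 3;
   /* Significance level of the F statistic: upper-tail probability */
   p     = 1 - probf(f, ndf, ddf);
   /* The same upper tail computed directly with the SDF function   */
   p_sdf = sdf('F', f, ndf, ddf);
   put p= p_sdf=;
run;
```

With the result in the table above, p is about 0.1736, so this F statistic would not be significant at the 5% level.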
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBGAM Function
Returns the probability from a gamma distribution.
Category: Probability
See: “CDF Function” on page 540
Syntax
PROBGAM(x,a)
Arguments
x
is a numeric random variable.
Range: x ≥ 0
a
is a numeric shape parameter.
Range: a > 0
Details
The PROBGAM function returns the probability that an observation from a gamma distribution, with shape parameter a, is less than or equal to x.
Examples
SAS Statements       Results
x=probgam(1,3);      0.0803013971
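For an integer shape parameter the gamma CDF has a closed form, which gives an independent check of the example above. The sketch assumes the unit-scale gamma distribution that PROBGAM uses and the default scale of 1 in CDF('GAMMA',...); the variable names are illustrative.

```sas
data _null_;
   x = 1; a = 3;
   p_probgam = probgam(x, a);
   p_cdf     = cdf('GAMMA', x, a);        /* scale defaults to 1 */
   /* Closed form for shape a=3:                         */
   /* P(X <= x) = 1 - exp(-x) * (1 + x + x**2/2)         */
   p_closed  = 1 - exp(-x) * (1 + x + x**2/2);
   put p_probgam= p_cdf= p_closed=;
run;
```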
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBHYPR Function
Returns the probability from a hypergeometric distribution.
Category: Probability
See: “CDF Function” on page 540
Syntax
PROBHYPR(N,K,n,x< ,r>)
Arguments
N
is an integer population size parameter.
Range: N ≥ 1
K
is an integer number of items in the category of interest parameter.
Range: 0 ≤ K ≤ N
n
is an integer sample size parameter.
Range: 0 ≤ n ≤ N
x
is an integer random variable.
Range: max(0, K + n−N) ≤ x ≤ min(K,n)
r
is an optional numeric odds ratio parameter.
Range: r ≥ 0
Details
The PROBHYPR function returns the probability that an observation from an extended hypergeometric distribution, with population size N, number of items K, sample size n, and odds ratio r, is less than or equal to x. If the optional parameter r is not specified or is set to 1, the value returned is from the usual hypergeometric distribution.
Examples
SAS Statements                Results
x=probhypr(200,50,10,2);      0.5236734081
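The role of the optional odds ratio r can be illustrated with the example values. In this sketch, the first two calls should return the same value, because the Details state that omitting r is equivalent to r=1; the third call uses an arbitrary odds ratio of 2, chosen here only for illustration.

```sas
data _null_;
   /* Usual hypergeometric distribution: r omitted */
   p_default = probhypr(200, 50, 10, 2);
   /* Explicit odds ratio of 1: same distribution  */
   p_r1      = probhypr(200, 50, 10, 2, 1);
   /* Extended hypergeometric with odds ratio 2    */
   /* (illustrative value, not from the manual)    */
   p_r2      = probhypr(200, 50, 10, 2, 2);
   put p_default= p_r1= p_r2=;
run;
```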
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBIT Function
Returns a quantile from the standard normal distribution.
Category: Quantile
Syntax
PROBIT(p)
Arguments
p
is a numeric probability.
Range: 0 < p < 1
Details
The PROBIT function returns the pth quantile from the standard normal distribution. The probability that an observation from the standard normal distribution is less than or equal to the returned quantile is p.
CAUTION: The result could be truncated to lie between -8.222 and 7.941.
Note: PROBIT is the inverse of the PROBNORM function.
Examples
SAS Statements        Results
x=probit(.025);       -1.959963985
x=probit(1.e-7);      -5.199337582
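Because PROBIT is the inverse of PROBNORM, applying PROBNORM to a PROBIT result should recover the original probability. The following sketch shows the round trip for a few illustrative probabilities; the variable names are not from the manual.

```sas
data _null_;
   do p = .025, .5, .975;
      z    = probit(p);      /* quantile for probability p      */
      back = probnorm(z);    /* should reproduce p up to rounding */
      put p= z= back=;
   end;
run;
```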
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBMC Function
Returns a probability or a quantile from various distributions for multiple comparisons of means.
Category: Probability
Syntax
PROBMC(distribution, q, prob, df, nparms<, parameters>)
Arguments
distribution
is a character constant, variable, or expression that identifies the distribution. The following are valid distributions:
Distribution          Argument
Analysis of Means     ANOM
One-sided Dunnett     DUNNETT1
Two-sided Dunnett     DUNNETT2
Maximum Modulus       MAXMOD
Partitioned Range     PARTRANGE
Studentized Range     RANGE
Williams              WILLIAMS
q
is the quantile from the distribution.
Restriction: Either q or prob can be specified, but not both.
prob
is the left probability from the distribution.
Restriction: Either prob or q can be specified, but not both.
df
is the degrees of freedom.
Note: A missing value is interpreted as an infinite value.
nparms
is the number of treatments.
Note: For DUNNETT1 and DUNNETT2, the control group is not counted.
parameters
is an optional set of nparms parameters that must be specified to handle the case of unequal sample sizes. The meaning of parameters depends on the value of distribution. If parameters is not specified, equal sample sizes are assumed, which is usually the case for a null hypothesis.
Details
The PROBMC function returns the probability or the quantile from various distributions with finite and infinite degrees of freedom for the variance estimate. The prob argument is the probability that the random variable is less than q. Therefore, p-values can be computed as 1 – prob. For example, to compute the critical value for a 5% significance level, set prob = 0.95. The precision of the computed probability is $O(10^{-8})$ (absolute error); the precision of the computed quantile is $O(10^{-5})$.
Note: The studentized range is not computed for finite degrees of freedom and unequal sample sizes.
Note: Williams’ test is computed only for equal sample sizes.
Formulas and Parameters
The equations listed here define expressions used in equations that relate the probability, prob, and the quantile, q, for different distributions and different situations within each distribution. For these equations, let ν be the degrees of freedom, df.
$$d\mu_\nu(x) = \left[\Gamma\left(\frac{\nu}{2}\right) 2^{\frac{\nu}{2}-1}\right]^{-1} \nu^{\frac{\nu}{2}}\, x^{\nu-1}\, e^{-\frac{\nu x^2}{2}}\, dx$$
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$$
$$\Phi(x) = \int_{-\infty}^{x} \phi(u)\, du$$
Computing the Analysis of Means
Analysis of Means (ANOM) applies to data that is organized as k (Gaussian) samples, the ith sample being of size $n_i$. Let $I = \sqrt{-1}$. The distribution function [1, 2, 3, 4, 5] is the CDF for the maximum absolute value of a k-dimensional multivariate T vector, with ν degrees of freedom, and an associated correlation matrix $\rho_{ij} = -\alpha_i \alpha_j$. This equation can be written as
$$prob = \Pr\left(|t_1| < h,\ |t_2| < h,\ \ldots,\ |t_k| < h\right)$$
... d, with d being any real number.
$$\begin{aligned}
\Pr(Y_k > d) &= \Pr(U_1 > d) + \Pr(U_2 > d,\ U_1 < d) + \Pr(U_3 > d,\ U_2 < d,\ U_1 < d) + \cdots + \Pr(U_k > d,\ U_{k-1} < d,\ \ldots,\ U_1 < d) \\
&= \Pr(Y_{k-1} > d) + \Pr(X_k + (k-1)\,U_{k-1} > kd\ \ldots)
\end{aligned}$$
To compute this probability, start from an N(0,1) density function
$$D(U_1 = x) = \phi(x)$$
and recursively compute the convolution
$$D(U_k = x,\ U_{k-1} < d,\ \ldots,\ U_1 < d) = \int_{-\infty}^{d} D(U_{k-1} = y,\ U_{k-2} < d,\ \ldots,\ U_1 < d)\,(k-1)\,\phi\left(kx - (k-1)y\right) dy$$
From this sequential convolution, it is possible to compute all the elements of the recursive equation for $\Pr(Y_k < d)$, shown previously.
2. Compute the distribution of $Y_k - Z$. This computation involves another convolution to compute the probability
$$\Pr\left((Y_k - Z) > d\right) = \int_{-\infty}^{\infty} \Pr\left(Y_k > \sqrt{2}\,d + y\right) \phi(y)\, dy$$
3. Compute the distribution of $(Y_k - Z)/S$. This computation involves another convolution to compute the probability
$$\Pr\left((Y_k - Z) > tS\right) = \int_{0}^{\infty} \Pr\left((Y_k - Z) > ty\right) d\mu_\nu(y)$$
The third stage is not needed when ν = ∞. Due to the complexity of the operations, this lengthy algorithm is replaced by a much faster one when k ≤ 15 for both finite and infinite degrees of freedom ν. For k ≥ 16, the lengthy computation is carried out. It is extremely expensive and very slow due to the complexity of the algorithm.
Comparisons
The MEANS statement in the GLM Procedure of SAS/STAT Software computes the following tests:
• Dunnett’s one-sided test
• Dunnett’s two-sided test
• Studentized Range
Examples
Example 1: Computing Probabilities by Using PROBMC
This example shows how to compute probabilities.
data probs;
   array par{5};
   par{1}=.5;
   par{2}=.51;
   par{3}=.55;
   par{4}=.45;
   par{5}=.2;
   df=40;
   q=1;
   do test="dunnett1","dunnett2", "maxmod";
      prob=probmc(test, q, ., df, 5, of par1--par5);
      put test $10. df q e18.13 prob e18.13;
   end;
run;
SAS writes the following results to the log:
Output 4.65 Probabilities from PROBMC
DUNNETT1   40   1.00000000000E+00   4.82992196083E-01
DUNNETT2   40   1.00000000000E+00   1.64023105316E-01
MAXMOD     40   1.00000000000E+00   8.02784203408E-01
Example 2: Computing the Analysis of Means
data _null_;
   q1=probmc('anom',.,0.9,.,20);
   put q1=;
   q2=probmc('anom',.,0.9,20,5,0.1,0.1,0.1,0.1,0.1);
   put q2=;
   q3=probmc('anom',.,0.9,20,5,0.5,0.5,0.5,0.5,0.5);
   put q3=;
   q4=probmc('anom',.,0.9,20,5,0.1,0.2,0.3,0.4,0.5);
   put q4=;
run;
SAS writes the following output to the log:
q1=2.7892895753
q2=2.4549773558
q3=2.4549773558
q4=2.4532130238
Example 3: Comparing Means
This example shows how to compare group means to find where the significant differences lie. The data for this example is taken from a paper by Duncan (1955) (See “References” on page 1213) and can also be found in Hochberg and Tamhane (1987) (See “References” on page 1213). The following values are the group means:
49.6 71.2 67.6 61.5 71.3 58.1 61.0
For this data, the mean square error is s² = 79.64 (s = 8.924) with ν = 30.
data duncan;
   array tr{7}$;
   array mu{7};
   n=7;
   do i=1 to n;
      input tr{i} $1. mu{i};
   end;
   input df s alpha;
   prob= 1-alpha;
   /* compute the interval */
   x = probmc("RANGE", ., prob, df, 7);
   w = x * s / sqrt(6);
   /* compare the means */
   do i = 1 to n;
      do j = i + 1 to n;
         dmean = abs(mu{i} - mu{j});
         if dmean >= w then
            do;
               put tr{i} tr{j} dmean;
            end;
      end;
   end;
   datalines;
A 49.6
B 71.2
C 67.6
D 61.5
E 71.3
F 58.1
G 61.0
30 8.924 .05
;
run;
SAS writes the following output to the log:
Output 4.66 Group Differences
A B 21.6
A C 18
A E 21.7
Example 4: Computing the Partitioned Range
data _null_;
   q1=probmc('partrange',.,0.9,.,4,3,4,5,6);
   put q1=;
   q2=probmc('partrange',.,0.9,12,4,3,4,5,6);
   put q2=;
run;
SAS writes the following output to the log:
q1=4.1022395729
q2=4.788862411
Example 5: Computing Confidence Intervals
This example shows how to compute 95% one-sided and two-sided confidence intervals of Dunnett’s test. This example and the data come from Dunnett (1955) (See “References” on page 1213) and can also be found in Hochberg and Tamhane (1987) (See “References” on page 1213). The data are blood count measurements on three groups of animals. As shown in the following table, the third group serves as the control, while the first two groups were treated with different drugs. The numbers of animals in these three groups are unequal.
Treatment Group:    Drug A    Drug B    Control
                    9.76      12.80     7.40
                    8.80      9.68      8.50
                    7.68      12.16     7.20
                    9.36      9.20      8.24
                              10.55     9.84
                                        8.32
Group Mean          8.90      10.88     8.25
n                   4         5         6
The mean square error is s² = 1.3805 (s = 1.175) with ν = 12.
data a;
   array drug{3}$;
   array count{3};
   array mu{3};
   array lambda{2};
   array delta{2};
   array left{2};
   array right{2};
   /* input the table */
   do i = 1 to 3;
      input drug{i} count{i} mu{i};
   end;
   /* input the alpha level, */
   /* the degrees of freedom, */
   /* and the root mean square error */
   input alpha df s;
   /* from the sample size, */
   /* compute the lambdas */
   do i = 1 to 2;
      lambda{i} = sqrt(count{i}/
         (count{i} + count{3}));
   end;
   /* run the one-sided Dunnett's test */
   test="dunnett1";
   x = probmc(test, ., 1 - alpha, df, 2, of lambda1-lambda2);
   do i = 1 to 2;
      delta{i} = x * s * sqrt(1/count{i} + 1/count{3});
      left{i} = mu{i} - mu{3} - delta{i};
   end;
   put test $10. x left{1} left{2};
   /* run the two-sided Dunnett's test */
   test="dunnett2";
   x = probmc(test, ., 1 - alpha, df, 2, of lambda1-lambda2);
   do i=1 to 2;
      delta{i} = x * s * sqrt(1/count{i} + 1/count{3});
      left{i} = mu{i} - mu{3} - delta{i};
      right{i} = mu{i} - mu{3} + delta{i};
   end;
   put test $10. left{1} right{1};
   put test $10. left{2} right{2};
   datalines;
A 4 8.90
B 5 10.88
C 6 8.25
0.05 12 1.175
;
run;
SAS writes the following output to the log:
Output 4.67 Confidence Intervals
DUNNETT1   2.1210786586 -0.958751705 1.1208571303
DUNNETT2   -1.256411895 2.5564118953
DUNNETT2   0.8416271203 4.4183728797
Example 6: Computing Williams’ Test
In the following example, a substance has been tested at seven levels in a randomized block design of eight blocks. The observed treatment means are as follows:
Treatment    Mean
X0           10.4
X1           9.9
X2           10.0
X3           10.6
X4           11.4
X5           11.9
X6           11.7
The mean square, with (7 – 1)(8 – 1) = 42 degrees of freedom, is s² = 1.16. Determine the maximum likelihood estimates Mi through the averaging process.
• Because X0 > X1, form X0,1 = (X0 + X1)/2 = 10.15.
• Because X0,1 > X2, form X0,1,2 = (X0 + X1 + X2)/3 = (2X0,1 + X2)/3 = 10.1.
• X0,1,2 < X3 < X4 < X5
• Because X5 > X6, form X5,6 = (X5 + X6)/2 = 11.8.
Now the order restriction is satisfied. The maximum likelihood estimates under the alternative hypothesis are:
M0 = M1 = M2 = X0,1,2 = 10.1
M3 = X3 = 10.6
M4 = X4 = 11.4
M5 = M6 = X5,6 = 11.8
Now compute
$$t = \frac{11.8 - 10.4}{\sqrt{2s^2/8}} = 2.60$$
and the probability that corresponds to k = 6, ν = 42, and t = 2.60 is .9924467341, which shows strong evidence that there is a response to the substance. You can also compute the quantiles for the upper 5% and 1% tails, as shown in the following table.
SAS Statements                             Results
prob=probmc("williams",2.6,.,42,6);        0.99244673
quant5=probmc("williams",.,.95,42,6);      1.80654052
quant1=probmc("williams",.,.99,42,6);      2.49087829
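The averaging process in this example can be sketched as a DATA step that pools adjacent means whenever the order restriction is violated, then computes the test statistic and passes it to PROBMC. This is an illustrative sketch, not the method that SAS uses internally; the array names val and wt and the variable reps are assumptions introduced here.

```sas
data _null_;
   /* Observed treatment means X0-X6 from the table in this example */
   array x{7}   _temporary_ (10.4 9.9 10.0 10.6 11.4 11.9 11.7);
   array val{7} _temporary_;   /* pooled block estimates           */
   array wt{7}  _temporary_;   /* number of means pooled per block */
   s2   = 1.16;                /* mean square                      */
   reps = 8;                   /* blocks (observations per mean)   */
   nblk = 0;
   do i = 1 to 7;
      /* start a new block with the next treatment mean */
      nblk = nblk + 1;
      val{nblk} = x{i};
      wt{nblk}  = 1;
      /* pool with the previous block while it exceeds the current one */
      do while (nblk > 1);
         if val{nblk-1} <= val{nblk} then leave;
         val{nblk-1} = (wt{nblk-1}*val{nblk-1} + wt{nblk}*val{nblk})
                       / (wt{nblk-1} + wt{nblk});
         wt{nblk-1}  = wt{nblk-1} + wt{nblk};
         nblk = nblk - 1;
      end;
   end;
   /* list the pooled estimates and how many means each one covers */
   do b = 1 to nblk;
      estimate = val{b};
      pooled   = wt{b};
      put b= estimate= pooled=;
   end;
   /* test statistic: highest-level estimate versus the X0 mean */
   t    = (val{nblk} - x{1}) / sqrt(2*s2/reps);
   prob = probmc("williams", t, ., 42, 6);
   put t= prob=;
run;
```

The pooled estimates should come out as 10.1 (three means), 10.6, 11.4, and 11.8 (two means), matching the values of Mi above. Because t is not rounded to 2.60 here, prob can differ slightly in the trailing digits from the value in the table.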
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
References
Guirguis, G. H. and R. D. Tobias. 2004. “On the computation of the distribution for the analysis of means.” Communications in Statistics: Simulation and Computation 33: 861–887.
Nelson, P. R. 1981. “Numerical evaluation of an equicorrelated multivariate non-central t distribution.” Communications in Statistics: Part B - Simulation and Computation 10: 41–50.
Nelson, P. R. 1982. “Exact critical points for the analysis of means.” Communications in Statistics: Part A - Theory and Methods 11: 699–709.
Nelson, P. R. 1982a. “An Approximation for the Complex Normal Probability Integral.” BIT 22(1): 94–100.
Nelson, P. R. 1988. “Application of the analysis of means.” Proceedings of the SAS Users Group International Conference 13: 225–230.
Nelson, P. R. 1991. “Numerical evaluation of multivariate normal integrals with correlations $\rho_{lj} = -\alpha_l \alpha_j$.” The Frontiers of Statistical Scientific Theory and Industrial Applications 2: 97–114.
Nelson, P. R. 1993. “Additional Uses for the Analysis of Means and Extended Tables of Critical Values.” Technometrics 35: 61–71.
PROBNEGB Function
Returns the probability from a negative binomial distribution.
Category: Probability
See: “CDF Function” on page 540
Syntax
PROBNEGB(p,n,m)
Arguments
p
is a numeric probability of success parameter.
Range: 0 ≤ p ≤ 1
n
is an integer number of successes parameter.
Range: n ≥ 1
m
is a nonnegative integer random variable, the number of failures.
Range: m ≥ 0
Details
The PROBNEGB function returns the probability that an observation from a negative binomial distribution, with probability of success p and number of successes n, is less than or equal to m. To compute the probability that an observation is equal to a given value m, compute the difference of two probabilities from the negative binomial distribution for m and m−1.
Examples
SAS Statements           Results
x=probnegb(0.5,2,1);     0.5
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBNORM Function
Returns the probability from the standard normal distribution.
Category: Probability
See: “CDF Function” on page 540
Syntax
PROBNORM(x)
Arguments
x
is a numeric random variable.
Details
The PROBNORM function returns the probability that an observation from the standard normal distribution is less than or equal to x.
Note: PROBNORM is the inverse of the PROBIT function.
Examples
SAS Statements        Results
x=probnorm(1.96);     0.9750021049
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROBT Function
Returns the probability from a t distribution.
Category: Probability
See: “CDF Function” on page 540, “PDF Function” on page 954
Syntax
PROBT(x,df< ,nc>)
Arguments
x
is a numeric random variable.
df
is a numeric degrees of freedom parameter.
Range: df > 0
nc
is an optional numeric noncentrality parameter.
Details
The PROBT function returns the probability that an observation from a Student’s t distribution, with degrees of freedom df and noncentrality parameter nc, is less than or equal to x. This function accepts a noninteger degrees of freedom parameter df. If the optional parameter, nc, is not specified or has the value 0, the value that is returned is from the central Student’s t distribution. The significance level of a two-tailed t test is given by
p=(1-probt(abs(x),df))*2;
Examples
SAS Statements       Results
x=probt(0.9,5);      0.7953143998
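The two-tailed formula from the Details can be applied to the example values as in the following sketch. The one-tailed line is added for comparison and is not part of the original entry; the variable names are illustrative.

```sas
data _null_;
   t = 0.9; df = 5;
   /* Two-tailed significance level, as given in the Details */
   p_two = (1 - probt(abs(t), df)) * 2;
   /* Upper one-tailed significance level, for comparison    */
   p_one = 1 - probt(t, df);
   put p_two= p_one=;
run;
```

With the result in the table above, p_two is about 0.409, which is twice the one-tailed value because the central t distribution is symmetric.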
See Also Functions: “CDF Function” on page 540 “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081
PROPCASE Function
Converts all words in an argument to proper case.
Category: Character
Restriction: “I18N Level 2” on page 306
Syntax
PROPCASE(argument <,delimiter>)
Arguments
argument
specifies a character constant, variable, or expression.
delimiter
specifies one or more delimiters that are enclosed in quotation marks. The default delimiters are blank, forward slash, hyphen, open parenthesis, period, and tab.
Tip: If you use this argument, then the default delimiters, including the blank, are no longer in effect.
Details
Length of Returned Variable
In a DATA step, if the PROPCASE function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes.
The Basics
The PROPCASE function copies a character argument and converts all uppercase letters to lowercase letters. It then converts to uppercase the first character of a word that is preceded by a blank, forward slash, hyphen, open parenthesis, period, or tab. PROPCASE returns the value that is altered. If you use the second argument, then the default delimiters are no longer in effect. The results of the PROPCASE function depend directly on the translation table that is in effect (see “TRANTAB System Option”) and indirectly on the "ENCODING System Option" and the "LOCALE System Option" in SAS National Language Support (NLS): Reference Guide.
Examples
Example 1: Changing the Case of Words
The following example shows how PROPCASE handles the case of words:
data _null_;
   input place $ 1-40;
   name=propcase(place);
   put name;
   datalines;
INTRODUCTION TO THE SCIENCE OF ASTRONOMY
VIRGIN ISLANDS (U.S.)
SAINT KITTS/NEVIS
WINSTON-SALEM, N.C.
;
run;
SAS writes the following output to the log:
Introduction To The Science Of Astronomy
Virgin Islands (U.S.)
Saint Kitts/Nevis
Winston-Salem, N.C.
Example 2: Using PROPCASE with a Second Argument
The following example uses a blank, a hyphen, and a single quotation mark as the second argument so that names such as O’Keeffe and Burne-Jones are written correctly.
options pageno=1 nodate ls=80 ps=64;
data names;
   infile datalines dlm='#';
   input CommonName : $20. CapsName : $20.;
   PropcaseName=propcase(capsname, " -'");
   datalines;
Delacroix, Eugene# EUGENE DELACROIX
O'Keeffe, Georgia# GEORGIA O'KEEFFE
Rockwell, Norman# NORMAN ROCKWELL
Burne-Jones, Edward# EDWARD BURNE-JONES
;
proc print data=names noobs;
   title 'Names of Artists';
run;
Output 4.68 Output Showing the Results of Using PROPCASE with a Second Argument
Names of Artists                                                  1
CommonName             CapsName               PropcaseName
Delacroix, Eugene      EUGENE DELACROIX       Eugene Delacroix
O'Keeffe, Georgia      GEORGIA O'KEEFFE       Georgia O'Keeffe
Rockwell, Norman       NORMAN ROCKWELL        Norman Rockwell
Burne-Jones, Edward    EDWARD BURNE-JONES     Edward Burne-Jones
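To see why the second argument matters, the following sketch applies PROPCASE to one name with and without a custom delimiter list. The variable names are illustrative. Note that the custom list replaces the defaults entirely, so the blank must be listed again.

```sas
data _null_;
   length name $20;
   name = "GEORGIA O'KEEFFE";
   /* Default delimiters do not include the single quotation mark, */
   /* so the letter after the apostrophe stays lowercase.          */
   default_delims = propcase(name);
   /* Custom delimiters: blank and single quotation mark.          */
   custom_delims  = propcase(name, " '");
   put default_delims= custom_delims=;
run;
```

The first call should return Georgia O'keeffe and the second Georgia O'Keeffe.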
See Also Functions: “UPCASE Function” on page 1136 “LOWCASE Function” on page 884
PRXCHANGE Function
Performs a pattern-matching replacement.
Category: Character String Matching
Syntax
PRXCHANGE(perl-regular-expression | regular-expression-id, times, source)
Arguments
perl-regular-expression
specifies a character constant, variable, or expression with a value that is a Perl regular expression.
regular-expression-id
specifies a numeric variable with a value that is a pattern identifier that is returned from the PRXPARSE function.
Restriction: If you use this argument, you must also use the PRXPARSE function.
times
is a numeric constant, variable, or expression that specifies the number of times to search for a match and replace a matching pattern.
Tip: If the value of times is –1, then matching patterns continue to be replaced until the end of source is reached.
source
specifies a character constant, variable, or expression that you want to search.
Details
The Basics
If you use regular-expression-id, the PRXCHANGE function searches the variable source with the regular-expression-id that is returned by PRXPARSE. It returns the value in source with the changes that were specified by the regular expression. If there is no match, PRXCHANGE returns the unchanged value in source. If you use perl-regular-expression, PRXCHANGE searches the variable source with the perl-regular-expression, and you do not need to call PRXPARSE. You can use PRXCHANGE with a perl-regular-expression in a WHERE clause and in PROC SQL. For more information about pattern matching, see “Pattern Matching Using Perl Regular Expressions (PRX)” on page 322.
Compiling a Perl Regular Expression
If perl-regular-expression is a constant or if it uses the /o option, then the Perl regular expression is compiled once and each use of PRXCHANGE reuses the compiled expression. If perl-regular-expression is not a constant and if it does not use the /o option, then the Perl regular expression is recompiled for each call to PRXCHANGE.
Note: The compile-once behavior occurs when you use PRXCHANGE in a DATA step, in a WHERE clause, or in PROC SQL. For all other uses, the perl-regular-expression is recompiled for each call to PRXCHANGE.
Performing a Match
Perl regular expressions consist of characters and special characters that are called metacharacters. When performing a match, SAS searches a source string for a substring that matches the Perl regular expression that you specify. To view a short list of Perl regular expression metacharacters that you can use when you build your code, see the table “Tables of Perl Regular Expression (PRX) Metacharacters” on page 2145. You can find a complete list of metacharacters at http://www.perl.com.
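The effect of the times argument can be seen with a small substitution. In this sketch, the pattern, the source string, and the variable names are illustrative choices, not taken from the manual.

```sas
data _null_;
   length once all $10;
   /* times=1: replace only the first match        */
   once = prxchange('s/a/X/', 1, 'banana');
   /* times=-1: replace every match in the source  */
   all  = prxchange('s/a/X/', -1, 'banana');
   put once= all=;
run;
```

The first call replaces a single occurrence (bXnana); the second continues to the end of the source string (bXnXnX).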
Comparisons The PRXCHANGE function is similar to the CALL PRXCHANGE routine except that the function returns the value of the pattern-matching replacement as a return argument instead of as one of its parameters. The Perl regular expression (PRX) functions and CALL routines work together to manipulate strings that match patterns. To see a list and short description of these functions and CALL routines, see the Character String Matching category in “Functions and CALL Routines by Category” on page 333.
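The difference between the function and the CALL routine can be sketched as follows. The example assumes a pattern identifier from PRXPARSE, because the CALL routine requires one; the variable names re, name, and new are illustrative.

```sas
data _null_;
   length name new $32;
   /* Compile the substitution once and keep its identifier */
   re   = prxparse('s/(\w+), (\w+)/$2 $1/');
   name = 'Jones, Fred';
   /* Function: the result is returned; name is left unchanged */
   new = prxchange(re, -1, name);
   put 'After the function: ' name= new=;
   /* CALL routine: the replacement is written back into name  */
   call prxchange(re, -1, name);
   put 'After the CALL routine: ' name=;
run;
```

After the function call, name still holds the original value and new holds the reordered name; after the CALL routine, name itself holds the reordered name.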
Examples
Example 1: Changing the Order of First and Last Names
Changing the Order of First and Last Names by Using the DATA Step
The following example uses the DATA step to change the order of first and last names.
/* Create a data set that contains a list of names. */
data ReversedNames;
   input name & $32.;
   datalines;
Jones, Fred
Kavich, Kate
Turley, Ron
Dulix, Yolanda
;
/* Reverse last and first names with a DATA step. */
options pageno=1 nodate ls=80 ps=64;
data names;
   set ReversedNames;
   name = prxchange('s/(\w+), (\w+)/$2 $1/', -1, name);
run;
proc print data=names;
run;
Output 4.69 Output from the DATA Step
The SAS System                1
Obs    name
1      Fred Jones
2      Kate Kavich
3      Ron Turley
4      Yolanda Dulix
Changing the Order of First and Last Names by Using PROC SQL
The following example uses PROC SQL to change the order of first and last names.
data ReversedNames;
   input name & $32.;
   datalines;
Jones, Fred
Kavich, Kate
Turley, Ron
Dulix, Yolanda
;
proc sql;
   create table names as
   select prxchange('s/(\w+), (\w+)/$2 $1/', -1, name) as name
   from ReversedNames;
quit;
options pageno=1 nodate ls=80 ps=64;
proc print data=names;
run;
Output 4.70 Output from PROC SQL
The SAS System                1
Obs    name
1      Fred Jones
2      Kate Kavich
3      Ron Turley
4      Yolanda Dulix
Example 2: Matching Rows That Have the Same Name
The following example compares the names in two data sets, and writes those names that are common to both data sets.
data names;
   input name & $32.;
   datalines;
Ron Turley
Judy Donnelly
Kate Kavich
Tully Sanchez
;
data ReversedNames;
   input name & $32.;
   datalines;
Jones, Fred
Kavich, Kate
Turley, Ron
Dulix, Yolanda
;
options pageno=1 nodate ls=80 ps=64;
proc sql;
   create table NewNames as
   select a.name
   from names as a, ReversedNames as b
   where a.name = prxchange('s/(\w+), (\w+)/$2 $1/', -1, b.name);
quit;
proc print data=NewNames;
run;
Output 4.71 Output from Matching Rows That Have the Same Names
The SAS System                1
Obs    name
1      Ron Turley
2      Kate Kavich
Example 3: Changing Lowercase Text to Uppercase
The following example uses the \U, \L and \E metacharacters to change the case of a string of text. Case modifications do not nest. In this example, note that “bear” does not convert to uppercase letters because the \E metacharacter ends all case modifications.
data _null_;
   length txt $32;
   txt = prxchange ('s/(big)(black)(bear)/\U$1\L$2\E$3/', 1, 'bigblackbear');
   put txt=;
run;
SAS returns the following output to the log:
txt=BIGblackbear
Example 4: Changing a Matched Pattern to a Fixed Value
This example locates a pattern in a variable and replaces the variable with a predefined value. The example uses a DATA step to find phone numbers and replace them with an informational message.
options nodate nostimer ls=78 ps=60;
/* Create data set that contains confidential information. */
data a;
   input text $80.;
   datalines;
The phone number for Ed is (801)443-9876 but not until tonight.
He can be reached at (910)998-8762 tomorrow for testing purposes.
;
run;
/* Locate confidential phone numbers and replace them with message */
/* indicating that they have been removed. */
data b;
   set a;
   text = prxchange('s/\([2-9]\d\d\) ?[2-9]\d\d-\d\d\d\d/*PHONE NUMBER REMOVED*/', -1, text);
   put text=;
run;
proc print data = b;
run;
Output 4.72 Output from Changing a Matched Pattern to a Fixed Value
The SAS System                1
Obs    text
1      The phone number for Ed is *PHONE NUMBER REMOVED* but not until tonight.
2      He can be reached at *PHONE NUMBER REMOVED* tomorrow for testing purposes.
See Also Functions and CALL routines: “CALL PRXCHANGE Routine” on page 464 “CALL PRXDEBUG Routine” on page 466 “CALL PRXFREE Routine” on page 469 “CALL PRXNEXT Routine” on page 470 “CALL PRXPOSN Routine” on page 472 “CALL PRXSUBSTR Routine” on page 474 “PRXMATCH Function” on page 1009 “PRXPAREN Function” on page 1013 “PRXPARSE Function” on page 1015 “PRXPOSN Function” on page 1017
PRXMATCH Function
Searches for a pattern match and returns the position at which the pattern is found.
Category: Character String Matching
Syntax
PRXMATCH (regular-expression-id | perl-regular-expression, source)
Arguments
regular-expression-id
specifies a numeric variable with a value that is a pattern identifier that is returned from the PRXPARSE function.
Restriction: If you use this argument, you must also use the PRXPARSE function.
perl-regular-expression
specifies a character constant, variable, or expression with a value that is a Perl regular expression.
source
specifies a character constant, variable, or expression that you want to search.
Details
The Basics
If you use regular-expression-id, then the PRXMATCH function searches source with the regular-expression-id that is returned by PRXPARSE, and returns the position at which the string begins. If there is no match, PRXMATCH returns a zero. If you use perl-regular-expression, PRXMATCH searches source with the perl-regular-expression, and you do not need to call PRXPARSE. You can use PRXMATCH with a Perl regular expression in a WHERE clause and in PROC SQL. For more information about pattern matching, see “Pattern Matching Using Perl Regular Expressions (PRX)” on page 322.
Compiling a Perl Regular Expression
If perl-regular-expression is a constant or if it uses the /o option, then the Perl regular expression is compiled once and each use of PRXMATCH reuses the compiled expression. If perl-regular-expression is not a constant and if it does not use the /o option, then the Perl regular expression is recompiled for each call to PRXMATCH.
Note: The compile-once behavior occurs when you use PRXMATCH in a DATA step, in a WHERE clause, or in PROC SQL. For all other uses, the perl-regular-expression is recompiled for each call to PRXMATCH.
Comparisons The Perl regular expression (PRX) functions and CALL routines work together to manipulate strings that match patterns. To see a list and short description of these functions and CALL routines, see the Character String Matching category in “Functions and CALL Routines by Category” on page 333.
Examples
Example 1: Finding the Position of a Substring in a String
Finding the Position of a Substring by Using PRXPARSE
The following example searches a string for a substring, and returns its position in the string.
data _null_;
   /* Use PRXPARSE to compile the Perl regular expression. */
   patternID = prxparse('/world/');
   /* Use PRXMATCH to find the position of the pattern match. */
   position=prxmatch(patternID, 'Hello world!');
   put position=;
run;
SAS writes the following line to the log: position=7
Finding the Position of a Substring by Using a Perl Regular Expression
The following example uses a Perl regular expression to search a string (Hello world) for a substring (world) and to return the position of the substring in the string.
data _null_;
   /* Use PRXMATCH to find the position of the pattern match. */
   position=prxmatch('/world/', 'Hello world!');
   put position=;
run;
SAS writes the following line to the log: position=7
Example 2: Finding the Position of a Substring in a String: A Complex Example
The following example uses several Perl regular expression functions and a CALL routine to find the position of a substring in a string.
data _null_;
   if _N_ = 1 then
      do;
         retain PerlExpression;
         pattern = "/(\d+):(\d\d)(?:\.(\d+))?/";
         PerlExpression = prxparse(pattern);
      end;
   array match[3] $ 8;
   input minsec $80.;
   position = prxmatch(PerlExpression, minsec);
   if position ^= 0 then
      do;
         do i = 1 to prxparen(PerlExpression);
            call prxposn(PerlExpression, i, start, length);
            if start ^= 0 then
               match[i] = substr(minsec, start, length);
         end;
         put match[1] "minutes, " match[2] "seconds" @;
         if ^missing(match[3]) then
            put ", " match[3] "milliseconds";
      end;
   datalines;
14:56.456
45:32
;
run;
The following lines are written to the SAS log:
14 minutes, 56 seconds, 456 milliseconds
45 minutes, 32 seconds
Example 3: Extracting a Zip Code from a Data Set
Extracting a Zip Code by Using the DATA Step
The following example uses a DATA step to search each observation in a data set for a nine-digit zip code, and writes those observations to the data set ZipPlus4.
data ZipCodes;
   input name: $16. zip:$10.;
   datalines;
Johnathan 32523-2343
Seth 85030
Kim 39204
Samuel 93849-3843
;
/* Extract ZIP+4 ZIP codes with the DATA step. */
data ZipPlus4;
   set ZipCodes;
   where prxmatch('/\d{5}-\d{4}/', zip);
run;
options nodate pageno=1 ls=80 ps=64;
proc print data=ZipPlus4;
run;

Output 4.73 Zip Code Output from the DATA Step
                 The SAS System                 1
Obs    name         zip
  1    Johnathan    32523-2343
  2    Samuel       93849-3843
Extracting a Zip Code by Using PROC SQL
The following example searches each observation in a data set for a nine-digit zip code, and writes those observations to the data set ZipPlus4.
data ZipCodes;
   input name: $16. zip:$10.;
   datalines;
Johnathan 32523-2343
Seth 85030
Kim 39204
Samuel 93849-3843
;
/* Extract ZIP+4 ZIP codes with PROC SQL. */
proc sql;
   create table ZipPlus4 as
      select * from ZipCodes
      where prxmatch('/\d{5}-\d{4}/', zip);
quit;
options nodate pageno=1 ls=80 ps=64;
proc print data=ZipPlus4;
run;

Output 4.74 Zip Code Output from PROC SQL
                 The SAS System                 1
Obs    name         zip
  1    Johnathan    32523-2343
  2    Samuel       93849-3843
See Also
Functions and CALL routines:
   “CALL PRXCHANGE Routine” on page 464
   “CALL PRXDEBUG Routine” on page 466
   “CALL PRXFREE Routine” on page 469
   “CALL PRXNEXT Routine” on page 470
   “CALL PRXPOSN Routine” on page 472
   “CALL PRXSUBSTR Routine” on page 474
   “PRXCHANGE Function” on page 1005
   “PRXPAREN Function” on page 1013
   “PRXPARSE Function” on page 1015
   “PRXPOSN Function” on page 1017
PRXPAREN Function
Returns the last bracket match for which there is a match in a pattern.
Category: Character String Matching
Restriction: Use with the PRXPARSE function.
Syntax
PRXPAREN (regular-expression-id)
Arguments
regular-expression-id
   specifies a numeric variable with a value that is an identification number that is returned by the PRXPARSE function.
Details The PRXPAREN function is useful in finding the largest capture-buffer number that can be passed to the CALL PRXPOSN routine, or in identifying which part of a pattern matched. For more information about pattern matching, see “Pattern Matching Using Perl Regular Expressions (PRX)” on page 322.
Comparisons The Perl regular expression (PRX) functions and CALL routines work together to manipulate strings that match patterns. To see a list and short description of these functions and CALL routines, see the Character String Matching category in “Functions and CALL Routines by Category” on page 333.
Examples
The following example uses Perl regular expressions and writes the results to the SAS log.
data _null_;
   ExpressionID = prxparse('/(magazine)|(book)|(newspaper)/');
   position = prxmatch(ExpressionID, 'find book here');
   if position then paren = prxparen(ExpressionID);
   put 'Matched paren ' paren;
   position = prxmatch(ExpressionID, 'find magazine here');
   if position then paren = prxparen(ExpressionID);
   put 'Matched paren ' paren;
   position = prxmatch(ExpressionID, 'find newspaper here');
   if position then paren = prxparen(ExpressionID);
   put 'Matched paren ' paren;
run;
The following lines are written to the SAS log:
Matched paren 2
Matched paren 1
Matched paren 3
See Also
Functions and CALL routines:
   “CALL PRXCHANGE Routine” on page 464
   “CALL PRXDEBUG Routine” on page 466
   “CALL PRXFREE Routine” on page 469
   “CALL PRXNEXT Routine” on page 470
   “CALL PRXPOSN Routine” on page 472
   “CALL PRXSUBSTR Routine” on page 474
   “PRXCHANGE Function” on page 1005
   “PRXMATCH Function” on page 1009
   “PRXPARSE Function” on page 1015
   “PRXPOSN Function” on page 1017
PRXPARSE Function
Compiles a Perl regular expression (PRX) that can be used for pattern matching of a character value.
Category: Character String Matching
Restriction: Use with other Perl regular expressions.
Syntax
regular-expression-id=PRXPARSE (perl-regular-expression)
Arguments
regular-expression-id
   is a numeric pattern identifier that is returned by the PRXPARSE function.
perl-regular-expression
   specifies a character value that is a Perl regular expression.
Details
The Basics
The PRXPARSE function returns a pattern identifier number that is used by other Perl functions and CALL routines to match patterns. If an error occurs in parsing the regular expression, SAS returns a missing value.
PRXPARSE uses metacharacters in constructing a Perl regular expression. To view a table of common metacharacters, see “Tables of Perl Regular Expression (PRX) Metacharacters” on page 2145.
For more information about pattern matching, see “Pattern Matching Using Perl Regular Expressions (PRX)” on page 322.
Compiling a Perl Regular Expression
If perl-regular-expression is a constant or if it uses the /o option, the Perl regular expression is compiled only once. Successive calls to PRXPARSE will not cause a recompile, but will return the regular-expression-id for the regular expression that was already compiled. This behavior simplifies the code because you do not need to use an initialization block (IF _N_ =1) to initialize Perl regular expressions.
Note: If you have a Perl regular expression that is a constant, or if the regular expression uses the /o option, then calling PRXFREE to free the memory allocation results in the need to recompile the regular expression the next time that it is called by PRXPARSE. The compile-once behavior occurs when you use PRXPARSE in a DATA step. For all other uses, the perl-regular-expression is recompiled for each call to PRXPARSE.
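As a minimal sketch of the two points above, the following DATA step passes a constant pattern to PRXPARSE without an IF _N_ = 1 block and checks for the missing value that signals a parsing error. The message text and the data lines are assumptions made for illustration.

```sas
data _null_;
   /* Constant pattern: compiled once, so no IF _N_ = 1 block is used here. */
   patternID = prxparse('/ave|avenue|dr|drive|rd|road/i');
   /* PRXPARSE returns a missing value if the expression cannot be parsed. */
   if missing(patternID) then
      do;
         put 'The regular expression did not compile.';
         stop;
      end;
   input street $80.;
   if prxmatch(patternID, street) then put 'Match found in: ' street;
   datalines;
6789 64th Ave
7493 Wilkes Place
;
```

Checking the identifier before it is used keeps a malformed pattern from being passed silently to the other PRX functions.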
Comparisons The Perl regular expression (PRX) functions and CALL routines work together to manipulate strings that match patterns. To see a list and short description of these functions and CALL routines, see the Character String Matching category in “Functions and CALL Routines by Category” on page 333.
Examples
The following example uses metacharacters and regular characters to construct a Perl regular expression. The example parses addresses and writes formatted results to the SAS log.
data _null_;
   if _N_ = 1 then
      do;
         retain patternID;
         /* The i option specifies a case insensitive search. */
         pattern = "/ave|avenue|dr|drive|rd|road/i";
         patternID = prxparse(pattern);
      end;
   input street $80.;
   call prxsubstr(patternID, street, position, length);
   if position ^= 0 then
      do;
         match = substr(street, position, length);
         put match:$QUOTE. "found in " street:$QUOTE.;
      end;
   datalines;
153 First Street
6789 64th Ave
4 Moritz Road
7493 Wilkes Place
;
The following lines are written to the SAS log:
"Ave" found in "6789 64th Ave"
"Road" found in "4 Moritz Road"
See Also
Functions and CALL routines:
   “CALL PRXCHANGE Routine” on page 464
   “CALL PRXDEBUG Routine” on page 466
   “CALL PRXFREE Routine” on page 469
   “CALL PRXNEXT Routine” on page 470
   “CALL PRXPOSN Routine” on page 472
   “CALL PRXSUBSTR Routine” on page 474
   “PRXCHANGE Function” on page 1005
   “PRXPAREN Function” on page 1013
   “PRXMATCH Function” on page 1009
   “PRXPOSN Function” on page 1017
PRXPOSN Function
Returns a character string that contains the value for a capture buffer.
Category: Character String Matching
Restriction: Use with the PRXPARSE function.
Syntax
PRXPOSN(regular-expression-id, capture-buffer, source)
Arguments
regular-expression-id
   specifies a numeric variable with a value that is a pattern identifier that is returned by the PRXPARSE function.
capture-buffer
   is a numeric constant, variable, or expression that identifies the capture buffer for which to retrieve a value:
   - If the value of capture-buffer is zero, PRXPOSN returns the entire match.
   - If the value of capture-buffer is between 1 and the number of open parentheses in the regular expression, then PRXPOSN returns the value for that capture buffer.
   - If the value of capture-buffer is greater than the number of open parentheses, then PRXPOSN returns a missing value.
source
   specifies the text from which to extract capture buffers.
Details
The PRXPOSN function uses the results of PRXMATCH, PRXSUBSTR, PRXCHANGE, or PRXNEXT to return a capture buffer. A match must be found by one of these functions for PRXPOSN to return meaningful information.
A capture buffer is part of a match, enclosed in parentheses, that is specified in a regular expression. This function simplifies using capture buffers by returning the text for the capture buffer directly, and by not requiring a call to SUBSTR as in the case of CALL PRXPOSN.
For more information about pattern matching, see “Pattern Matching Using Perl Regular Expressions (PRX)” on page 322.
Comparisons The PRXPOSN function is similar to the CALL PRXPOSN routine, except that it returns the capture buffer itself rather than the position and length of the capture buffer. The Perl regular expression (PRX) functions and CALL routines work together to manipulate strings that match patterns. To see a list and short description of these functions and CALL routines, see the Character String Matching category in “Functions and CALL Routines by Category” on page 333.
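The difference between the function and the CALL routine can be sketched side by side. In the following DATA step, the variable names (last1, last2, start, len) are hypothetical; both approaches should place the text of capture buffer 1 in the result variable.

```sas
data _null_;
   length last1 last2 $ 16;
   name = 'Jones, Fred';
   re = prxparse('/(\w+), (\w+)/');
   if prxmatch(re, name) then
      do;
         /* PRXPOSN function: returns the text of capture buffer 1 directly. */
         last1 = prxposn(re, 1, name);
         /* CALL PRXPOSN routine: returns a position and a length, */
         /* so SUBSTR is needed to extract the text.               */
         call prxposn(re, 1, start, len);
         last2 = substr(name, start, len);
      end;
   put last1= last2=;
run;
```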
Examples
Example 1: Extracting First and Last Names
The following example uses PRXPOSN to extract first and last names from a data set.
data ReversedNames;
   input name & $32.;
   datalines;
Jones, Fred
Kavich, Kate
Turley, Ron
Dulix, Yolanda
;
data FirstLastNames;
   length first last $ 16;
   keep first last;
   retain re;
   if _N_ = 1 then
      re = prxparse('/(\w+), (\w+)/');
   set ReversedNames;
   if prxmatch(re, name) then
      do;
         last = prxposn(re, 1, name);
         first = prxposn(re, 2, name);
      end;
run;
options pageno=1 nodate ls=80 ps=64;
proc print data = FirstLastNames;
run;

Output 4.75 Output from PRXPOSN: First and Last Names
                 The SAS System                 1
Obs    first      last
  1    Fred       Jones
  2    Kate       Kavich
  3    Ron        Turley
  4    Yolanda    Dulix
Example 2: Extracting Names When Some Names Are Invalid
The following example creates a data set that contains a list of names. Observations that have only a first name or only a last name are invalid. PRXPOSN extracts the valid names from the data set, and writes the names to the data set NEW.
data old;
   input name $60.;
   datalines;
Judith S Reaveley
Ralph F. Morgan
Jess Ennis
Carol Echols
Kelly Hansen Huff
Judith
Nick
Jones
;
data new;
   length first middle last $ 40;
   keep first middle last;
   re = prxparse('/(\S+)\s+([^\s]+\s+)?(\S+)/o');
   set old;
   if prxmatch(re, name) then
      do;
         first = prxposn(re, 1, name);
         middle = prxposn(re, 2, name);
         last = prxposn(re, 3, name);
         output;
      end;
run;
options pageno=1 nodate ls=80 ps=64;
proc print data = new;
run;

Output 4.76 Output of Valid Names
                 The SAS System                 1
Obs    first     middle    last
  1    Judith    S         Reaveley
  2    Ralph     F.        Morgan
  3    Jess                Ennis
  4    Carol               Echols
  5    Kelly     Hansen    Huff
See Also
Functions and CALL routines:
   “CALL PRXCHANGE Routine” on page 464
   “CALL PRXDEBUG Routine” on page 466
   “CALL PRXFREE Routine” on page 469
   “CALL PRXNEXT Routine” on page 470
   “CALL PRXPOSN Routine” on page 472
   “CALL PRXSUBSTR Routine” on page 474
   “PRXCHANGE Function” on page 1005
   “PRXMATCH Function” on page 1009
   “PRXPAREN Function” on page 1013
   “PRXPARSE Function” on page 1015
PTRLONGADD Function
Returns the pointer address as a character variable on 32-bit and 64-bit platforms.
Category: Special
Syntax
PTRLONGADD(pointer<,amount>)
Arguments
pointer
   is a character constant, variable, or expression that specifies the pointer address.
amount
   is a numeric constant, variable, or expression that specifies the amount to add to the address.
   Tip: amount can be a negative number.
Details The PTRLONGADD function performs pointer arithmetic and returns a pointer address as a character string.
Examples
The following example adds 2 to the address of the variable X and uses the resulting pointer address to read one character into the variable Z.
data _null_;
   x='ABCDE';
   y=ptrlongadd(addrlong(x),2);
   z=peekclong(y,1);
   put z=;
run;
The output from the SAS log is:
z=C
PUT Function
Returns a value using a specified format.
Category: Special
Syntax
PUT(source, format.)
Arguments
source
   identifies the constant, variable, or expression whose value you want to reformat. The source argument can be character or numeric.
format.
   contains the SAS format that you want applied to the value that is specified in the source. This argument must be the name of a format with a period and optional width and decimal specifications, not a character constant, variable, or expression. By default, if the source is numeric, the resulting string is right aligned, and if the source is character, the result is left aligned. To override the default alignment, you can add an alignment specification to a format:
   -L   left aligns the value.
   -C   centers the value.
   -R   right aligns the value.
   Restriction: The format. must be of the same type as the source, either character or numeric. That is, if the source is character, the format name must begin with a dollar sign, but if the source is numeric, the format name must not begin with a dollar sign.
Details If the PUT function returns a value to a variable that has not yet been assigned a length, by default the variable length is determined by the width of the format. Use PUT to convert a numeric value to a character value. The PUT function has no effect on which formats are used in PUT statements or which formats are assigned to variables in data sets. You cannot use the PUT function to change the type of a variable in a data set from numeric to character.
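A short sketch of these points follows. Because PUT cannot change the type of an existing variable, the character result is stored in a new variable; the variable names are assumptions chosen for illustration.

```sas
data _null_;
   num = 42;
   /* CHARVAL has no declared length, so the format width (8) determines it. */
   charval = put(num, 8.);
   /* The -L specification left aligns the result instead of the default     */
   /* right alignment for a numeric source.                                  */
   leftval = put(num, 8. -L);
   /* LENGTHC counts trailing blanks, so it reports the full stored length.  */
   len = lengthc(charval);
   put charval= leftval= len=;
run;
```

NUM itself remains numeric; only the new variables CHARVAL and LEFTVAL hold character values.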
Comparisons The PUT statement and the PUT function are similar. The PUT function returns a value using a specified format. You must use an assignment statement to store the value in a variable. The PUT statement writes a value to an external destination (either the SAS log or a destination you specify).
Examples
Example 1: Converting Numeric Values to Character Value
In this example, the first statement converts the values of CC, a numeric variable, into the four-character hexadecimal format, and the second writes the same value that the PUT function returns.
cchex=put(cc,hex4.);
put cc hex4.;
Example 2: Using PUT and INPUT Functions
In this example, the PUT function returns a numeric value as a character string. The value 122591 is assigned to the CHARDATE variable. The INPUT function returns the value of the character string as a SAS date value using a SAS date informat. The value 11681 is stored in the SASDATE variable.
numdate=122591;
chardate=put(numdate,z6.);
sasdate=input(chardate,mmddyy6.);
See Also
Functions:
   “INPUT Function” on page 797
   “INPUTC Function” on page 799
   “INPUTN Function” on page 801
   “PUTC Function” on page 1023
   “PUTN Function” on page 1025
Statement:
   “PUT Statement” on page 1656
PUTC Function
Enables you to specify a character format at run time.
Category: Special
Syntax
PUTC(source, format.<,w>)
Arguments
source
   specifies a character constant, variable, or expression to which you want to apply the format.
format.
   is a character constant, variable, or expression with a value that is the character format you want to apply to source.
w
   is a numeric constant, variable, or expression that specifies a width to apply to the format.
   Interaction: If you specify a width here, it overrides any width specification in the format.
Details If the PUTC function returns a value to a variable that has not yet been assigned a length, by default the variable length is determined by the length of the first argument.
Comparisons The PUTN function enables you to specify a numeric format at run time. The PUT function is faster than PUTC because PUT lets you specify a format at compile time rather than at run time.
Examples The PROC FORMAT step in this example creates a format, TYPEFMT., that formats the variable values 1, 2, and 3 with the name of one of the three other formats that this step creates. These three formats output responses of "positive," "negative," and "neutral" as different words, depending on the type of question. After PROC FORMAT creates the formats, the DATA step creates a SAS data set from raw data consisting of a number identifying the type of question and a response. After reading a record, the DATA step uses the value of TYPE to create a variable, RESPFMT, that contains the value of the appropriate format for the current type of question. The DATA step also creates another variable, WORD, whose value is the appropriate word for a response. The PUTC function assigns the value of WORD based on the type of question and the appropriate format.
proc format;
   value typefmt 1='$groupx'
                 2='$groupy'
                 3='$groupz';
   value $groupx 'positive'='agree'
                 'negative'='disagree'
                 'neutral'='notsure ';
   value $groupy 'positive'='accept'
                 'negative'='reject'
                 'neutral'='possible';
   value $groupz 'positive'='pass '
                 'negative'='fail'
                 'neutral'='retest';
run;
data answers;
   length word $ 8;
   input type response $;
   respfmt = put(type, typefmt.);
   word = putc(response, respfmt);
   datalines;
1 positive
1 negative
1 neutral
2 positive
2 negative
2 neutral
3 positive
3 negative
3 neutral
;
The value of the variable WORD is agree for the first observation. The value of the variable WORD is retest for the last observation.
See Also
Functions:
   “INPUT Function” on page 797
   “INPUTC Function” on page 799
   “INPUTN Function” on page 801
   “PUT Function” on page 1021
   “PUTN Function” on page 1025
PUTN Function
Enables you to specify a numeric format at run time.
Category: Special
Syntax
PUTN(source, format.<,w<,d>>)
Arguments
source
   specifies a numeric constant, variable, or expression to which you want to apply the format.
format.
   is a character constant, variable, or expression with a value that is the numeric format you want to apply to source.
w
   is a numeric constant, variable, or expression that specifies a width to apply to the format.
   Interaction: If you specify a width here, it overrides any width specification in the format.
d
   is a numeric constant, variable, or expression that specifies the number of decimal places to use.
   Interaction: If you specify a number here, it overrides any decimal-place specification in the format.
Details If the PUTN function returns a value to a variable that has not yet been assigned a length, by default the variable is assigned a length of 200.
Comparisons The PUTC function enables you to specify a character format at run time. The PUT function is faster than PUTN because PUT lets you specify a format at compile time rather than at run time.
Examples The PROC FORMAT step in this example creates a format, WRITFMT., that formats the variable values 1 and 2 with the name of a SAS date format. The DATA step creates a SAS data set from raw data consisting of a number and a key. After reading a record, the DATA step uses the value of KEY to create a variable, DATEFMT, that contains the value of the appropriate date format. The DATA step also creates a new variable, DATE, whose value is the formatted value of the date. PUTN assigns the value of DATE based on the value of NUMBER and the appropriate format.
proc format;
   value writfmt 1='date9.'
                 2='mmddyy10.';
run;
data dates;
   input number key;
   datefmt=put(key,writfmt.);
   date=putn(number,datefmt);
   datalines;
15756 1
14552 2
;
See Also Functions: “INPUT Function” on page 797 “INPUTC Function” on page 799 “INPUTN Function” on page 801 “PUT Function” on page 1021 “PUTC Function” on page 1023
PVP Function
Returns the present value for a periodic cash flow stream (such as a bond), with repayment of principal at maturity.
Category: Financial
Syntax
PVP(A,c,n,K,k0,y)
Arguments
A
   specifies the par value.
   Range: A > 0
c
   specifies the nominal per-year coupon rate, expressed as a fraction.
   Range: 0 ≤ c < 1
n
   specifies the number of coupons per year.
   Range: n > 0 and is an integer
K
   specifies the number of remaining coupons.
   Range: K > 0 and is an integer
k0
   specifies the time from the present date to the first coupon date, expressed in terms of the number of years.
   Range: 0 < k0 ≤ 1/n
y
   specifies the nominal per-year yield-to-maturity, expressed as a fraction.
   Range: y > 0
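Details
The PVP function is based on the relationship

\[
P = \sum_{k=1}^{K} \frac{c(k)}{\left(1 + \frac{y}{n}\right)^{t_k}}
\]

where

\[
t_k = n k_0 + k - 1
\]
\[
c(k) = \frac{c}{n} A \quad \text{for } k = 1, \ldots, K - 1
\]
\[
c(K) = \left(1 + \frac{c}{n}\right) A
\]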
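The relationship above can be checked with a short DATA step that evaluates the sum directly and compares it with the value that PVP returns. This is only a sketch: the variable names (p_manual, p_builtin, cash, t, i) are hypothetical, and the argument values are taken from the example for this function.

```sas
data _null_;
   A = 1000; c = 0.01; n = 4; K = 14; k0 = 0.33/2; y = 0.10;
   p_manual = 0;
   do i = 1 to K;
      /* Exponent for coupon i: t = n*k0 + i - 1. */
      t = n*k0 + i - 1;
      /* Coupons 1 through K-1 pay (c/n)*A; the last payment adds the par value. */
      if i < K then cash = (c/n)*A;
      else cash = (1 + c/n)*A;
      p_manual = p_manual + cash / (1 + y/n)**t;
   end;
   p_builtin = pvp(A, c, n, K, k0, y);
   put p_manual= p_builtin=;
run;
```

If the reconstruction of the formula is right, the two values should agree closely with each other and with the documented result for these arguments.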
Examples
data _null_;
   p=pvp(1000,.01,4,14,.33/2,.10);
   put p;
run;
The value returned is 743.168.
QTR Function
Returns the quarter of the year from a SAS date value.
Category: Date and Time
Syntax
QTR(date)
Arguments
date
   specifies a numeric constant, variable, or expression that represents a SAS date value.
Details The QTR function returns a value of 1, 2, 3, or 4 from a SAS date value to indicate the quarter of the year in which a date value falls.
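As a small cross-check of this definition, the quarter can also be derived from the month number. The following sketch assumes that quarters are the usual three-month calendar blocks; the variable names q1 and q2 are hypothetical.

```sas
data _null_;
   x = '20jan94'd;
   /* Quarter returned by the QTR function. */
   q1 = qtr(x);
   /* Same quarter derived from the month: months 1-3 give 1, 4-6 give 2, and so on. */
   q2 = ceil(month(x) / 3);
   put q1= q2=;
run;
```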
Examples
The following SAS statements produce these results.
SAS Statements                       Results
x='20jan94'd; y=qtr(x); put y=;      y=1
See Also Function: “YYQ Function” on page 1195
QUANTILE Function
Returns the quantile from a distribution that you specify.
Category: Quantile
See: “CDF Function” on page 540
Syntax
QUANTILE(dist, probability, parm-1,…,parm-k)
Arguments
dist
   is a character constant, variable, or expression that identifies the distribution. Valid distributions are as follows:

   Distribution               Argument
   Bernoulli                  BERNOULLI
   Beta                       BETA
   Binomial                   BINOMIAL
   Cauchy                     CAUCHY
   Chi-Square                 CHISQUARE
   Exponential                EXPONENTIAL
   F                          F
   Gamma                      GAMMA
   Geometric                  GEOMETRIC
   Hypergeometric             HYPERGEOMETRIC
   Laplace                    LAPLACE
   Logistic                   LOGISTIC
   Lognormal                  LOGNORMAL
   Negative binomial          NEGBINOMIAL
   Normal                     NORMAL|GAUSS
   Normal mixture             NORMALMIX
   Pareto                     PARETO
   Poisson                    POISSON
   T                          T
   Uniform                    UNIFORM
   Wald (inverse Gaussian)    WALD|IGAUSS
   Weibull                    WEIBULL

   Note: Except for T, F, and NORMALMIX, you can minimally identify any distribution by its first four characters.
probability
   is a numeric constant, variable, or expression that specifies the cumulative probability for which the quantile is returned.
parm-1,…,parm-k
   are optional shape, location, or scale parameters appropriate for the specific distribution.
Details
The QUANTILE function computes the quantile from various continuous and discrete distributions. For more information, see the Details section of “CDF Function” on page 541.
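Because QUANTILE is the inverse of the CDF for a continuous distribution, a round trip through both functions should return the original probability. The following sketch uses the standard normal distribution; the variable names q and p are hypothetical.

```sas
data _null_;
   /* Quantile of the standard normal distribution at probability 0.975. */
   q = quantile('NORMAL', 0.975);
   /* Passing the quantile back through CDF should recover the probability. */
   p = cdf('NORMAL', q);
   put q= p=;
run;
```

For discrete distributions the round trip is not exact, because the CDF at the returned quantile can exceed the requested probability.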
Examples
SAS Statements                        Results
y=quantile('BERN',.75,.25);           0
y=quantile('BETA',0.1,3,4);           0.2009088789
y=quantile('BINOM',.4,.5,10);         5
y=quantile('CAUCHY',.85);             1.9626105055
y=quantile('CHISQ',.6,11);            11.529833841
y=quantile('EXPO',.6);                0.9162907319
y=quantile('F',.8,2,3);               2.8860266073
y=quantile('GAMMA',.4,3);             2.285076904
y=quantile('HYPER',.5,200,50,10);     2
y=quantile('LAPLACE',.8);             0.9162907319
y=quantile('LOGISTIC',.7);            0.8472978604
y=quantile('LOGNORMAL',.5);           1
y=quantile('NEGB',.5,.5,2);           1
y=quantile('NORMAL',.975);            1.9599639845
y=quantile('PARETO',.01,1);           1.0101010101
y=quantile('POISSON',.9,1);           2
y=quantile('T',.8,5);                 0.9195437802
y=quantile('UNIFORM',0.25);           0.25
y=quantile('WALD',.6,2);              0.9526209927
y=quantile('WEIBULL',.6,2);           0.9572307621
See Also Functions: “LOGCDF Function” on page 879 “LOGPDF Function” on page 881 “LOGSDF Function” on page 882 “PDF Function” on page 954 “SDF Function” on page 1081 “CDF Function” on page 540
QUOTE Function
Adds double quotation marks to a character value.
Category: Character
Restriction: “I18N Level 2” on page 306
Syntax QUOTE(argument)
Arguments
argument
specifies a character constant, variable, or expression.
Details
Length of Returned Variable
In a DATA step, if the QUOTE function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes.
The Basics
The QUOTE function adds double quotation marks, the default character, to a character value. If double quotation marks are found within the argument, they are doubled in the output.
The length of the receiving variable must be long enough to contain the argument (including trailing blanks), leading and trailing quotation marks, and any embedded quotation marks that are doubled. For example, if the argument is ABC followed by three trailing blanks, then the receiving variable must have a length of at least eight to hold "ABC###". (The character # represents a blank space.) If the receiving field is not long enough, the QUOTE function returns a blank string, and writes an invalid argument note to the log.
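The length requirement can be sketched with the ABC case described above. The variable names and lengths are assumptions chosen to mirror that case.

```sas
data _null_;
   length x $ 6 y $ 8;
   x = 'ABC';             /* stored as ABC followed by three trailing blanks    */
   y = quote(x);          /* needs 6 characters plus 2 quotation marks, so 8    */
   z = quote(trim(x));    /* TRIM removes the trailing blanks before quoting    */
   put y= z=;
run;
```

Here Z has no declared length, so it receives the default length described under Length of Returned Variable.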
Examples
SAS Statements                                                Results
x='A"B'; y=quote(x); put y;                                   "A""B"
x='A''B'; y=quote(x); put y;                                  "A'B"
x='Paul''s'; y=quote(x); put y;                               "Paul's"
x='Catering Service Center   '; y=quote(x); put y;            "Catering Service Center   "
x='Paul''s Catering Service   '; y=quote(trim(x)); put y;     "Paul's Catering Service"
RANBIN Function
Returns a random variate from a binomial distribution.
Category: Random Number
Tip: If you want to change the seed value during execution, you must use the CALL RANBIN routine instead of the RANBIN function.
Syntax
RANBIN(seed,n,p)
Arguments
seed
   is a numeric constant, variable, or expression with an integer value. If seed ≤ 0, the time of day is used to initialize the seed stream.
   Range: seed < 2^31 − 1
   See: “Seed Values” on page 306 for more information about seed values
n
   is a numeric constant, variable, or expression with an integer value that specifies the number of independent Bernoulli trials parameter.
   Range: n > 0
p
   is a numeric constant, variable, or expression that specifies the probability of success.
   Range: 0 < p < 1
Details The RANBIN function returns a variate that is generated from a binomial distribution with mean np and variance np(1−p). If n ≤ 50, np ≤ 5, or n(1–p) ≤ 5, an inverse transform method applied to a RANUNI uniform variate is used. If n > 50, np > 5, and n(1–p) > 5, the normal approximation to the binomial distribution is used. In that case, the Box-Muller transformation of RANUNI uniform variates is used. For a discussion about seeds and streams of data, as well as examples of using the random-number functions, see “Generating Multiple Variables from One Seed in Random-Number Functions” on page 315.
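As an illustration of the stated mean np, the following sketch draws a number of variates and compares their average with n times p. The seed, the parameter values, and the number of draws are arbitrary assumptions.

```sas
data _null_;
   n = 10;
   p = 0.3;
   total = 0;
   do i = 1 to 1000;
      /* Each call returns one binomial variate from the same seed stream. */
      total = total + ranbin(12345, n, p);
   end;
   sample_mean = total / 1000;
   expected_mean = n * p;
   put sample_mean= expected_mean=;
run;
```

The sample mean varies from the expected value by random error, which shrinks as the number of draws grows.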
Comparisons The CALL RANBIN routine, an alternative to the RANBIN function, gives greater control of the seed and random number streams.
See Also Functions and CALL routines: “RAND Function” on page 1034 “CALL RANBIN Routine” on page 477
RANCAU Function
Returns a random variate from a Cauchy distribution.
Category: Random Number
Tip: If you want to change the seed value during execution, you must use the CALL RANCAU routine instead of the RANCAU function.
Syntax
RANCAU(seed)
Arguments
seed
   is a numeric constant, variable, or expression with an integer value. If seed ≤ 0, the time of day is used to initialize the seed stream.
   Range: seed < 2^31 − 1
   See: “Seed Values” on page 306 for more information about seed values
Details
The RANCAU function returns a variate that is generated from a Cauchy distribution with location parameter 0 and scale parameter 1. An acceptance-rejection procedure applied to RANUNI uniform variates is used. If u and v are independent uniform (−1/2, 1/2) variables and u² + v² ≤ 1/4, then u/v is a Cauchy variate. A Cauchy variate X with location parameter ALPHA and scale parameter BETA can be generated:
x=alpha+beta*rancau(seed);
For a discussion about seeds and streams of data, as well as examples of using the random-number functions, see “Generating Multiple Variables from One Seed in Random-Number Functions” on page 315.
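The acceptance-rejection idea described in the Details can be sketched with RANUNI directly. This illustrates the stated method only and is not the internal implementation of RANCAU; the seed and the variable names are assumptions.

```sas
data _null_;
   seed = 12345;                 /* hypothetical seed value */
   do i = 1 to 5;
      /* Draw (u, v) uniformly on (-1/2, 1/2) and reject the pair until it   */
      /* falls inside the circle u*u + v*v <= 1/4 (and v is not zero).       */
      do until (u*u + v*v <= 0.25 and v ne 0);
         u = ranuni(seed) - 0.5;
         v = ranuni(seed) - 0.5;
      end;
      /* The ratio of an accepted pair is a standard Cauchy variate. */
      x = u / v;
      put x=;
   end;
run;
```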
Comparisons The CALL RANCAU routine, an alternative to the RANCAU function, gives greater control of the seed and random number streams.
See Also Functions and CALL routines: “RAND Function” on page 1034 “CALL RANCAU Routine” on page 479
RAND Function
Generates random numbers from a distribution that you specify.
Category: Random Number
Syntax
RAND (dist, parm-1,…,parm-k)
Arguments
dist
   is a character constant, variable, or expression that identifies the distribution. Valid distributions are as follows:

   Distribution         Argument
   Bernoulli            BERNOULLI
   Beta                 BETA
   Binomial             BINOMIAL
   Cauchy               CAUCHY
   Chi-Square           CHISQUARE
   Erlang               ERLANG
   Exponential          EXPONENTIAL
   F                    F
   Gamma                GAMMA
   Geometric            GEOMETRIC
   Hypergeometric       HYPERGEOMETRIC
   Lognormal            LOGNORMAL
   Negative binomial    NEGBINOMIAL
   Normal               NORMAL|GAUSSIAN
   Poisson              POISSON
   T                    T
   Tabled               TABLE
   Triangular           TRIANGLE
   Uniform              UNIFORM
   Weibull              WEIBULL

   Note: Except for T and F, you can minimally identify any distribution by its first four characters.
parm-1,…,parm-k
   are shape, location, or scale parameters appropriate for the specific distribution.
   See: “Details” on page 1035 for complete information about these parameters
Details
Generating Random Numbers
The RAND function generates random numbers from various continuous and discrete distributions. Wherever possible, the simplest form of the distribution is used.
The RAND function uses the Mersenne-Twister random number generator (RNG) that was developed by Matsumoto and Nishimura (1998). The random number generator has a very long period (2^19937 – 1) and very good statistical properties. The period is a Mersenne prime, which contributes to the naming of the RNG. The algorithm is a twisted generalized feedback shift register (TGFSR) that explains the latter part of the name. The TGFSR gives the RNG a very high order of equidistribution (623-dimensional with 32-bit accuracy), which means that there is a very small correlation between successive vectors of 623 pseudo-random numbers.
The RAND function is started with a single seed. However, the state of the process cannot be captured by a single seed. You cannot stop and restart the generator from its stopping point.
Reproducing a Random Number Stream
If you want to create reproducible streams of random numbers, then use the CALL STREAMINIT routine to specify a seed value for random number generation. Use the CALL STREAMINIT routine once per DATA step before any invocation of the RAND function. If you omit the call to the CALL STREAMINIT routine (or if you specify a non-positive seed value in the CALL STREAMINIT routine), then RAND uses a call to the system clock to seed itself. For more information, see CALL STREAMINIT Example 1 on page 518.
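A minimal sketch of a reproducible stream follows. The seed value and the choice of distributions are assumptions; the point is only that CALL STREAMINIT appears once, before the first call to RAND.

```sas
data _null_;
   /* Seed the generator once, before any call to RAND. */
   call streaminit(12345);
   do i = 1 to 3;
      u = rand('UNIFORM');
      b = rand('BERNOULLI', 0.3);
      put u= b=;
   end;
run;
```

Running the step again with the same positive seed should reproduce the same values, whereas omitting the call lets RAND seed itself from the system clock.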
Bernoulli Distribution
x = RAND('BERNOULLI',p)
where
x is an observation from the distribution with the following probability density function:

\[
f(x) =
\begin{cases}
1 & p = 0,\ x = 0 \\
p^{x}(1-p)^{1-x} & 0 < p < 1,\ x = 0, 1 \\
1 & p = 1,\ x = 1
\end{cases}
\]

Range: x = 0, 1
p is a numeric probability of success.
Range: 0 ≤ p ≤ 1
Beta Distribution
x = RAND('BETA',a,b)
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, x^{a-1} (1-x)^{b-1}
\]

Range: 0 < x < 1
a is a numeric shape parameter.
Range: a > 0
b is a numeric shape parameter.
Range: b > 0
Binomial Distribution
x = RAND('BINOMIAL',p,n)
where
x is an integer observation from the distribution with the following probability density function:

\[
f(x) =
\begin{cases}
1 & p = 0,\ x = 0 \\
\binom{n}{x} p^{x} (1-p)^{n-x} & 0 < p < 1,\ x = 0, \ldots, n \\
1 & p = 1,\ x = n
\end{cases}
\]

Range: x = 0, 1, ..., n
p is a numeric probability of success.
Range: 0 ≤ p ≤ 1
n is an integer parameter that counts the number of independent Bernoulli trials.
Range: n = 1, 2, ...
Cauchy Distribution
x = RAND('CAUCHY')
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = \frac{1}{\pi (1 + x^{2})}
\]

Range: –∞ < x < ∞
Chi-Square Distribution
x = RAND('CHISQUARE',df)
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = \frac{2^{-\mathit{df}/2}}{\Gamma\!\left(\frac{\mathit{df}}{2}\right)}\, x^{\frac{\mathit{df}}{2}-1}\, e^{-\frac{x}{2}}
\]

Range: x > 0
df is a numeric degrees of freedom parameter.
Range: df > 0
Erlang Distribution
x = RAND('ERLANG',a)
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = \frac{1}{\Gamma(a)}\, x^{a-1} e^{-x}
\]

Range: x > 0
a is an integer numeric shape parameter.
Range: a = 1, 2, ...
Exponential Distribution
x = RAND('EXPONENTIAL')
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = e^{-x}
\]

Range: x > 0
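F Distribution
x = RAND('F',ndf,ddf)
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = \frac{\Gamma\!\left(\frac{\mathit{ndf}+\mathit{ddf}}{2}\right)\, \mathit{ndf}^{\,\mathit{ndf}/2}\, \mathit{ddf}^{\,\mathit{ddf}/2}\, x^{\frac{\mathit{ndf}}{2}-1}}
{\Gamma\!\left(\frac{\mathit{ndf}}{2}\right)\, \Gamma\!\left(\frac{\mathit{ddf}}{2}\right)\, (\mathit{ddf} + \mathit{ndf}\, x)^{(\mathit{ndf}+\mathit{ddf})/2}}
\]

Range: x > 0
ndf is a numeric numerator degrees of freedom parameter.
Range: ndf > 0
ddf is a numeric denominator degrees of freedom parameter.
Range: ddf > 0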
Gamma Distribution
x = RAND('GAMMA',a)
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = \frac{1}{\Gamma(a)}\, x^{a-1} e^{-x}
\]

Range: x > 0
a is a numeric shape parameter.
Range: a > 0

Geometric Distribution
x = RAND('GEOMETRIC',p)
where
x is an integer count that denotes the number of trials that are needed to obtain one success. X is an integer observation from the distribution with the following probability density function:

\[
f(x) =
\begin{cases}
(1-p)^{x-1}\, p & 0 < p < 1,\ x = 1, 2, \ldots \\
1 & p = 1,\ x = 1
\end{cases}
\]

Range: x = 1, 2, …
p is a numeric probability of success.
Range: 0 < p ≤ 1
Hypergeometric Distribution
x = RAND('HYPER',N,R,n)
where
x is an integer observation from the distribution with the following probability density function:

\[
f(x) = \frac{\binom{R}{x}\binom{N-R}{n-x}}{\binom{N}{n}}
\]

Range: x = max(0, (n – (N – R))), ..., min(n, R)
N is an integer population size parameter.
Range: N = 1, 2, ...
R is an integer number of items in the category of interest.
Range: R = 0, 1, ..., N
n is an integer sample size parameter.
Range: n = 1, 2, ..., N
The hypergeometric distribution is a mathematical formalization of an experiment in which you draw n balls from an urn that contains N balls, R of which are red. The hypergeometric distribution is the distribution of the number of red balls in the sample of n.
Lognormal Distribution
x = RAND('LOGNORMAL')
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = \frac{e^{-\ln^{2}(x)/2}}{x \sqrt{2\pi}}
\]

Range: x > 0
Negative Binomial Distribution
x = RAND('NEGBINOMIAL',p,k)
where
x is an integer observation from the distribution with the following probability density function:

\[
f(x) =
\begin{cases}
\binom{x+k-1}{k-1} (1-p)^{x}\, p^{k} & 0 < p < 1,\ x = 0, 1, \ldots \\
1 & p = 1,\ x = 0
\end{cases}
\]

Range: x = 0, 1, ...
k is an integer parameter that is the number of successes. However, non-integer k values are allowed as well.
Range: k = 1, 2, ...
p is a numeric probability of success.
Range: 0 < p ≤ 1
The negative binomial distribution is the distribution of the number of failures before k successes occur in sequential independent trials, all with the same probability of success, p.
Normal Distribution
x = RAND('NORMAL'<,θ,λ>)
where
x is an observation from the normal distribution with a mean of θ and a standard deviation of λ, that has the following probability density function:

\[
f(x) = \frac{1}{\lambda \sqrt{2\pi}} \exp\!\left( -\frac{(x-\theta)^{2}}{2\lambda^{2}} \right)
\]

Range: –∞ < x < ∞
θ is the mean parameter.
Default: 0
λ is the standard deviation parameter.
Default: 1
Range: λ > 0
Poisson Distribution
x = RAND('POISSON',m)
where
x is an integer observation from the distribution with the following probability density function:

\[
f(x) = \frac{m^{x} e^{-m}}{x!}
\]

Range: x = 0, 1, ...
m is a numeric mean parameter.
Range: m > 0
T Distribution
x = RAND('T',df)
where
x is an observation from the distribution with the following probability density function:

\[
f(x) = \frac{\Gamma\!\left(\frac{\mathit{df}+1}{2}\right)}{\sqrt{\mathit{df}\,\pi}\;\Gamma\!\left(\frac{\mathit{df}}{2}\right)} \left( 1 + \frac{x^{2}}{\mathit{df}} \right)^{-\frac{\mathit{df}+1}{2}}
\]

Range: –∞ < x < ∞
df is a numeric degrees of freedom parameter.
Range: df > 0
Tabled Distribution
x = RAND('TABLE',p1,p2, …)
where
x is an integer observation from one of the following distributions:

If \(\sum_{i=1}^{n} p_i \le 1\), then x is an observation from this probability density function:

\[
f(i) = p_i, \quad i = 1, 2, \ldots, n
\]
and
\[
f(n+1) = 1 - \sum_{i=1}^{n} p_i
\]

If for some index \(j < n\), \(\sum_{i=1}^{j} p_i \ge 1\), then x is an observation from this probability density function:

\[
f(i) = p_i, \quad i = 1, 2, \ldots, j-1
\]
and
\[
f(j) = 1 - \sum_{i=1}^{j-1} p_i
\]

p1, p2, ... are numeric probability values.
Range: 0 ≤ p1, p2, ... ≤ 1
Restriction: The maximum number of probability parameters depends on your operating environment, but the maximum number of parameters is at least 32,767.
The tabled distribution takes on the values 1, 2, ..., n with specified probabilities.
Note: By using the FORMAT statement, you can map the set {1, 2, ..., n} to any set of n or fewer elements.
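As a sketch of the note about the FORMAT statement, the following step draws from a tabled distribution and displays the integer outcomes as labels. The format name COLORFMT, the labels, the data set name, and the seed are hypothetical and chosen only for illustration.

```sas
proc format;
   /* Hypothetical labels for the outcomes 1, 2, and 3 */
   value colorfmt 1='Red' 2='Green' 3='Blue';
run;

data colors;
   call streaminit(2009);   /* arbitrary seed */
   do i = 1 to 10;
      /* f(1)=0.2, f(2)=0.5, and f(3)=1-(0.2+0.5)=0.3 */
      color = rand('TABLE', 0.2, 0.5);
      output;
   end;
   format color colorfmt.;
run;

proc print data=colors noobs;
run;
```

Because the two probabilities sum to less than 1, the remaining probability is assigned to the value n+1 = 3, so the format needs a label for three outcomes.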
Triangular Distribution
x = RAND('TRIANGLE',h)
where
x is an observation from the distribution with the following probability density function:

\[
f(x) =
\begin{cases}
\dfrac{2x}{h} & 0 \le x \le h \\[1ex]
\dfrac{2(1-x)}{1-h} & h < x \le 1
\end{cases}
\]

where 0 ≤ h ≤ 1.

b is a numeric scale parameter.
Range: b > 0
Examples

SAS Statements                 Results
x=rand('BERN',.75);            0
x=rand('BETA',3,0.1);          .99920
x=rand('BINOM',0.75,10);       10
x=rand('CAUCHY');              -1.41525
x=rand('CHISQ',22);            25.8526
x=rand('ERLANG', 7);           7.67039
x=rand('EXPO');                1.48847
x=rand('F',12,322);            1.99647
x=rand('GAMMA',7.25);          6.59588
x=rand('GEOM',0.02);           43
x=rand('HYPER',10,3,5);        1
x=rand('LOGN');                0.66522
x=rand('NEGB',0.8,5);          33
x=rand('NORMAL');              1.03507
x=rand('POISSON',6.1);         6
x=rand('T',4);                 2.44646
x=rand('TABLE',.2,.5);         2
x=rand('TRIANGLE',0.7);        .63811
x=rand('UNIFORM');             .96234
x=rand('WEIB',0.25,2.1);       6.55778
See Also CALL Routine: “CALL STREAMINIT Routine” on page 517
References
Fishman, G. S. 1996. Monte Carlo: Concepts, Algorithms, and Applications. New York: Springer-Verlag.
Fushimi, M., and S. Tezuka. 1983. “The k-Distribution of Generalized Feedback Shift Register Pseudorandom Numbers.” Communications of the ACM 26: 516–523.
Gentle, J. E. 1998. Random Number Generation and Monte Carlo Methods. New York: Springer-Verlag.
Lewis, T. G., and W. H. Payne. 1973. “Generalized Feedback Shift Register Pseudorandom Number Algorithm.” Journal of the ACM 20: 456–468.
Matsumoto, M., and Y. Kurita. 1992. “Twisted GFSR Generators.” ACM Transactions on Modeling and Computer Simulation 2: 179–194.
Matsumoto, M., and Y. Kurita. 1994. “Twisted GFSR Generators II.” ACM Transactions on Modeling and Computer Simulation 4: 254–266.
Matsumoto, M., and T. Nishimura. 1998. “Mersenne Twister: A 623–Dimensionally Equidistributed Uniform Pseudo-Random Number Generator.” ACM Transactions on Modeling and Computer Simulation 8: 3–30.
Ripley, B. D. 1987. Stochastic Simulation. New York: Wiley.
Robert, C. P., and G. Casella. 1999. Monte Carlo Statistical Methods. New York: Springer-Verlag.
Ross, S. M. 1997. Simulation. San Diego: Academic Press.
RANEXP Function
Returns a random variate from an exponential distribution.
Category: Random Number
Tip: If you want to change the seed value during execution, you must use the CALL RANEXP routine instead of the RANEXP function.

Syntax
RANEXP(seed)

Arguments
seed
is a numeric constant, variable, or expression with an integer value. If seed ≤ 0, the time of day is used to initialize the seed stream.
Range: seed < 2^31 − 1
See: “Seed Values” on page 306 for more information about seed values

Details
The RANEXP function returns a variate that is generated from an exponential distribution with parameter 1. An inverse transform method applied to a RANUNI uniform variate is used.
An exponential variate X with parameter LAMBDA can be generated:
   x=ranexp(seed)/lambda;

An extreme value variate X with location parameter ALPHA and scale parameter BETA can be generated:
   x=alpha-beta*log(ranexp(seed));

A geometric variate X with parameter P can be generated as follows:
   x=floor(-ranexp(seed)/log(1-p));

For a discussion about seeds and streams of data, as well as examples of using the random-number functions, see “Generating Multiple Variables from One Seed in Random-Number Functions” on page 315.
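The transformations in the Details section can be combined in one DATA step, as in the following sketch. The data set name, the seed, and the parameter values LAMBDA=2 and P=0.25 are illustrative assumptions.

```sas
data expo_demo;
   seed   = 20090301;   /* hypothetical positive seed below 2**31-1 */
   lambda = 2;          /* rate of the scaled exponential variate   */
   p      = 0.25;       /* parameter of the geometric variate       */
   do i = 1 to 5;
      x_expo = ranexp(seed)/lambda;
      /* log(1-p) is negative, so the quotient is positive */
      /* and FLOOR returns a nonnegative integer.          */
      x_geom = floor(-ranexp(seed)/log(1-p));
      output;
   end;
run;

proc print data=expo_demo noobs;
   var i x_expo x_geom;
run;
```

Both calls draw from the single stream that the first RANEXP call initializes. To change the seed during execution, the CALL RANEXP routine is the documented alternative.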
Comparisons The CALL RANEXP routine, an alternative to the RANEXP function, gives greater control of the seed and random number streams.
See Also Functions and CALL routines: “RAND Function” on page 1034 “CALL RANEXP Routine” on page 481
RANGAM Function
Returns a random variate from a gamma distribution.
Category: Random Number
Tip: If you want to change the seed value during execution, you must use the CALL RANGAM routine instead of the RANGAM function.

Syntax
RANGAM(seed,a)

Arguments
seed
is a numeric constant, variable, or expression with an integer value. If seed ≤ 0, the time of day is used to initialize the seed stream.
Range: seed < 2^31 − 1
See: “Seed Values” on page 306 for more information about seed values
a
is a numeric constant, variable, or expression that specifies the shape parameter.
Range: a > 0

Details
The RANGAM function returns a variate that is generated from a gamma distribution with parameter a. For a > 1, an acceptance-rejection method due to Cheng (1977) (See “References” on page 1213) is used. For a ≤ 1, an acceptance-rejection method due to Fishman is used (1978, Algorithm G2) (See “References” on page 1213).
A gamma variate X with shape parameter ALPHA and scale BETA can be generated:
   x=beta*rangam(seed,alpha);

If 2*ALPHA is an integer, a chi-square variate X with 2*ALPHA degrees of freedom can be generated:
   x=2*rangam(seed,alpha);

If N is a positive integer, an Erlang variate X can be generated:
   x=beta*rangam(seed,N);

It has the distribution of the sum of N independent exponential variates whose means are BETA.
And finally, a beta variate X with parameters ALPHA and BETA can be generated:
   y1=rangam(seed,alpha);
   y2=rangam(seed,beta);
   x=y1/(y1+y2);

For a discussion about seeds and streams of data, as well as examples of using the random-number functions, see “Generating Multiple Variables from One Seed in Random-Number Functions” on page 315.
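As a sketch, the gamma, chi-square, and beta constructions described above can be generated side by side. The data set name, the seed, and the values ALPHA=2.5 and BETA=1.5 are assumptions made for this illustration.

```sas
data gamma_demo;
   seed  = 4321;   /* hypothetical positive seed */
   alpha = 2.5;
   beta  = 1.5;
   do i = 1 to 5;
      x_gamma = beta*rangam(seed,alpha);   /* shape ALPHA, scale BETA      */
      x_chisq = 2*rangam(seed,alpha);      /* 2*ALPHA = 5 degrees of freedom */
      y1 = rangam(seed,alpha);
      y2 = rangam(seed,beta);
      x_beta = y1/(y1+y2);                 /* beta variate, lies in (0,1)  */
      output;
   end;
run;

proc print data=gamma_demo noobs;
   var i x_gamma x_chisq x_beta;
run;
```

Note that BETA plays two roles in the reference text: a scale parameter for the gamma variate and a shape parameter for the beta variate. The sketch reuses one value for both only to stay short.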
Comparisons The CALL RANGAM routine, an alternative to the RANGAM function, gives greater control of the seed and random number streams.
See Also Functions and CALL routines: “RAND Function” on page 1034 “CALL RANGAM Routine” on page 483
RANGE Function
Returns the range of the nonmissing values.
Category: Descriptive Statistics

Syntax
RANGE(argument-1<,...argument-n>)

Arguments
argument
specifies a numeric constant, variable, or expression. At least one nonmissing argument is required. Otherwise, the function returns a missing value. The argument list can consist of a variable list, which is preceded by OF.

Details
The RANGE function returns the difference between the largest and the smallest of the nonmissing arguments.

Examples

SAS Statements           Results
x0=range(.,.);           .
x1=range(-2,6,3);        8
x2=range(2,6,3,.);       4
x3=range(1,6,3,1);       5
x4=range(of x1-x3);      4
RANK Function
Returns the position of a character in the ASCII or EBCDIC collating sequence.
Category: Character
Restriction: “I18N Level 0” on page 305
See: RANK Function in the documentation for your operating environment.

Syntax
RANK(x)

Arguments
x
specifies a character constant, variable, or expression.

Details
The RANK function returns an integer that represents the position of the first character in the character expression. The result depends on your operating environment.

Examples

SAS Statements           Results (ASCII)    Results (EBCDIC)
n=rank('A'); put n;      65                 193

See Also
Functions:
“BYTE Function” on page 417
“COLLATE Function” on page 568
RANNOR Function
Returns a random variate from a normal distribution.
Category: Random Number
Tip: If you want to change the seed value during execution, you must use the CALL RANNOR routine instead of the RANNOR function.

Syntax
RANNOR(seed)

Arguments
seed
is a numeric constant, variable, or expression with an integer value. If seed ≤ 0, the time of day is used to initialize the seed stream.
Range: seed < 2^31 − 1
See: “Seed Values” on page 306 for more information about seed values

Details
The RANNOR function returns a variate that is generated from a normal distribution with mean 0 and variance 1. The Box-Muller transformation of RANUNI uniform variates is used.
A normal variate X with mean MU and variance S2 can be generated with this code:
   x=MU+sqrt(S2)*rannor(seed);

A lognormal variate X with mean exp(MU + S2/2) and variance exp(2*MU + 2*S2) − exp(2*MU + S2) can be generated with this code:
   x=exp(MU+sqrt(S2)*rannor(seed));

For a discussion about seeds and streams of data, as well as examples of using the random-number functions, see “Generating Multiple Variables from One Seed in Random-Number Functions” on page 315.
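One way to check the stated moments empirically is to generate a large sample and summarize it, as in this sketch. The data set name, the seed, the sample size, and the values MU=1 and S2=0.25 are assumptions chosen for illustration.

```sas
data normal_demo;
   seed = 1234;   /* hypothetical positive seed */
   mu   = 1;
   s2   = 0.25;   /* variance, so the standard deviation is 0.5 */
   do i = 1 to 10000;
      x_norm = mu + sqrt(s2)*rannor(seed);
      x_logn = exp(mu + sqrt(s2)*rannor(seed));
      output;
   end;
   keep x_norm x_logn;
run;

/* Sample means and variances, for comparison with the formulas */
proc means data=normal_demo n mean var;
   var x_norm x_logn;
run;
```

With these values the formulas give a normal mean of 1 and variance of 0.25, and a lognormal mean of exp(1.125) and variance of exp(2.5) − exp(2.25). The sample statistics should fall near those values but will not match them exactly.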
Comparisons The CALL RANNOR routine, an alternative to the RANNOR function, gives greater control of the seed and random number streams.
See Also Functions and CALL routines: “RAND Function” on page 1034 “CALL RANNOR Routine” on page 485
RANPOI Function
Returns a random variate from a Poisson distribution.
Category: Random Number
Tip: If you want to change the seed value during execution, you must use the CALL RANPOI routine instead of the RANPOI function.

Syntax
RANPOI(seed,m)

Arguments
seed
is a numeric constant, variable, or expression with an integer value. If seed ≤ 0, the time of day is used to initialize the seed stream.
Range: seed < 2^31 − 1
See: “Seed Values” on page 306 for more information about seed values
m
is a numeric constant, variable, or expression that specifies the mean of the distribution.
Range: m ≥ 0

Details
The RANPOI function returns a variate that is generated from a Poisson distribution with mean m. For m < 85, an inverse transform method applied to a RANUNI uniform variate is used (Fishman 1976) (See “References” on page 1213). For m ≥ 85, the normal approximation of a Poisson random variable is used. To expedite execution, internal variables are calculated only on initial calls (that is, with each new m).
For a discussion about seeds and streams of data, as well as examples of using the random-number functions, see “Generating Multiple Variables from One Seed in Random-Number Functions” on page 315.
Comparisons The CALL RANPOI routine, an alternative to the RANPOI function, gives greater control of the seed and random number streams.
See Also Functions and CALL routines: “RAND Function” on page 1034 “CALL RANPOI Routine” on page 491
RANTBL Function
Returns a random variate from a tabled probability distribution.
Category: Random Number
Tip: If you want to change the seed value during execution, you must use the CALL RANTBL routine instead of the RANTBL function.

Syntax
RANTBL(seed,p1,…pi…,pn)

Arguments
seed
is a numeric constant, variable, or expression with an integer value. If seed ≤ 0, the time of day is used to initialize the seed stream.
Range: seed < 2^31 − 1
See: “Seed Values” on page 306 for more information about seed values
pi
is a numeric constant, variable, or expression. Range: 0 ≤ pi ≤ 1 for 0 )
Arguments
argument
is a numeric constant, variable, or expression.
Tip: The argument list can consist of a variable list, which is preceded by OF.

Details
The root mean square is the square root of the arithmetic mean of the squares of the values. If all the arguments are missing values, then the result is a missing value. Otherwise, the result is the root mean square of the non-missing values.
Let n be the number of arguments with non-missing values, and let \(x_1, x_2, \ldots, x_n\) be the values of those arguments. The root mean square is

\[
\sqrt{\frac{x_1^{2} + x_2^{2} + \cdots + x_n^{2}}{n}}
\]
Examples

SAS Statements           Results
x1=rms(1,7);             5
x2=rms(.,1,5,11);        7
x3=rms(of x1-x2);        6.0827625303
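As a worked check of the second example, the missing argument is dropped, so n = 3 and only the values 1, 5, and 11 enter the root mean square:

```latex
\[
\mathrm{rms}(.,1,5,11)
  = \sqrt{\frac{1^{2} + 5^{2} + 11^{2}}{3}}
  = \sqrt{\frac{1 + 25 + 121}{3}}
  = \sqrt{49}
  = 7
\]
```

The third example follows the same pattern with x1 = 5 and x2 = 7: the mean of 25 and 49 is 37, and the square root of 37 is approximately 6.0827625303.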
ROUND Function
Rounds the first argument to the nearest multiple of the second argument, or to the nearest integer when the second argument is omitted.
Category: Truncation

Syntax
ROUND (argument <,rounding-unit>)

Arguments
argument
is a numeric constant, variable, or expression to be rounded.
rounding-unit
is a positive, numeric constant, variable, or expression that specifies the rounding unit.
Details

Basic Concepts
The ROUND function rounds the first argument to a value that is very close to a multiple of the second argument. The result might not be an exact multiple of the second argument.

Differences between Binary and Decimal Arithmetic
Computers use binary arithmetic with finite precision. If you work with numbers that do not have an exact binary representation, computers often produce results that differ slightly from the results that are produced with decimal arithmetic.
For example, the decimal values 0.1 and 0.3 do not have exact binary representations. In decimal arithmetic, 3*0.1 is exactly equal to 0.3, but this equality is not true in binary arithmetic. As the following example shows, if you write these two values in SAS, they appear the same. If you compute the difference, however, you can see that the values are different.
   data _null_;
      point_three=0.3;
      three_times_point_one=3*0.1;
      difference=point_three - three_times_point_one;
      put point_three= ;
      put three_times_point_one= ;
      put difference= ;
   run;

The following lines are written to the SAS log:
   point_three= 0.3
   three_times_point_one= 0.3
   difference= -5.55112E-17

Operating Environment Information: The example above was executed in a z/OS environment. If you use other operating environments, the results will be slightly different.
The Effects of Rounding
Rounding by definition finds an exact multiple of the rounding unit that is closest to the value to be rounded. For example, 0.33 rounded to the nearest tenth equals 3*0.1 or 0.3 in decimal arithmetic. In binary arithmetic, 0.33 rounded to the nearest tenth equals 3*0.1, and not 0.3, because 0.3 is not an exact multiple of one tenth in binary arithmetic.
The ROUND function returns the value that is based on decimal arithmetic, even though this value is sometimes not the exact, mathematically correct result. In the example ROUND(0.33,0.1), ROUND returns 0.3 and not 3*0.1.

Expressing Binary Values
If the characters "0.3" appear as a constant in a SAS program, the value is computed by the standard informat as 3/10. To be consistent with the standard informat, ROUND(0.33,0.1) computes the result as 3/10, and the following statement produces the results that you would expect.
   if round(x,0.1) = 0.3 then
      ... more SAS statements ...

However, if you use the variable Y instead of the constant 0.3, as the following statement shows, the results might be unexpected depending on how the variable Y is computed.
   if round(x,0.1) = y then
      ... more SAS statements ...

If SAS reads Y as the characters "0.3" using the standard informat, the result is the same as if a constant 0.3 appeared in the IF statement. If SAS reads Y with a different informat, or if a program other than SAS reads Y, then there is no guarantee that the characters "0.3" would produce a value of exactly 3/10. Imprecision can also be caused by computation involving numbers that do not have exact binary representations, or by porting data sets from one operating environment to another that has a different floating-point representation.
If you know that Y is a decimal number with one decimal place, but are not certain that Y has exactly the same value as would be produced by the standard informat, it is better to use the following statement:
   if round(x,0.1) = round(y,0.1) then
      ... more SAS statements ...
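The difference between the two comparisons can be seen in a short sketch in which Y is computed rather than read by the standard informat. The variable values and the messages are illustrative assumptions, and what the first comparison reports depends on the floating-point representation of your operating environment.

```sas
data _null_;
   x = 0.33;
   y = 3*0.1;   /* computed value; might differ slightly from the constant 0.3 */

   /* Compare the rounded X with the unrounded, computed Y */
   if round(x,0.1) = y then put 'Direct comparison: equal';
   else put 'Direct comparison: not equal';

   /* Round both sides to the same unit before comparing */
   if round(x,0.1) = round(y,0.1) then put 'Rounded comparison: equal';
   else put 'Rounded comparison: not equal';
run;
```

Rounding both operands puts them through the same conversion, which is why the reference text recommends that form when the origin of Y is uncertain.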
Testing for Approximate Equality
You should not use the ROUND function as a general method to test for approximate equality. Two numbers that differ only in the least significant bit can round to different values if one number rounds down and the other number rounds up. Testing for approximate equality depends on how the numbers have been computed. If both numbers are computed to high relative precision, you could test for approximate equality by using the ABS and the MAX functions, as the following example shows. if abs(x-y) = 0 and < 60.
Examples

SAS Statements                                 Results
time='3:19:24't; s=second(time); put s;        24
time='6:25:65't; s=second(time); put s;        5
time='3:19:60't; s=second(time); put s;        0

See Also
Functions:
“HOUR Function” on page 780
“MINUTE Function” on page 898
SIGN Function
Returns the sign of a value.
Category: Mathematical

Syntax
SIGN(argument)

Arguments
argument
specifies a numeric constant, variable, or expression.

Details
The SIGN function returns the following values:
   -1   if argument < 0
    0   if argument = 0
    1   if argument > 0.

Examples

SAS Statements           Results
x=sign(-5);              -1
x=sign(5);               1
x=sign(0);               0
SIN Function
Returns the sine.
Category: Trigonometric

Syntax
SIN(argument)

Arguments
argument
specifies a numeric constant, variable, or expression and is expressed in radians. If the magnitude of argument is so great that mod(argument,pi) is accurate to less than about three decimal places, SIN returns a missing value.

Examples

SAS Statements           Results
x=sin(0.5);              0.4794255386
x=sin(0);                0
x=sin(3.14159/4);        .7071063121
SINH Function
Returns the hyperbolic sine.
Category: Hyperbolic

Syntax
SINH(argument)

Arguments
argument
specifies a numeric constant, variable, or expression.

Details
The SINH function returns the hyperbolic sine of the argument, which is given by

\[
\frac{e^{\mathit{argument}} - e^{-\mathit{argument}}}{2}
\]

Examples

SAS Statements           Results
x=sinh(0);               0
x=sinh(1);               1.1752011936
x=sinh(-1.0);            -1.175201194
SKEWNESS Function
Returns the skewness of the nonmissing arguments.
Category: Descriptive Statistics

Syntax
SKEWNESS(argument-1,argument-2,argument-3<,...argument-n>)

Arguments
argument
specifies a numeric constant, variable, or expression.

Details
At least three non-missing arguments are required. Otherwise, the function returns a missing value. If all non-missing arguments have equal values, the skewness is mathematically undefined. In that case, the SKEWNESS function returns a missing value and sets _ERROR_ equal to 1.
The argument list can consist of a variable list, which is preceded by OF.

Examples

SAS Statements               Results
x1=skewness(0,1,1);          -1.732050808
x2=skewness(2,4,6,3,1);      0.5901286564
x3=skewness(2,0,0);          1.7320508076
x4=skewness(of x1-x3);       -0.953097714
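This entry does not state the formula that SKEWNESS uses. The results above are consistent with the usual bias-adjusted sample skewness, so the following is offered as an assumption rather than as the documented definition. Here n is the number of non-missing arguments, \(\bar{x}\) is their mean, and s is their sample standard deviation (divisor n − 1).

```latex
\[
\text{skewness} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^{3}
\]

% Worked check for skewness(0,1,1): n = 3, mean = 2/3, s^2 = 1/3
\[
\sum_{i=1}^{3} (x_i - \bar{x})^{3}
  = \left(-\tfrac{2}{3}\right)^{3} + 2\left(\tfrac{1}{3}\right)^{3}
  = -\tfrac{2}{9},
\qquad
s^{3} = \left(\tfrac{1}{3}\right)^{3/2} = \tfrac{1}{3\sqrt{3}}
\]
\[
\frac{3}{2 \cdot 1} \cdot \left(-\tfrac{2}{9}\right) \cdot 3\sqrt{3} = -\sqrt{3} \approx -1.7320508
\]
```

Under this assumption the first example reduces to exactly minus the square root of 3, which agrees with the listed result, and the third example is its mirror image.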
SLEEP Function
For a specified period of time, suspends the execution of a program that invokes this function.
Category: Special
See: SLEEP Function in the documentation for your operating environment.

Syntax
SLEEP(n<, unit>)

Arguments
n
is a numeric constant, variable, or expression that specifies the number of units of time for which you want to suspend execution of a program.
Range: n ≥ 0
unit
is a numeric constant, variable, or expression that specifies the unit of time, as a power of 10, which is applied to n. For example, 1 corresponds to a second, and .001 to a millisecond.
Default: 1 in a Windows PC environment, .001 in other environments

Details
The SLEEP function suspends the execution of a program that invokes this function for a period of time that you specify. The program can be a DATA step, macro, IML, SCL, or anything that can invoke a function. The maximum sleep period for the SLEEP function is 46 days.

Examples

Example 1: Suspending Execution for a Specified Period of Time
The following example tells SAS to delay the execution of the DATA step PAYROLL for 20 seconds:
   data payroll;
      time_slept=sleep(20,1);
      ...more SAS statements...
   run;

Example 2: Suspending Execution Based on a Calculation of Sleep Time
The following example tells SAS to suspend the execution of the DATA step BUDGET until March 1, 2006, at 3:00 AM. SAS calculates the length of the suspension based on the target date and the date and time that the DATA step begins to execute.
   data budget;
      sleeptime='01mar2006:03:00'dt-datetime();
      time_calc=sleep(sleeptime,1);
      ...more SAS statements...;
   run;
See Also CALL routine “CALL SLEEP Routine” on page 509
SMALLEST Function
Returns the kth smallest nonmissing value.
Category: Descriptive Statistics

Syntax
SMALLEST (k, value-1<, value-2 ...>)

Arguments
k
is a numeric constant, variable, or expression that specifies which value to return.
value
specifies a numeric constant, variable, or expression.

Details
If k is missing, less than zero, or greater than the number of values, the result is a missing value and _ERROR_ is set to 1. Otherwise, if k is greater than the number of non-missing values, the result is a missing value but _ERROR_ is not set to 1.

Comparisons
The SMALLEST function differs from the ORDINAL function in that the SMALLEST function ignores missing values, but the ORDINAL function counts missing values.

Examples
This example compares the values that are returned by the SMALLEST function with values that are returned by the ORDINAL function.
   options pageno=1 nodate linesize=80 pagesize=60;

   data comparison;
      label smallest_num='SMALLEST Function'
            ordinal_num='ORDINAL Function';
      do k = 1 to 4;
         smallest_num = smallest(k, 456, 789, .Q, 123);
         ordinal_num = ordinal (k, 456, 789, .Q, 123);
         output;
      end;
   run;

   proc print data=comparison label noobs;
      var k smallest_num ordinal_num;
      title 'Results From the SMALLEST and the ORDINAL Functions';
   run;

Output 4.88 Comparison of Values: The SMALLEST and the ORDINAL Functions

   Results From the SMALLEST and the ORDINAL Functions

        SMALLEST     ORDINAL
   k    Function     Function
   1       123          Q
   2       456        123
   3       789        456
   4         .        789
See Also Functions: “LARGEST Function” on page 850 “ORDINAL Function” on page 951 “PCTL Function” on page 953
SOUNDEX Function
Encodes a string to facilitate searching.
Category: Character
Restriction: SOUNDEX algorithm is English-biased.
Restriction: “I18N Level 0” on page 305

Syntax
SOUNDEX(argument)

Arguments
argument
specifies a character constant, variable, or expression.

Details

Length of Returned Variable
In a DATA step, if the SOUNDEX function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes.

The Basics
The SOUNDEX function encodes a character string according to an algorithm that was originally developed by Margaret K. Odell and Robert C. Russell (US Patents 1261167 (1918) and 1435663 (1922)). The algorithm is described in Knuth, The Art of Computer Programming, Volume 3 (See “References” on page 1213). Note that the SOUNDEX algorithm is English-biased and is less useful for languages other than English.
The SOUNDEX function returns a copy of the argument that is encoded by using the following steps:
1 Retain the first letter in the argument and discard the following letters:
     A E H I O U W Y
2 Assign the following numbers to these classes of letters:
     1: B F P V
     2: C G J K Q S X Z
     3: D T
     4: L
     5: M N
     6: R
3 If two or more adjacent letters have the same classification from Step 2, then discard all but the first. (Adjacent refers to the position in the word before discarding letters.)
The algorithm that is described in Knuth adds trailing zeros and truncates the result to the length of 4. You can perform these operations with other SAS functions.
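One possible way to add the trailing zeros and truncate to four characters is sketched below with the CATS and SUBSTR functions. The variable names NAME and CODE4 are hypothetical, and this is only one of several combinations of functions that could do the job.

```sas
data _null_;
   /* Declare lengths first so that 'amnesty' is not truncated */
   /* to the length of the first value in the DO list.         */
   length name $ 10 code4 $ 4;
   do name = 'Paul', 'amnesty';
      /* CATS removes the blanks that pad the SOUNDEX result,  */
      /* appends three zeros, and SUBSTR keeps 4 characters.   */
      code4 = substr(cats(soundex(name), '000'), 1, 4);
      put name= code4=;
   end;
run;
```

Given the SOUNDEX results P4 and A523 for these two words, the padded codes would be P400 and A523.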
Examples

SAS Statements                                   Results
x=soundex('Paul'); put x;                        P4
word='amnesty'; x=soundex(word); put x;          A523
SPEDIS Function
Determines the likelihood of two words matching, expressed as the asymmetric spelling distance between the two words.
Category: Character
Restriction: “I18N Level 0” on page 305

Syntax
SPEDIS(query,keyword)

Arguments
query
identifies the word to query for the likelihood of a match. SPEDIS removes trailing blanks before comparing the value.
keyword
specifies a target word for the query. SPEDIS removes trailing blanks before comparing the value.

Details

Length of Returned Variable
In a DATA step, if the SPEDIS function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes.

The Basics
SPEDIS returns the distance between the query and a keyword, a nonnegative value that is usually less than 100 but never greater than 200 with the default costs.
SPEDIS computes an asymmetric spelling distance between two words as the normalized cost for converting the keyword to the query word by using a sequence of operations. SPEDIS(QUERY, KEYWORD) is not the same as SPEDIS(KEYWORD, QUERY).
Costs for each operation that is required to convert the keyword to the query are listed in the following table:

Operation    Cost    Explanation
match        0       no change
singlet      25      delete one of a double letter
doublet      50      double a letter
swap         50      reverse the order of two consecutive letters
truncate     50      delete a letter from the end
append       35      add a letter to the end
delete       50      delete a letter from the middle
insert       100     insert a letter in the middle
replace      100     replace a letter in the middle
firstdel     100     delete the first letter
firstins     200     insert a letter at the beginning
firstrep     200     replace the first letter

The distance is the sum of the costs divided by the length of the query. If this ratio is greater than one, the result is rounded down to the nearest whole number.
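The asymmetry and the normalization can be explored with a short sketch that swaps the two arguments. The variable names D1 and D2 are hypothetical.

```sas
data _null_;
   /* Keyword 'fuzzy' is converted to query 'fuzy' by one       */
   /* singlet operation: cost 25, query length 4, and 25/4      */
   /* rounds down to 6.                                         */
   d1 = spedis('fuzy','fuzzy');

   /* With the arguments swapped, 'fuzy' must be converted to   */
   /* 'fuzzy', which requires a different operation and cost.   */
   d2 = spedis('fuzzy','fuzy');

   put d1= d2=;
run;
```

By the cost table, the first distance is floor(25/4) = 6. The second call is normalized by a query length of 5 and uses a different operation, so the two values are not expected to agree.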
Comparisons The SPEDIS function is similar to the COMPLEV and COMPGED functions, but COMPLEV and COMPGED are much faster, especially for long strings.
Examples
   options nodate pageno=1 linesize=64;

   data words;
      input Operation $ Query $ Keyword $;
      Distance = spedis(query,keyword);
      Cost = distance * length(query);
      datalines;
   match fuzzy fuzzy
   singlet fuzy fuzzy
   doublet fuuzzy fuzzy
   swap fzuzy fuzzy
   truncate fuzz fuzzy
   append fuzzys fuzzy
   delete fzzy fuzzy
   insert fluzzy fuzzy
   replace fizzy fuzzy
   firstdel uzzy fuzzy
   firstins pfuzzy fuzzy
   firstrep wuzzy fuzzy
   several floozy fuzzy
   ;

   proc print data = words;
   run;

The output from the DATA step is as follows.

Output 4.89 Costs for SPEDIS Operations

   The SAS System

   Obs    Operation    Query     Keyword    Distance    Cost
     1    match        fuzzy     fuzzy          0          0
     2    singlet      fuzy      fuzzy          6         24
     3    doublet      fuuzzy    fuzzy          8         48
     4    swap         fzuzy     fuzzy         10         50
     5    truncate     fuzz      fuzzy         12         48
     6    append       fuzzys    fuzzy          5         30
     7    delete       fzzy      fuzzy         12         48
     8    insert       fluzzy    fuzzy         16         96
     9    replace      fizzy     fuzzy         20        100
    10    firstdel     uzzy      fuzzy         25        100
    11    firstins     pfuzzy    fuzzy         33        198
    12    firstrep     wuzzy     fuzzy         40        200
    13    several      floozy    fuzzy         50        300
See Also Functions: “COMPLEV Function” on page 580 “COMPGED Function” on page 575
SQRT Function Returns the square root of a value. Category: Mathematical
Syntax SQRT(argument)
Arguments argument
specifies a numeric constant, variable, or expression. Argument must be nonnegative.
Examples

SAS Statements     Results
x=sqrt(36);        6
x=sqrt(25);        5
x=sqrt(4.4);       2.0976176963

STD Function Returns the standard deviation of the nonmissing arguments. Category: Descriptive Statistics
Syntax STD(argument-1,argument-2<,...argument-n>)
Arguments argument
specifies a numeric constant, variable, or expression. At least two nonmissing arguments are required. Otherwise, the function returns a missing value. The argument list can consist of a variable list, which is preceded by OF.
Examples

SAS Statements          Results
x1=std(2,6);            2.8284271247
x2=std(2,6,.);          2.8284271247
x3=std(2,4,6,3,1);      1.9235384062
x4=std(of x1-x3);       0.5224377453

STDERR Function Returns the standard error of the mean of the nonmissing arguments. Category: Descriptive Statistics
Syntax STDERR(argument-1,argument-2<,...argument-n>)
Arguments argument
specifies a numeric constant, variable, or expression. At least two nonmissing arguments are required. Otherwise, the function returns a missing value. The argument list can consist of a variable list, which is preceded by OF.
Examples

SAS Statements             Results
x1=stderr(2,6);            2
x2=stderr(2,6,.);          2
x3=stderr(2,4,6,3,1);      0.8602325267
x4=stderr(of x1-x3);       0.3799224911

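The standard error of the mean is the standard deviation divided by the square root of the number of nonmissing values, so the STDERR results can be cross-checked against STD. The following is a minimal sketch of that check; the variable names are illustrative and not part of the original examples.

```sas
data _null_;
   sd        = std(2, 4, 6, 3, 1);
   count     = n(2, 4, 6, 3, 1);      /* number of nonmissing arguments */
   se_manual = sd / sqrt(count);      /* standard deviation / sqrt(n)   */
   se_func   = stderr(2, 4, 6, 3, 1);
   put se_manual= se_func=;
run;
```

Both values should agree with the result shown for x3 in the table above, up to floating-point rounding.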
STFIPS Function Converts state postal codes to FIPS state codes. Category: State and Zip Code
Syntax STFIPS(postal-code)
Arguments
postal-code
specifies a character expression that contains the two-character standard state postal code. Characters can be mixed case. The function ignores trailing blanks, but generates an error if the expression contains leading blanks.
Details The STFIPS function converts a two-character state postal code (or world-wide GSA geographic code for U.S. territories) to the corresponding numeric U.S. Federal Information Processing Standards (FIPS) code.
Comparisons The STFIPS, STNAME, and STNAMEL functions take the same argument but return different values. STFIPS returns a numeric U.S. Federal Information Processing Standards (FIPS) code. STNAME returns an uppercase state name. STNAMEL returns a mixed case state name.
Examples The examples show the differences when using STFIPS, STNAME, and STNAMEL.

SAS Statements                       Results
fips=stfips('NC'); put fips;         37
state=stname('NC'); put state;       NORTH CAROLINA
state=stnamel('NC'); put state;      North Carolina

See Also Functions: “FIPNAME Function” on page 726 “FIPNAMEL Function” on page 727 “FIPSTATE Function” on page 728 “STNAME Function” on page 1096, “STNAMEL Function” on page 1097
STNAME Function Converts state postal codes to uppercase state names. Category:
State and Zip Code
Syntax STNAME(postal-code)
Arguments postal-code
specifies a character expression that contains the two-character standard state postal code. Characters can be mixed case. The function ignores trailing blanks, but generates an error if the expression contains leading blanks.
Details The STNAME function converts a two-character state postal code (or world-wide GSA geographic code for U.S. territories) to the corresponding state name in uppercase. Note: For Version 6, the maximum length of the value that is returned is 200 characters. For Version 7 and beyond, the maximum length is 20 characters.
Comparisons The STFIPS, STNAME, and STNAMEL functions take the same argument but return different values. STFIPS returns a numeric U.S. Federal Information Processing Standards (FIPS) code. STNAME returns an uppercase state name. STNAMEL returns a mixed case state name.
Examples

SAS Statements                       Results
fips=stfips('NC'); put fips;         37
state=stname('NC'); put state;       NORTH CAROLINA
state=stnamel('NC'); put state;      North Carolina

See Also Functions: “FIPNAME Function” on page 726 “FIPNAMEL Function” on page 727 “FIPSTATE Function” on page 728 “STFIPS Function” on page 1095 “STNAMEL Function” on page 1097
STNAMEL Function Converts state postal codes to mixed case state names. Category: State and Zip Code
Syntax STNAMEL(postal-code)
Arguments postal-code
specifies a character expression that contains the two-character standard state postal code. Characters can be mixed case. The function ignores trailing blanks, but generates an error if the expression contains leading blanks.
Details If the STNAMEL function returns a value to a variable that has not yet been assigned a length, by default the variable is assigned a length of 20. The STNAMEL function converts a two-character state postal code (or world-wide GSA geographic code for U.S. territories) to the corresponding state name in mixed case. Note: For Version 6, the maximum length of the value that is returned is 200 characters. For Version 7 and beyond, the maximum length is 20 characters.
Comparisons The STFIPS, STNAME, and STNAMEL functions take the same argument but return different values. STFIPS returns a numeric U.S. Federal Information Processing Standards (FIPS) code. STNAME returns an uppercase state name. STNAMEL returns a mixed case state name.
Examples The examples show the differences when using STFIPS, STNAME, and STNAMEL.

SAS Statements                       Results
fips=stfips('NC'); put fips;         37
state=stname('NC'); put state;       NORTH CAROLINA
state=stnamel('NC'); put state;      North Carolina

See Also Functions: “FIPNAME Function” on page 726 “FIPNAMEL Function” on page 727 “FIPSTATE Function” on page 728 “STFIPS Function” on page 1095
STRIP Function Returns a character string with all leading and trailing blanks removed. Category: Character Restriction: “I18N Level 0” on page 305
Syntax STRIP(string)
Arguments string
is a character constant, variable, or expression.
Details
Length of Returned Variable In a DATA step, if the STRIP function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the argument.
The Basics The STRIP function returns the argument with all leading and trailing blanks removed. If the argument is blank, STRIP returns a string with a length of zero. Assigning the results of STRIP to a variable does not affect the length of the receiving variable. If the value that is trimmed is shorter than the length of the receiving variable, SAS pads the value with new trailing blanks.
Note: The STRIP function is useful for concatenation because the concatenation operator does not remove leading or trailing blanks.
Comparisons The following list compares the STRIP function with the TRIM and TRIMN functions:
• For strings that are blank, the STRIP and TRIMN functions return a string with a length of zero, whereas the TRIM function returns a single blank.
• For strings that lack leading blanks, the STRIP and TRIMN functions return the same value.
• For strings that lack leading blanks but have at least one non-blank character, the STRIP and TRIM functions return the same value.
Note: STRIP(string) returns the same result as TRIMN(LEFT(string)), but the STRIP function runs faster.
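The point about concatenation can be illustrated with a short DATA step. This is a minimal sketch; the variable names and values are made up for illustration, and the asterisks only mark where each result begins and ends.

```sas
data _null_;
   length first last $ 10;
   first = '  Ada';        /* leading blanks, plus trailing padding to length 10 */
   last  = 'Lovelace';     /* trailing padding to length 10                      */
   /* The concatenation operator keeps every blank in both operands.  */
   raw   = '*' || first || ' ' || last || '*';
   /* STRIP removes leading and trailing blanks before concatenation. */
   clean = '*' || strip(first) || ' ' || strip(last) || '*';
   put raw= / clean=;
run;
```

The value of raw should retain the embedded padding between the names, whereas clean should contain the two names separated by a single blank.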
Examples The following example shows the results of using the STRIP function to delete leading and trailing blanks.

options pageno=1 nodate ls=80 ps=60;

data lengthn;
   input string $char8.;
   original = '*' || string || '*';
   stripped = '*' || strip(string) || '*';
   datalines;
abcd
  abcd
    abcd
abcdefgh
 x y z
;

proc print data=lengthn;
run;

Output 4.90 Results from the STRIP Function

                 The SAS System                 1

Obs     string       original     stripped
 1     abcd         *abcd    *    *abcd*
 2       abcd       *  abcd  *    *abcd*
 3         abcd     *    abcd*    *abcd*
 4     abcdefgh     *abcdefgh*    *abcdefgh*
 5      x y z       * x y z  *    *x y z*

See Also Functions: “CAT Function” on page 526 “CATS Function” on page 532 “CATT Function” on page 534 “CATX Function” on page 537 “LEFT Function” on page 854 “TRIM Function” on page 1132 “TRIMN Function” on page 1134
SUBPAD Function Returns a substring that has a length you specify, using blank padding if necessary. Category: Character Restriction: “I18N Level 1” on page 305
Syntax SUBPAD(string, position<, length>)
Arguments
string
specifies a character constant, variable, or expression.
position
is a positive integer that specifies the position of the first character in the substring.
length
is a non-negative integer that specifies the length of the substring. If you do not specify length, the SUBPAD function returns the substring that extends from the position that you specify to the end of the string.
Details In a DATA step, if the SUBPAD function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. If the substring that you specify extends beyond the length of the string, the result is padded with blanks.
Comparisons The SUBPAD function is similar to the SUBSTR function except for the following differences:
• If the value of length in SUBPAD is zero, SUBPAD returns a zero-length string. If the value of length in SUBSTR is zero, SUBSTR
   • writes a note to the log stating that the third argument is invalid
   • sets _ERROR_=1
   • returns the substring that extends from the position that you specified to the end of the string.
• If the substring that you specify extends past the end of the string, SUBPAD pads the result with blanks to yield the length that you requested. If the substring that you specify extends past the end of the string, SUBSTR
   • writes a note to the log stating that the third argument is invalid
   • sets _ERROR_=1
   • returns the substring that extends from the position that you specified to the end of the string.
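The two SUBPAD behaviors described above can be tried side by side. The following is a minimal sketch with illustrative variable names; the asterisks mark the boundaries of each returned value.

```sas
data _null_;
   s = 'abc';
   /* Five characters are requested from position 2, but only 'bc'    */
   /* exists, so SUBPAD is documented to pad the result with blanks.  */
   padded = '*' || subpad(s, 2, 5) || '*';
   /* A length of zero is valid for SUBPAD and is documented to yield */
   /* a zero-length string.                                           */
   empty  = '*' || subpad(s, 2, 0) || '*';
   put padded= empty=;
run;
```

Per the description above, padded is expected to show 'bc' followed by three blanks between the asterisks, and empty is expected to show two adjacent asterisks. The same calls made with SUBSTR would instead set _ERROR_ to 1 and write a note to the log.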
See Also Function: “SUBSTRN Function” on page 1104
SUBSTR (left of =) Function Replaces character value contents. Category: Character Restriction: “I18N Level 0” on page 305 Tip: DBCS equivalent functions are KSUBSTR and KSUBSTRB in SAS National Language Support (NLS): Reference Guide.
Syntax SUBSTR(variable, position<,length>)=characters-to-replace
Arguments
variable
specifies a character variable.
position
specifies a numeric constant, variable, or expression that is the beginning character position.
length
specifies a numeric constant, variable, or expression that is the length of the substring that will be replaced.
Restriction: length cannot be larger than the length of the expression that remains in variable after position.
Tip: If you omit length, SAS uses all of the characters on the right side of the assignment statement to replace the values of variable.
characters-to-replace
specifies a character constant, variable, or expression that will replace the contents of variable.
Tip: Enclose a literal string of characters in quotation marks.
Details If you use an undeclared variable, it will be assigned a default length of 8 when the SUBSTR function is compiled. When you use the SUBSTR function on the left side of an assignment statement, SAS replaces the value of variable with the expression on the right side. SUBSTR replaces length characters starting at the character that you specify in position.
Examples

SAS Statements                                Results
a='KIDNAP'; substr(a,1,3)='CAT'; put a;       CATNAP
b=a; substr(b,4)='TY'; put b;                 CATTY

See Also Function: “SUBSTR (right of =) Function” on page 1103
SUBSTR (right of =) Function Extracts a substring from an argument. Category: Character Restriction: “I18N Level 0” on page 305 Tip: DBCS equivalent functions are KSUBSTR and KSUBSTRB in SAS National Language Support (NLS): Reference Guide.
Syntax <variable=>SUBSTR(string, position<,length>)
Arguments
variable
specifies a valid SAS variable name.
string
specifies a character constant, variable, or expression.
position
specifies a numeric constant, variable, or expression that is the beginning character position.
length
specifies a numeric constant, variable, or expression that is the length of the substring to extract.
Interaction: If length is zero, a negative value, or larger than the length of the expression that remains in string after position, SAS extracts the remainder of the expression. SAS also sets _ERROR_ to 1 and prints a note to the log indicating that the length argument is invalid.
Tip: If you omit length, SAS extracts the remainder of the expression.
Details In a DATA step, if the SUBSTR (right of =) function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the first argument. The SUBSTR function returns a portion of an expression that you specify in string. The portion begins with the character that you specify by position, and is the number of characters that you specify in length.
Examples

SAS Statements                     Results
                                   ----+----1----+----2
date='06MAY98';                    MAY 98
month=substr(date,3,3);
year=substr(date,6,2);
put @1 month @5 year;

See Also Functions: “SUBPAD Function” on page 1100 “SUBSTR (left of =) Function” on page 1101 “SUBSTRN Function” on page 1104
SUBSTRN Function Returns a substring, allowing a result with a length of zero. Category: Character Restriction: “I18N Level 1” on page 305 Tip: KSUBSTR in SAS National Language Support (NLS): Reference Guide has the same functionality.
Syntax SUBSTRN(string, position<, length>)
Arguments
string
specifies a character or numeric constant, variable, or expression. If string is numeric, then it is converted to a character value that uses the BEST32. format. Leading and trailing blanks are removed, and no message is sent to the SAS log.
position
is an integer that specifies the position of the first character in the substring.
length
is an integer that specifies the length of the substring. If you do not specify length, the SUBSTRN function returns the substring that extends from the position that you specify to the end of the string.
Details
Length of Returned Variable In a DATA step, if the SUBSTRN function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the first argument.
The Basics The following information applies to the SUBSTRN function:
• The SUBSTRN function returns a string with a length of zero if either position or length has a missing value.
• If the position that you specify is non-positive, the result is truncated at the beginning, so that the first character of the result is the first character of the string. The length of the result is reduced accordingly.
• If the length that you specify extends beyond the end of the string, the result is truncated at the end, so that the last character of the result is the last character of the string.
Using the SUBSTRN Function in a Macro
If you call SUBSTRN by using the %SYSFUNC macro, then the macro processor resolves the first argument (string) to determine whether the argument is character or numeric. If you do not want the first argument to be evaluated as a macro expression, use one of the macro-quoting functions in the first argument.
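A call through %SYSFUNC might look like the following minimal sketch. The macro variable names are hypothetical, and the second statement only illustrates where a macro-quoting function such as %STR would be placed; no particular result is claimed for it here.

```sas
%let word = abcdef;

/* Extract three characters starting at position 2 of the value of &word. */
%let piece = %sysfunc(substrn(&word, 2, 3));
%put piece=&piece;

/* Quote the first argument so that the macro processor does not */
/* treat it as an expression to be evaluated.                    */
%let kept = %sysfunc(substrn(%str(3+4), 1, 3));
%put kept=&kept;
```

Arguments that are passed through %SYSFUNC are not enclosed in quotation marks, because the macro facility treats all values as text.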
Comparisons The following table lists comparisons between the SUBSTRN and the SUBSTR functions:

Table 4.6 Comparisons between SUBSTRN and SUBSTR

Condition                        Function    Result
the value of position is         SUBSTRN     returns a result beginning at the first
nonpositive                                  character of the string.
the value of position is         SUBSTR      • writes a note to the log stating that the
nonpositive                                    second argument is invalid.
                                             • sets _ERROR_=1.
                                             • returns the substring that extends from the
                                               position that you specified to the end of
                                               the string.
the value of length is           SUBSTRN     returns a result with a length of zero.
nonpositive
the value of length is           SUBSTR      • writes a note to the log stating that the
nonpositive                                    third argument is invalid.
                                             • sets _ERROR_=1.
                                             • returns the substring that extends from the
                                               position that you specified to the end of
                                               the string.
the substring that you specify   SUBSTRN     truncates the result.
extends past the end of the
string
the substring that you specify   SUBSTR      • writes a note to the log stating that the
extends past the end of the                    third argument is invalid.
string                                       • sets _ERROR_=1.
                                             • returns the substring that extends from the
                                               position that you specified to the end of
                                               the string.

Examples
Example 1: Manipulating Strings with the SUBSTRN Function The following example shows how to manipulate strings with the SUBSTRN function.

options pageno=1 nodate ls=80 ps=60;

data test;
   retain string "abcd";
   drop string;
   do Position = -1 to 6;
      do Length = max(-1,-position) to 7-position;
         Result = substrn(string, position, length);
         output;
      end;
   end;
   datalines;
abcd
;

proc print noobs data=test;
run;

Output 4.91 Output from the SUBSTRN Function

            The SAS System            1

Position    Length    Result
   -1          1
   -1          2
   -1          3      a
   -1          4      ab
   -1          5      abc
   -1          6      abcd
   -1          7      abcd
   -1          8      abcd
    0          0
    0          1
    0          2      a
    0          3      ab
    0          4      abc
    0          5      abcd
    0          6      abcd
    0          7      abcd
    1         -1
    1          0
    1          1      a
    1          2      ab
    1          3      abc
    1          4      abcd
    1          5      abcd
    1          6      abcd
    2         -1
    2          0
    2          1      b
    2          2      bc
    2          3      bcd
    2          4      bcd
    2          5      bcd
    3         -1
    3          0
    3          1      c
    3          2      cd
    3          3      cd
    3          4      cd
    4         -1
    4          0
    4          1      d
    4          2      d
    4          3      d
    5         -1
    5          0
    5          1
    5          2
    6         -1
    6          0
    6          1

Example 2: Comparison between the SUBSTR and SUBSTRN Functions The following example compares the results of using the SUBSTR function and the SUBSTRN function when the first argument is numeric.

data _null_;
   substr_result = "*" || substr(1234.5678,2,6) || "*";
   put substr_result=;
   substrn_result = "*" || substrn(1234.5678,2,6) || "*";
   put substrn_result=;
run;

Output 4.92 Results from the SUBSTR and SUBSTRN Functions

substr_result=*  1234*
substrn_result=*234.56*

See Also Functions: “SUBPAD Function” on page 1100 “SUBSTR (left of =) Function” on page 1101 “SUBSTR (right of =) Function” on page 1103
SUM Function Returns the sum of the nonmissing arguments. Category:
Descriptive Statistics
Syntax SUM(argument,argument, ...)
Arguments
argument
specifies a numeric constant, variable, or expression. If all the arguments have missing values, then one of the following occurs:
• If you use only one argument, then the value of that argument is returned.
• If you use two or more arguments, then a standard missing value (.) is returned.
Otherwise, the result is the sum of the nonmissing values. The argument list can consist of a variable list, which is preceded by OF.
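The treatment of missing values is the main practical difference between the SUM function and the + operator. The following minimal sketch, with illustrative variable names, contrasts the two.

```sas
data _null_;
   a = 4;
   b = .;
   total_func = sum(a, b);   /* the missing value of b is ignored      */
   total_plus = a + b;       /* the + operator propagates the missing  */
   all_miss   = sum(., .);   /* every argument is missing              */
   put total_func= total_plus= all_miss=;
run;
```

Here total_func is 4, whereas total_plus and all_miss are both missing. SAS also writes a note to the log about the missing value that the + operation generates.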
Examples

SAS Statements                                          Results
x1=sum(4,9,3,8);                                        24
x2=sum(4,9,3,8,.);                                      24
x1=9; x2=39; x3=sum(of x1-x2);                          48
x1=5; x2=6; x3=4; x4=9; y1=34; y2=12; y3=74; y4=39;     183
result=sum(of x1-x4, of y1-y5);
x1=55; x2=35; x3=6; x4=sum(of x1-x3, 5);                101
x1=7; x2=7; x5=sum(x1-x2);                              0
y1=20; y2=30; x6=sum(of y:);                            50

SUMABS Function Returns the sum of the absolute values of the non-missing arguments. Category: Descriptive Statistics
Syntax SUMABS(value-1<, value-2 ...>)
Arguments value
specifies a numeric expression.
Details If all arguments have missing values, then the result is a missing value. Otherwise, the result is the sum of the absolute values of the non-missing values.
Examples Example 1: Calculating the Sum of Absolute Values
The following example returns the sum of the absolute values of the non-missing arguments.

data _null_;
   x=sumabs(1,.,-2,0,3,.q,-4);
   put x=;
run;

SAS writes the following output to the log:

x=10

Example 2: Calculating the Sum of Absolute Values When You Use a Variable List
The following example uses a variable list and returns the sum of the absolute values of the non-missing arguments.

data _null_;
   x1 = 1;
   x2 = 3;
   x3 = 4;
   x4 = 3;
   x5 = 1;
   x = sumabs(of x1-x5);
   put x=;
run;

SAS writes the following output to the log:

x=12

SYMEXIST Function Returns an indication of the existence of a macro variable. Category: Macro See: SYMEXIST Function in SAS Macro Language: Reference
Syntax SYMEXIST (argument)
Argument
argument
can be one of the following items:
• the name of a macro variable within double quotation marks but without an ampersand
• the name of a DATA step character variable, specified with no quotation marks, which contains a macro variable name
• a character expression that constructs a macro variable name
Details The SYMEXIST function searches any enclosing local symbol tables and then the global symbol table for the indicated macro variable and returns 1 if the macro variable is found or 0 if the macro variable is not found. For more information, see the “SYMEXIST Function” in SAS Macro Language: Reference.
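The first two argument forms can be sketched as follows. The macro variable REGION and the name no_such_macro_var are hypothetical and serve only as an illustration.

```sas
%let region = East;   /* hypothetical macro variable */

data _null_;
   /* Macro variable name in double quotation marks, no ampersand. */
   if symexist("region") then put "region exists";
   else put "region does not exist";

   /* Macro variable name held in a DATA step character variable. */
   name  = "no_such_macro_var";
   found = symexist(name);
   put found=;
run;
```

If no macro variable named no_such_macro_var has been defined in the session, found is expected to be 0.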
SYMGET Function Returns the value of a macro variable during DATA step execution. Category: Macro
Syntax SYMGET(argument)
Arguments argument
can be one of the following items:
• the name of a macro variable within double quotation marks but without an ampersand
• the name of a DATA step character variable, specified with no quotation marks, which contains a macro variable name
• a character expression that constructs a macro variable name
Details If the SYMGET function returns a value to a variable that has not yet been assigned a length, by default the variable is assigned a length of 200. The SYMGET function returns the value of a macro variable during DATA step execution. For more information, see the “SYMGET Function” in SAS Macro Language: Reference.
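Because SYMGET returns a character value with a default length of 200, it can help to declare the receiving variable first and to convert the value explicitly when a number is needed. The following is a minimal sketch; the macro variable CUTOFF and the DATA step variable names are hypothetical.

```sas
%let cutoff = 100;   /* hypothetical macro variable */

data _null_;
   length txt $ 8;             /* avoid the default length of 200       */
   txt   = symget("cutoff");   /* value is retrieved at execution time  */
   limit = input(txt, 12.);    /* convert the character value to a number */
   put txt= limit=;
run;
```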
See Also CALL routine: “CALL SYMPUT Routine” on page 518 SAS Macro Language: Reference
SYMGLOBL Function Returns an indication of whether a macro variable is in global scope to the DATA step during DATA step execution. Category: Macro See: SYMGLOBL Function in SAS Macro Language: Reference
Syntax SYMGLOBL (argument)
Argument
argument
can be one of the following items:
• the name of a macro variable within double quotation marks but without an ampersand.
• the name of a DATA step character variable, specified with no quotation marks, which contains a macro variable name.
• a character expression that constructs a macro variable name.
Details The SYMGLOBL function searches only the global symbol table for the indicated macro variable and returns 1 if the macro variable is found or 0 if the macro variable is not found. SYMGLOBL is fully documented in SAS Macro Language: Reference.
SYMLOCAL Function Returns an indication of whether a macro variable is in local scope to the DATA step during DATA step execution. Category: Macro See: SYMLOCAL Function in SAS Macro Language: Reference
Syntax SYMLOCAL (argument)
Argument
argument
can be one of the following items:
• the name of a macro variable within double quotation marks but without an ampersand.
• the name of a DATA step character variable, specified with no quotation marks, which contains a macro variable name.
• a character expression that constructs a macro variable name.
Details The SYMLOCAL function searches the enclosing local symbol tables for the indicated macro variable and returns 1 if the macro variable is found or 0 if the macro variable is not found.
SYMLOCAL is fully documented in SAS Macro Language: Reference.
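The difference between SYMGLOBL and SYMLOCAL shows up when a DATA step runs inside a macro. The following is a minimal sketch; the macro name SCOPE_DEMO and the macro variables G and L are hypothetical.

```sas
%let g = global value;          /* hypothetical global macro variable */

%macro scope_demo;
   %local l;                    /* hypothetical local macro variable  */
   %let l = local value;

   data _null_;
      g_is_global = symglobl("g");
      l_is_global = symglobl("l");
      l_is_local  = symlocal("l");
      put g_is_global= l_is_global= l_is_local=;
   run;
%mend scope_demo;

%scope_demo
```

Under the search rules described above, G should be found in the global symbol table, and L should be found only in the local symbol table of the enclosing macro.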
SYSGET Function Returns the value of the specified operating environment variable. Category: Special See: SYSGET Function in the documentation for your operating environment.
Syntax SYSGET(operating-environment-variable)
Arguments operating-environment-variable
is a character constant, variable, or expression with a value that is the name of an operating environment variable. The case of operating-environment-variable must agree with the case that is stored in the operating environment. Trailing blanks in the argument of SYSGET are significant. Use the TRIM function to remove them.
Operating Environment Information: The term operating-environment-variable used in the description of this function refers to a name that represents a numeric, character, or logical value in the operating environment. Refer to the SAS documentation for your operating environment for details.
Details If the SYSGET function returns a value to a variable that has not yet been assigned a length, by default the variable is assigned a length of 200. If the value of the operating environment variable is truncated or the variable is not defined in the operating environment, SYSGET displays a warning message in the SAS log.
Examples This example obtains the value of two environment variables in the UNIX environment:

data _null_;
   length result $200;
   input env_var $;
   result=sysget(trim(env_var));
   put env_var= result=;
   datalines;
USER
PATH
;

Executing this DATA step for user ABCDEF displays these lines:

ENV_VAR=USER RESULT=abcdef
ENV_VAR=PATH RESULT=path-for-abcdef

See Also Functions: “ENVLEN Function” on page 647
SYSMSG Function Returns error or warning message text from processing the last data set or external file function. Category: SAS File I/O Category: External Files
Syntax SYSMSG()
Details SYSMSG returns the text of error messages or warning messages that are produced when a data set or external file access function encounters an error condition. If no error message is available, the returned value is blank. The internally stored error message is reset to blank after a call to SYSMSG, so subsequent calls to SYSMSG before another error condition occurs return blank values.
Examples This example uses SYSMSG to write to the SAS log the error message generated if FETCH cannot copy the next observation into the Data Set Data Vector. The return code is 0 only when a record is fetched successfully:

%let rc=%sysfunc(fetch(&dsid));
%if &rc ne 0 %then
   %put %sysfunc(sysmsg());

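The same pattern can be used inside a DATA step. The following minimal sketch assumes that no data set named WORK.NO_SUCH_TABLE exists, so the OPEN function fails and returns 0; the data set name and the variable names are hypothetical.

```sas
data _null_;
   length msg $ 200;
   dsid = open('work.no_such_table');   /* hypothetical data set name    */
   if dsid = 0 then do;
      msg = sysmsg();                   /* text of the last I/O error    */
      put msg=;
      again = sysmsg();                 /* reset after the first call,   */
      put again=;                       /* so this is expected to be blank */
   end;
   else rc = close(dsid);
run;
```

The second call illustrates the reset behavior described in Details: the stored message is cleared once SYSMSG has returned it.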
See Also Functions: “FETCH Function” on page 660 “SYSRC Function” on page 1118
SYSPARM Function Returns the system parameter string. Category: Special
Syntax SYSPARM()
Details If the SYSPARM function returns a value to a variable that has not yet been assigned a length, by default the variable is assigned a length of 200. SYSPARM allows you to access a character string specified with the SYSPARM= system option at SAS invocation or in an OPTIONS statement. Note: If the SYSPARM= system option is not specified, the SYSPARM function returns a string with a length of zero.
Examples This example shows the SYSPARM= system option and the SYSPARM function.

options sysparm='yes';

data a;
   if sysparm()='yes' then
      do;
         ...SAS Statements...
      end;
run;

See Also System option: SYSPARM= System Option in SAS Macro Language: Reference
SYSPROCESSID Function Returns the process ID of the current process. Category: Special
Syntax SYSPROCESSID()
Details The SYSPROCESSID function returns the 32-character hexadecimal ID of the current process. This ID can be passed to the SYSPROCESSNAME function to obtain the name of the current process.
Examples
Example 1: Using a DATA Step The following DATA step writes the current process id to the SAS log:

data _null_;
   id=sysprocessid();
   put id;
run;

Example 2: Using SAS Macro Language The following SAS Macro Language code writes the current process id to the SAS log:

%let id=%sysfunc(sysprocessid());
%put &id;

See Also Function: “SYSPROCESSNAME Function” on page 1116
SYSPROCESSNAME Function Returns the process name that is associated with a given process ID, or returns the name of the current process. Category: Special
Syntax SYSPROCESSNAME(<process_id>)
Arguments process_id
specifies a 32-character hexadecimal process id.
Details The SYSPROCESSNAME function returns the process name associated with the process id you supply as an argument. You can use the value returned from the SYSPROCESSID function as the argument to SYSPROCESSNAME. If you omit the argument, then SYSPROCESSNAME returns the name of the current process. You can also use the values stored in the automatic macro variables SYSPROCESSID and SYSSTARTID as arguments to SYSPROCESSNAME.
Examples
Example 1: Using SYSPROCESSNAME Without an Argument in a DATA Step The following DATA step writes the current process name to the SAS log:

data _null_;
   name=sysprocessname();
   put name;
run;

Example 2: Using SYSPROCESSNAME With an Argument in SAS Macro Language The following SAS Macro Language code writes the process name associated with the given process id to the SAS log:

%let id=&sysprocessid;
%let name=%sysfunc(sysprocessname(&id));
%put &name;

See Also Function: “SYSPROCESSID Function” on page 1115
SYSPROD Function Determines whether a product is licensed. Category: Special
Syntax SYSPROD(product-name)
Arguments
product-name
specifies a character constant, variable, or expression with a value that is the name of a SAS product.
Requirement: Product-name must be the correct official name of the product or solution.
Details The SYSPROD function returns 1 if a specific SAS software product is licensed, 0 if it is a SAS software product but not licensed for your system, and -1 if the product name is not recognized. Use SYSPROD in the DATA step, in an IML step, or in an SCL program. If SYSPROD indicates that a product is licensed, it means that the final license expiration date has not passed. To determine the final expiration date for the product, execute the following program:

proc setinit noalias;
run;

It is possible for a SAS software product to exist on your system even though the product is no longer licensed. In this case, SAS cannot access this product. Similarly, it is possible for a product to be licensed, but not installed. You can enter the product name in uppercase, in lowercase, or in mixed case. You can prefix the product with 'SAS/'. You can prefix SAS/ACCESS product names with 'ACC-'. To view a list of products that are available on your system, execute the following program:

proc setinit noalias;
run;

Examples These examples determine whether a specified product is licensed.
• x=sysprod('graph');
  If SAS/GRAPH software is currently licensed, then SYSPROD returns a value of 1. If SAS/GRAPH software is not currently licensed, then SYSPROD returns a value of 0.
• x=sysprod('abc');
  SYSPROD returns a value of -1 because ABC is not a valid product name.
• x=sysprod('base'); or x=sysprod('base sas');
  SYSPROD always returns a value of 1 because the Base product must be licensed for the SYSPROD function to run successfully.
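Because SYSPROD has three possible return values, a program that branches on the result usually needs to handle all of them. The following is a minimal sketch of one way to do that; the messages and the variable name are illustrative.

```sas
data _null_;
   rc = sysprod('graph');
   select (rc);
      when (1)  put 'SAS/GRAPH is licensed.';
      when (0)  put 'SAS/GRAPH is a known product but is not licensed.';
      when (-1) put 'The product name was not recognized.';
      otherwise put 'Unexpected return value: ' rc=;
   end;
run;
```

A return value of 1 indicates only that the license has not expired. As noted in Details, the product might still not be installed.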
SYSRC Function Returns a system error number. Category: SAS File I/O Category: External Files
Syntax SYSRC()
Details SYSRC returns the error number for the last system error encountered by a call to one of the data set functions or external file functions.
Examples This example determines the error message if FILEREF does not exist:

%if %sysfunc(fileref(myfile)) ne 0 %then
   %put %sysfunc(sysrc()) - %sysfunc(sysmsg());

See Also Functions: “FILEREF Function” on page 669 “SYSMSG Function” on page 1114
SYSTEM Function Issues an operating environment command during a SAS session, and returns the system return code. Category: Special See: SYSTEM Function in the documentation for your operating environment.
Syntax SYSTEM(command)
Arguments command
specifies any of the following: a system command that is enclosed in quotation marks (explicit character string), an expression whose value is a system command, or the name of a character variable whose value is a system command that is executed.
Operating Environment Information: See the SAS documentation for your operating environment for information about what you can specify. The system return code is dependent on your operating environment.
Restriction: The length of the command cannot be greater than 1024 characters, including trailing blanks.
Comparisons The SYSTEM function is similar to the X statement, the X command, and the CALL SYSTEM routine. In most cases, the X statement, X command, or %SYSEXEC macro statement are preferable because they require less overhead. However, the SYSTEM function can be executed conditionally, and accepts expressions as arguments. The X statement is a global statement and executes as a DATA step is being compiled, regardless of whether SAS encounters a conditional statement.
Examples
Execute the host command TIMEDATA if the macro variable SYSDAY is Friday.
   data _null_;
      if "&sysday"="Friday" then do;
         rc=system("timedata");
      end;
      else rc=system("errorck");
   run;
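Because SYSTEM accepts an expression, the command can also be built at run time. The following is a sketch under stated assumptions: the directory name and the mkdir command are illustrative only, and the session is assumed to permit operating environment commands.

```sas
data _null_;
   length cmd $ 200;
   dir = 'reports';            /* hypothetical directory name             */
   cmd = 'mkdir ' || dir;      /* expression whose value is the command   */
   rc  = system(cmd);          /* return code is host dependent           */
   put cmd= rc=;
run;
```

Checking rc after the call is the main reason to prefer SYSTEM over the X statement in a case like this.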
See Also CALL Routine: “CALL SYSTEM Routine” on page 521 Statement: “X Statement” on page 1755
TAN Function
Returns the tangent.
Category: Trigonometric
Syntax TAN(argument)
Arguments
argument
  specifies a numeric constant, variable, or expression and is expressed in radians. If the magnitude of argument is so great that mod(argument,pi) is accurate to less than about three decimal places, TAN returns a missing value.
  Restriction: cannot be an odd multiple of π/2
Examples
SAS Statements          Results
x=tan(0.5);             0.5463024898
x=tan(0);               0
x=tan(3.14159/3);       1.7320472695
TANH Function
Returns the hyperbolic tangent.
Category: Hyperbolic
Syntax TANH(argument)
Arguments argument
specifies a numeric constant, variable, or expression.
Details
The TANH function returns the hyperbolic tangent of the argument, which is given by

\[ \tanh(\mathit{argument}) = \frac{e^{\mathit{argument}} - e^{-\mathit{argument}}}{e^{\mathit{argument}} + e^{-\mathit{argument}}} \]
Examples
SAS Statements          Results
x=tanh(0);              0
x=tanh(0.5);            0.4621171573
x=tanh(-0.5);           -0.462117157
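The definition of the hyperbolic tangent as a ratio of exponentials can be checked directly against TANH. This is a minimal sketch; the variable names builtin and explicit are illustrative.

```sas
data _null_;
   do argument = -0.5, 0, 0.5;
      builtin  = tanh(argument);
      /* Ratio of exponentials that defines the hyperbolic tangent. */
      explicit = (exp(argument) - exp(-argument))
               / (exp(argument) + exp(-argument));
      put argument= builtin= 12.10 explicit= 12.10;
   end;
run;
```

For each argument, the two computed values should agree to the displayed precision.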
TIME Function Returns the current time of day as a numeric SAS time value. Category: Date and Time
Syntax TIME()
Examples SAS assigns CURRENT a SAS time value corresponding to 14:32:00 if the following statements are executed exactly at 2:32 PM: current=time(); put current=time.;
TIMEPART Function
Extracts a time value from a SAS datetime value.
Category: Date and Time
Syntax TIMEPART(datetime)
Arguments datetime
is a numeric constant, variable, or expression that represents a SAS datetime value.
Examples SAS assigns TIME a SAS value that corresponds to 10:40:17 if the following statements are executed exactly at 10:40:17 a.m. on any date: datim=datetime(); time=timepart(datim);
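For a result that does not depend on when the program runs, a datetime constant can be used in place of DATETIME(). The constant below is chosen only for illustration, and DATEPART is shown alongside for contrast.

```sas
data _null_;
   datim = '15MAR2009:10:40:17'dt;   /* illustrative datetime constant */
   time  = timepart(datim);          /* seconds since midnight         */
   date  = datepart(datim);          /* days since January 1, 1960     */
   put time= time8. date= date9.;
run;
```

This writes time=10:40:17 date=15MAR2009 to the SAS log.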
TINV Function
Returns a quantile from the t distribution.
Category: Quantile
Syntax TINV(p,df)
Arguments p
is a numeric probability. Range:
0 0 prob
is a probability.
Range: 0 < prob < 1
Details The TNONCT function returns the nonnegative noncentrality parameter from a noncentral t distribution whose parameters are x, df, and nc. A Newton-type algorithm is used to find a root nc of the equation
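\[ P_t(x \mid df, nc) - prob = 0 \]

where

\[ P_t(x \mid df, nc) = \frac{1}{\Gamma\left(\frac{df}{2}\right)} \int_0^{\infty} v^{\frac{df}{2}-1} e^{-v} \int_{-\infty}^{x\sqrt{\frac{2v}{df}}} e^{-\frac{(u-nc)^2}{2}} \, du \, dv \]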
If the algorithm fails to converge to a fixed point, a missing value is returned.
Examples data work; x=2; df=4; do nc=1 to 3 by .5; prob=probt(x,df,nc); ncc=tnonct(x,df,prob); output; end; run; proc print; run;
Output 4.93 Computations of the Noncentrality Parameter from the t Distribution

OBS    x    df    nc     prob       ncc
 1     2    4     1.0    0.76457    1.0
 2     2    4     1.5    0.61893    1.5
 3     2    4     2.0    0.45567    2.0
 4     2    4     2.5    0.30115    2.5
 5     2    4     3.0    0.17702    3.0
TODAY Function
Returns the current date as a numeric SAS date value.
Category: Date and Time
Alias: DATE
Syntax TODAY()
Details The TODAY function produces the current date in the form of a SAS date value, which is the number of days since January 1, 1960.
Examples
These statements illustrate a practical use of the TODAY function:
   data _null_;
      tday=today();
      if (tday-datedue)> 15 then
         do;
            put 'As of ' tday date9. ' Account #' account 'is more than 15 days overdue.';
         end;
   run;
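The statement that a SAS date value counts days since January 1, 1960 can be seen with a short sketch. The variable names origin and days_since_origin are illustrative.

```sas
data _null_;
   tday   = today();
   origin = '01JAN1960'd;              /* date constant for the SAS date origin */
   put origin=;                        /* writes origin=0                       */
   days_since_origin = tday - origin;  /* same number as tday itself            */
   put tday= days_since_origin=;
   put tday= date9.;                   /* the same value, formatted as a date   */
run;
```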
TRANSLATE Function
Replaces specific characters in a character string.
Category: Character
Restriction: “I18N Level 0” on page 305
Tip: DBCS equivalent function is KTRANSLATE in SAS National Language Support (NLS): Reference Guide.
See: TRANSLATE Function in the documentation for your operating environment.
Syntax TRANSLATE(source,to-1,from-1)
Arguments
source
  specifies a character constant, variable, or expression that contains the original character string.
to
  specifies the characters that you want TRANSLATE to use as substitutes.
from
  specifies the characters that you want TRANSLATE to replace.
  Interaction: Values of to and from correspond on a character-by-character basis; TRANSLATE changes the first character of from to the first character of to, and so on. If to has fewer characters than from, TRANSLATE changes the extra from characters to blanks. If to has more characters than from, TRANSLATE ignores the extra to characters.
  Operating Environment Information: You must have pairs of to and from arguments on some operating environments. On other operating environments, a segment of the collating sequence replaces null from arguments. See the SAS documentation for your operating environment for more information.
Details In a DATA step, if the TRANSLATE function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the first argument. The maximum number of pairs of to and from arguments that TRANSLATE accepts depends on the operating environment you use to run SAS. There is no functional difference between using several pairs of short arguments, or fewer pairs of longer arguments.
Comparisons The TRANWRD function differs from TRANSLATE in that it scans for words (or patterns of characters) and replaces those words with a second word (or pattern of characters).
Examples
SAS Statements                              Results
x=translate('XYZW','AB','VW'); put x;       XYZB
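The interaction between to and from strings of unequal length can be sketched as follows. The strings are illustrative, and the asterisks only make blanks visible in the log.

```sas
data _null_;
   /* to ('X') is shorter than from ('ab'): a becomes X, b becomes a blank. */
   short_to = '*' || translate('abcabc', 'X', 'ab') || '*';
   /* to ('XYZ') is longer than from ('ab'): the extra Z is ignored.        */
   long_to  = '*' || translate('abcabc', 'XYZ', 'ab') || '*';
   put short_to= / long_to=;
run;
```

Under the rules described in the Interaction note, short_to should be *X cX c* and long_to should be *XYcXYc*.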
See Also Function: “TRANWRD Function” on page 1129
TRANSTRN Function
Replaces or removes all occurrences of a substring in a character string.
Category: Character
Syntax TRANSTRN(source,target,replacement)
Arguments
source
  specifies a character constant, variable, or expression that you want to translate.
target
  specifies a character constant, variable, or expression that is searched for in source.
  Requirement: The length for target must be greater than zero.
replacement
  specifies a character constant, variable, or expression that replaces target.
Details
Length of Returned Variable
In a DATA step, if the TRANSTRN function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. You can use the LENGTH statement, before calling TRANSTRN, to change the length of the value.
The Basics
The TRANSTRN function replaces or removes all occurrences of a given substring within a character string. The TRANSTRN function does not remove trailing blanks in the target string and the replacement string. To remove all occurrences of target, specify replacement as TRIMN("").
Comparisons The TRANWRD function differs from the TRANSTRN function because TRANSTRN allows the replacement string to have a length of zero. TRANWRD uses a single blank instead when the replacement string has a length of zero. The TRANSLATE function converts every occurrence of a user-supplied character to another character. TRANSLATE can scan for more than one character in a single call. In doing this scan, however, TRANSLATE searches for every occurrence of any of the individual characters within a string. That is, if any letter (or character) in the target string is found in the source string, it is replaced with the corresponding letter (or character) in the replacement string. The TRANSTRN function differs from TRANSLATE in that TRANSTRN scans for substrings and replaces those substrings with a second substring.
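The difference between substring replacement and character-by-character replacement can be sketched with one source string. The string 'banana' and the variable names are illustrative. Note that TRANSLATE takes its arguments in the order source, to, from.

```sas
data _null_;
   length by_substring by_character $ 6;
   by_substring = transtrn('banana', 'an', 'xy');    /* replaces each substring an  */
   by_character = translate('banana', 'xy', 'an');   /* maps a to x and n to y      */
   put by_substring= by_character=;
run;
```

TRANSTRN leaves the final a unchanged because it is not part of an occurrence of the substring, giving bxyxya; TRANSLATE changes every a and every n, giving bxyxyx.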
Examples
Example 1: Replacing All Occurrences of a Word
These statements and these values produce these results:
   name=transtrn(name, "Mrs.", "Ms.");
   name=transtrn(name, "Miss", "Ms.");
   put name;

Values                  Results
Mrs. Joan Smith         Ms. Joan Smith
Miss Alice Cooper       Ms. Alice Cooper
Example 2: Removing Blanks from the Search String
In this example, the TRANSTRN function does not replace the source string because the target string contains blanks.
   data list;
      input salelist $;
      length target $10 replacement $3;
      target='FISH';
      replacement='NIP';
      salelist=transtrn(salelist,target,replacement);
      put salelist;
      datalines;
   CATFISH
   ;
The LENGTH statement pads target with blanks to the length of 10, which causes the TRANSTRN function to search for the character string 'FISH      ' (FISH followed by six blanks) in SALELIST. Because the search fails, this line is written to the SAS log:
   CATFISH
You can use the TRIM function to exclude trailing blanks from a target or replacement variable. Use the TRIM function with target: salelist=transtrn(salelist,trim(target),replacement); put salelist;
Now, this line is written to the SAS log: CATNIP
Example 3: Zero Length in the Third Argument of the TRANSTRN Function
The following example shows the results of the TRANSTRN function when the third argument, replacement, has a length of zero. In the DATA step, a character constant that consists of two quotation marks represents a single blank, and not a zero-length string. In the following example, the results for string1 are different from the results for string2.
   data _null_;
      string1='*' || transtrn('abcxabc', 'abc', trimn('')) || '*';
      put string1=;
      string2='*' || transtrn('abcxabc', 'abc', '') || '*';
      put string2=;
   run;
SAS writes the following output to the log:
Output 4.94 Output When the Third Argument of TRANSTRN Has a Length of Zero
   string1=*x*
   string2=* x *
See Also Function: “TRANSLATE Function” on page 1125
TRANWRD Function
Replaces all occurrences of a substring in a character string.
Category: Character
Restriction: “I18N Level 2” on page 306
Syntax TRANWRD(source,target,replacement)
Arguments
source
  specifies a character constant, variable, or expression that you want to translate.
target
  specifies a character constant, variable, or expression that is searched for in source.
  Requirement: The length for target must be greater than zero.
replacement
  specifies a character constant, variable, or expression that replaces target. When the replacement string has a length of zero, TRANWRD uses a single blank instead.
Details
Length of Returned Variable
In a DATA step, if the TRANWRD function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. You can use the LENGTH statement, before calling TRANWRD, to change the length of the value.
The Basics
The TRANWRD function replaces all occurrences of a given substring within a character string. The TRANWRD function does not remove trailing blanks in the target string and the replacement string.
Comparisons The TRANWRD function differs from the TRANSTRN function because TRANSTRN allows the replacement string to have a length of zero. TRANWRD uses a single blank instead when the replacement string has a length of zero. The TRANSLATE function converts every occurrence of a user-supplied character to another character. TRANSLATE can scan for more than one character in a single call. In doing this scan, however, TRANSLATE searches for every occurrence of any of the individual characters within a string. That is, if any letter (or character) in the target string is found in the source string, it is replaced with the corresponding letter (or character) in the replacement string. The TRANWRD function differs from TRANSLATE in that TRANWRD scans for substrings and replaces those substrings with a second substring.
Examples
Example 1: Replacing All Occurrences of a Word
These statements and these values produce these results:
   name=tranwrd(name, "Mrs.", "Ms.");
   name=tranwrd(name, "Miss", "Ms.");
   put name;

Values                  Results
Mrs. Joan Smith         Ms. Joan Smith
Miss Alice Cooper       Ms. Alice Cooper
Example 2: Removing Blanks From the Search String
In this example, the TRANWRD function does not replace the source string because the target string contains blanks.
   data list;
      input salelist $;
      length target $10 replacement $3;
      target='FISH';
      replacement='NIP';
      salelist=tranwrd(salelist,target,replacement);
      put salelist;
      datalines;
   CATFISH
   ;
The LENGTH statement pads target with blanks to the length of 10, which causes the TRANWRD function to search for the character string 'FISH      ' (FISH followed by six blanks) in SALELIST. Because the search fails, this line is written to the SAS log:
   CATFISH
You can use the TRIM function to exclude trailing blanks from a target or replacement variable. Use the TRIM function with target: salelist=tranwrd(salelist,trim(target),replacement); put salelist;
Now, this line is written to the SAS log: CATNIP
Example 3: Zero Length in the Third Argument of the TRANWRD Function
The following example shows the results of the TRANWRD function when the third argument, replacement, has a length of zero. In this case, TRANWRD uses a single blank. In the DATA step, a character constant that consists of two consecutive quotation marks represents a single blank, and not a zero-length string. In this example, the results for string1 and string2 are the same:
   data _null_;
      string1='*' || tranwrd('abcxabc', 'abc', trimn('')) || '*';
      put string1=;
      string2='*' || tranwrd('abcxabc', 'abc', '') || '*';
      put string2=;
   run;
SAS writes the following output to the log:
Output 4.95 Output When the Third Argument of TRANWRD Has a Length of Zero
   string1=* x *
   string2=* x *
Removing Repeated Commas
You can use the TRANWRD function to remove repeated commas in text, and replace the repeated commas with a single comma. In the following example, the TRANWRD function is used twice: to replace three commas with one comma, and to replace the ending two commas with a period:
   data _null_;
      mytxt='If you exercise your power to vote,,,then your opinion will be heard,,';
      newtext=tranwrd(mytxt, ',,,', ',');
      newtext2=tranwrd(newtext, ',,' , '.');
      put // mytxt= / newtext= / newtext2=;
   run;
SAS writes the following output to the log:
Output 4.96 Output from Removing Repeated Commas
   mytxt=If you exercise your power to vote,,,then your opinion will be heard,,
   newtext=If you exercise your power to vote,then your opinion will be heard,,
   newtext2=If you exercise your power to vote,then your opinion will be heard.
See Also Function: “TRANSLATE Function” on page 1125
TRIGAMMA Function
Returns the value of the trigamma function.
Category: Mathematical
Syntax TRIGAMMA(argument)
Arguments argument
specifies a numeric constant, variable, or expression. Restriction: Nonpositive integers are invalid.
Details The TRIGAMMA function returns the derivative of the DIGAMMA function. For argument > 0, the TRIGAMMA function is the second derivative of the LGAMMA function.
Examples
SAS Statements          Results
x=trigamma(3);          0.3949340668
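The result above can be cross-checked against a known closed form. For a positive integer n, the trigamma function equals pi**2/6 minus the sum of 1/k**2 for k from 1 to n-1; this identity is standard mathematics and is not stated in this dictionary entry. The variable name check is illustrative.

```sas
data _null_;
   pi    = constant('pi');
   x     = trigamma(3);
   /* Closed form for n=3: pi**2/6 minus (1/1**2 + 1/2**2). */
   check = pi**2/6 - 1 - 1/4;
   put x= 12.10 check= 12.10;
run;
```

Both values should be written as 0.3949340668.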
TRIM Function
Removes trailing blanks from a character string, and returns one blank if the string is missing.
Category: Character
Restriction: “I18N Level 0” on page 305
Tip: DBCS equivalent function is KTRIM in SAS National Language Support (NLS): Reference Guide.
Syntax TRIM(argument)
Arguments argument
specifies a character constant, variable, or expression.
Details
Length of Returned Variable
In a DATA step, if the TRIM function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the argument.
The Basics
TRIM copies a character argument, removes trailing blanks, and returns the trimmed argument as a result. If the argument is blank, TRIM returns one blank. TRIM is useful for concatenating because concatenation does not remove trailing blanks. Assigning the results of TRIM to a variable does not affect the length of the receiving variable. If the trimmed value is shorter than the length of the receiving variable, SAS pads the value with new blanks as it assigns it to the variable.
Comparisons The TRIM and TRIMN functions are similar. TRIM returns one blank for a blank string. TRIMN returns a string with a length of zero for a blank string.
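The difference between TRIM and TRIMN for a blank string can be sketched by concatenating each result between two asterisks. The variable names are illustrative.

```sas
data _null_;
   blank      = ' ';
   with_trim  = '*' || trim(blank)  || '*';   /* TRIM returns one blank           */
   with_trimn = '*' || trimn(blank) || '*';   /* TRIMN returns a zero-length string */
   put with_trim= / with_trimn=;
run;
```

This writes with_trim=* * and with_trimn=** to the SAS log.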
Examples
Example 1: Removing Trailing Blanks
These statements and this data line produce these results:
   data test;
      input part1 $ 1-10 part2 $ 11-20;
      hasblank=part1||part2;
      noblank=trim(part1)||part2;
      put hasblank;
      put noblank;
      datalines;

Data Line                 Results
----+----1----+----2
apple     sauce           apple     sauce
                          applesauce
Example 2: Concatenating a Blank Character Expression SAS Statements
Results
x="A"||trim(" ")||"B"; put x; x="
"; y=">"||trim(x)||""||trimn(x)||"
Statements
4
DATA Statement
1417
;
(2) DATA _NULL_ > ;
(3) DATA view-name / VIEW=view-name )> ;
(4) DATA data-set-name / PGM=program-name )> ;
(5) DATA VIEW=view-name ; DESCRIBE;
(6) DATA PGM=program-name ;
Without Arguments If you omit the arguments, the DATA step automatically names each successive data set that you create as DATAn, where n is the smallest integer that makes the name unique.
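The automatic naming can be sketched with two DATA statements that have no arguments. This assumes a session in which no DATAn data sets exist yet; the variable x is illustrative.

```sas
data;          /* no data set name: SAS chooses the next unused DATAn name */
   x = 1;
run;

data;          /* a second unnamed step receives the next name in sequence */
   x = 2;
run;
```

Under that assumption the steps create DATA1 and DATA2, and the note that SAS writes to the log after each step shows the name that was chosen.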
Arguments
data-set-name
  names the SAS data file or DATA step view that the DATA step creates. To create a DATA step view, you must specify at least one data-set-name and that data-set-name must match view-name.
  Restriction: data-set-name must conform to the rules for SAS names, and additional restrictions might be imposed by your operating environment.
  Tip: You can execute a DATA step without creating a SAS data set. See Example 5 on page 1423 for an example. For more information, see “When Not Creating a Data Set” on page 1420.
  See also: For details about the types of SAS data set names and when to use each type, see “Names in the SAS Language” in SAS Language Reference: Concepts.
(data-set-options)
  specifies optional arguments that the DATA step applies when it writes observations to the output data set.
  See also: “Definition of Data Set Options” on page 10 for more information and Chapter 2, “SAS Data Set Options,” on page 9 for a list of data set options.
  Featured in: Example 1 on page 1421
/ DEBUG
  enables you to debug your program interactively by helping to identify logic errors, and sometimes data errors.
/ NESTING
  specifies that a note will be printed to the SAS log for the beginning and end of each DO-END and SELECT-END nesting level. This option enables you to debug mismatched DO-END and SELECT-END statements and is particularly useful in large programs where the nesting level is not obvious.
/ STACK=stack-size
  specifies the maximum number of nested LINK statements.
_NULL_
  specifies that SAS does not create a data set when it executes the DATA step.
VIEW=view-name
  names a view that the DATA step uses to store the input DATA step view.
  Restriction: view-name must match one of the data set names.
  Restriction: SAS creates only one view in a DATA step.
  Tip: If you specify additional data sets in the DATA statement, SAS creates these data sets when the view is processed in a subsequent DATA or PROC step. Views have the capability of generating other data sets at the time the view is executed.
  Tip: SAS macro variables resolve when the view is created. Use the SYMGET function to delay macro variable resolution until the view is processed.
  Featured in: Example 2 on page 1422 and Example 3 on page 1422
password-option
  assigns a password to a stored compiled DATA step program or a DATA step view. The following password options are available:
  ALTER=alter-password
    assigns an alter password to a SAS data file. The password allows you to protect or replace a stored compiled DATA step program or a DATA step view.
    Requirement: If you use an ALTER password in creating a stored compiled DATA step program or a DATA step view, an ALTER password is required to replace the program or view.
    Requirement: If you use an ALTER password in creating a stored compiled DATA step program or a DATA step view, an ALTER password is required to execute a DESCRIBE statement.
    Alias: PROTECT=
  READ=read-password
    assigns a read password to a SAS data file. The password allows you to read or execute a stored compiled DATA step program or a DATA step view.
    Requirement: If you use a READ password in creating a stored compiled DATA step program or a DATA step view, a READ password is required to execute the program or view.
    Requirement: If you use a READ password in creating a stored compiled DATA step program or a DATA step view, a READ password is required to execute DESCRIBE and EXECUTE statements. If you use an invalid password, SAS will execute the DESCRIBE statement.
    Tip: If you use a READ password in creating a stored compiled DATA step program or a DATA step view, no password is required to replace the program or view.
    Alias: EXECUTE=
  PW=password
    assigns a READ and ALTER password, both having the same value.
SOURCE=source-option
  specifies one of the following source options:
  SAVE
    saves the source code that created a stored compiled DATA step program or a DATA step view.
  ENCRYPT
    encrypts and saves the source code that created a stored compiled DATA step program or a DATA step view.
    Tip: If you encrypt source code, use the ALTER password option as well. SAS issues a warning message if you do not use ALTER.
  NOSAVE
    does not save the source code.
    CAUTION: If you use the NOSAVE option for a DATA step view, the view cannot be migrated or copied from one version of SAS to another version.
  Default: SAVE
PGM=program-name
  names the stored compiled program that SAS creates or executes in the DATA step. To create a stored compiled program, specify a slash (/) before the PGM= option. To execute a stored compiled program, specify the PGM= option without a slash (/).
  Tip: SAS macro variables resolve when the stored program is created. Use the SYMGET function to delay macro variable resolution until the view is processed.
  Featured in: Example 4 on page 1422
NOLIST
  suppresses the output of all variables to the SAS log when the value of _ERROR_ is 1.
  Restriction: NOLIST must be the last option in the DATA statement.
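The SYMGET tip for VIEW= can be sketched as follows. The data set ORDERS, the variable AMOUNT, the view name BIG_ORDERS, and the macro variable CUTOFF are all hypothetical names used only for illustration.

```sas
%let cutoff=100;

data big_orders / view=big_orders;
   set orders;    /* ORDERS and AMOUNT are hypothetical names */
   /* A direct macro variable reference here would be resolved once,  */
   /* when the view is created. SYMGET reads the macro variable each  */
   /* time the view is processed.                                     */
   if amount > input(symget('cutoff'), 12.);
run;

%let cutoff=500;

proc print data=big_orders;   /* the view is processed here, using 500 */
run;
```

SYMGET returns a character value, so the sketch converts it with the INPUT function before the numeric comparison.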
Details
Using the DATA Statement
The DATA step begins with the DATA statement. You use the DATA statement to create the following types of output: SAS data sets, data views, and stored programs. You can specify more than one output in a DATA statement. However, only one of the outputs can be a data view. You create a view by specifying the (3) VIEW= option and a stored program by specifying the (4) PGM= option.
Using Both a READ and an ALTER Password
If you use both a READ and an ALTER password in creating a stored compiled DATA step program or a DATA step view, the following items apply:
- A READ or ALTER password is required to execute the stored compiled DATA step program or DATA step view.
- A READ or ALTER password is required if the stored compiled DATA step program or DATA step view contains both DESCRIBE and EXECUTE statements.
  - If you use an ALTER password with the DESCRIBE and EXECUTE statements, the following items apply:
    - SAS executes both the DESCRIBE and the EXECUTE statements.
    - If you execute a stored compiled DATA step program or DATA step view with an invalid ALTER password:
      - The DESCRIBE statement does not execute.
      - In batch mode, the EXECUTE statement has no effect.
      - In interactive mode, SAS prompts you for a READ password. If the READ password is valid, SAS processes the EXECUTE statement. If it is invalid, SAS does not process the EXECUTE statement.
  - If you use a READ password with the DESCRIBE and EXECUTE statements, the following items apply:
    - In interactive mode, SAS prompts you for the ALTER password:
      - If you enter a valid ALTER password, SAS executes both the DESCRIBE and the EXECUTE statements.
      - If you enter an invalid ALTER password, SAS processes the EXECUTE statement but not the DESCRIBE statement.
    - In batch mode, SAS processes the EXECUTE statement but not the DESCRIBE statement.
    - In both interactive and batch modes, if you specify an invalid READ password, SAS does not process the EXECUTE statement.
- An ALTER password is required if the stored compiled DATA step program or DATA step view contains a DESCRIBE statement.
- An ALTER password is required to replace the stored compiled DATA step program or DATA step view.
(1) Creating an Output Data Set
Use the DATA statement to create one or more output data sets. You can use data set options to customize the output data set. The following DATA step creates two output data sets, example1 and example2. It uses the data set option DROP to prevent the variable IDnumber from being written to the example2 data set.
   data example1 example2 (drop=IDnumber);
      set sample;
      . . .more SAS statements. . .
   run;
(2) When Not Creating a Data Set
Usually, the DATA statement specifies at least one data set name that SAS uses to create an output data set. However, when the purpose of a DATA step is to write a report or to write data to an external file, you might not want to create an output data set. Using the keyword _NULL_ as the data set name causes SAS to execute the DATA step without writing observations to a data set. This example writes to the SAS log the value of Name for each observation. SAS does not create an output data set.
   data _NULL_;
      set sample;
      put Name ID;
   run;
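The external file case mentioned above can be sketched by adding a FILE statement, which redirects PUT output away from the SAS log. The file name is hypothetical.

```sas
data _null_;
   set sample;
   file 'name-list.txt';   /* hypothetical external file */
   put Name ID;            /* written to the file, not to the log */
run;
```

As in the log example, SAS executes the step for each observation in SAMPLE without creating an output data set.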
(3) Creating a DATA Step View
You can create DATA step views and execute them at a later time. The following DATA step example creates a DATA step view. It uses the SOURCE=ENCRYPT option to both save and encrypt the source code.
   data phone_list / view=phone_list (source=encrypt);
      set customer_list;
      . . .more SAS statements. . .
   run;
For more information about DATA step views, see “SAS Data Views” in SAS Language Reference: Concepts.
(4) Creating a Stored Compiled DATA Step Program
The ability to compile and store DATA step programs allows you to execute the stored programs later. Stored compiled DATA step programs can reduce processing costs by eliminating the need to compile DATA step programs repeatedly. The following DATA step example compiles and stores a DATA step program. It uses the ALTER password option, which allows the user to replace an existing stored program, and to protect the stored compiled program from being replaced.
   data testfile / pgm=stored.test_program (alter=sales);
      set sales_data;
      . . .more SAS statements. . .
   run;
For more information about stored compiled DATA step programs, see “Stored Compiled DATA Step Programs” in SAS Language Reference: Concepts.
(5) Describing a DATA Step View
The following example uses the DESCRIBE statement in a DATA step view to write a copy of the source code to the SAS log.
   data view=inventory;
      describe;
   run;
For information about the DESCRIBE statement, see the “DESCRIBE Statement” on page 1437.
(6) Executing a Stored Compiled DATA Step Program
The following example executes a stored compiled DATA step program. It uses the DESCRIBE statement to write a copy of the source code to the SAS log.
   libname stored 'SAS library';
   data pgm=stored.employee_list;
      describe;
      execute;
   run;
For information about the DESCRIBE statement, see the “DESCRIBE Statement” on page 1437. For information about the EXECUTE statement, see the “EXECUTE Statement” on page 1453.
Examples
Example 1: Creating Multiple Data Files and Using Data Set Options
This DATA statement creates more than one data set, and it changes the contents of the output data sets:
   data error (keep=subject date weight)
        fitness(label='Exercise Study' rename=(weight=pounds));
The ERROR data set contains three variables. SAS assigns a label to the FITNESS data set and renames the variable weight to pounds.
Example 2: Creating Input DATA Step Views
This DATA step creates an input DATA step view instead of a SAS data file:
   libname ourlib 'SAS-library';
   data ourlib.test / view=ourlib.test;
      set ourlib.fittest;
      tot=sum(of score1-score10);
   run;
Example 3: Creating a View and a Data File
This DATA step creates an input DATA step view named THEIRLIB.TEST and an additional temporary SAS data set named SCORETOT:
   libname ourlib 'SAS-library-1';
   libname theirlib 'SAS-library-2';
   data theirlib.test scoretot / view=theirlib.test;
      set ourlib.fittest;
      tot=sum(of score1-score10);
   run;
SAS does not create the data file SCORETOT until a subsequent DATA or PROC step processes the view THEIRLIB.TEST.
Example 4: Storing and Executing a Compiled Program
The first DATA step produces a stored compiled program named STORED.SALESFIG:
   libname in 'SAS-library-1';
   libname stored 'SAS-library-2';
   data salesdata / pgm=stored.salesfig;
      set in.sales;
      qtr1tot=jan+feb+mar;
   run;
SAS creates the data set SALESDATA when it executes the stored compiled program STORED.SALESFIG.
   data pgm=stored.salesfig;
   run;
Example 5: Creating a Custom Report
The second DATA step in this program produces a custom report and uses the _NULL_ keyword to execute the DATA step without creating a SAS data set:
   data sales;
      input dept : $10. jan feb mar;
      datalines;
   shoes 4344 3555 2666
   housewares 3777 4888 7999
   appliances 53111 7122 41333
   ;
   data _null_;
      set sales;
      qtr1tot=jan+feb+mar;
      put 'Total Quarterly Sales: ' qtr1tot dollar12.;
   run;
Example 6: Using a Password with a Stored Compiled DATA Step Program
The first DATA step creates a stored compiled DATA step program called STORED.ITEMS. This program includes the ALTER password, which limits access to the program.
   libname stored 'SAS-library';
   data employees / pgm=stored.items (alter=klondike);
      set sample;
      if TotalItems > 200 then output;
   run;
This DATA step executes the stored compiled DATA step program STORED.ITEMS. It uses the DESCRIBE statement to print the source code to the SAS log. Because the program was created with the ALTER password, you must use the password if you use the DESCRIBE statement. If you do not enter the password, SAS will prompt you for it.
   data pgm=stored.items (alter=klondike);
      describe;
      execute;
   run;
Example 7: Displaying Nesting Levels
The following program has two nesting levels. SAS will generate four log messages, one begin and end message for each nesting level.
   data _null_ /nesting;
      do i = 1 to 10;
         do j = 1 to 5;
            put i= j=;
         end;
      end;
   run;

Output 6.4 Nesting Level Debug (partial SAS log)

6    data _null_ /nesting;
7       do i = 1 to 10;
        719
NOTE 719-185: *** DO begin level 1 ***.
8          do j = 1 to 5;
           719
NOTE 719-185: *** DO begin level 2 ***.
9             put i= j=;
10         end;
           --
           720
NOTE 720-185: *** DO end level 2 ***.
11      end;
        --
        720
NOTE 720-185: *** DO end level 1 ***.
12   run;
See Also Statements: “DESCRIBE Statement” on page 1437 “EXECUTE Statement” on page 1453 “LINK Statement” on page 1619 “Definition of Data Set Options” on page 10
DATALINES Statement
Specifies that data lines follow.
Valid: in a DATA step
Category: File-handling
Type: Declarative
Aliases: CARDS, LINES
Restriction: Data lines cannot contain semicolons. Use “DATALINES4 Statement” on page 1426 when your data contain semicolons.
Syntax DATALINES;
Without Arguments Use the DATALINES statement with an INPUT statement to read data that you enter directly in the program, rather than data stored in an external file.
Details
Using the DATALINES Statement
The DATALINES statement is the last statement in the DATA step and immediately precedes the first data line. Use a null statement (a single semicolon) to indicate the end of the input data. You can use only one DATALINES statement in a DATA step. Use separate DATA steps to enter multiple sets of data.
Reading Long Data Lines
SAS handles data line length with the CARDIMAGE system option. If you use CARDIMAGE, SAS processes data lines exactly like 80-byte punched card images padded with blanks. If you use NOCARDIMAGE, SAS processes data lines longer than 80 columns in their entirety. Refer to “CARDIMAGE System Option” on page 1803 for details.
Using Input Options with In-stream Data
The DATALINES statement does not provide input options for reading data. However, you can access some options by using the DATALINES statement in conjunction with an INFILE statement. Specify DATALINES in the INFILE statement to indicate the source of the data and then use the options you need. See Example 2 on page 1425.
Comparisons
• Use the DATALINES statement whenever data do not contain semicolons. If your data contain semicolons, use the DATALINES4 statement.
• The following SAS statements also read data or point to a location where data are stored:
   - The INFILE statement points to raw data lines stored in another file. The INPUT statement reads those data lines.
   - The %INCLUDE statement brings SAS program statements or data lines stored in SAS files or external files into the current program.
   - The SET, MERGE, MODIFY, and UPDATE statements read observations from existing SAS data sets.
Examples
Example 1: Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step:
   data person;
      input name $ dept $;
      datalines;
   John Sales
   Mary Acctng
   ;
Example 2: Reading In-stream Data with Options
This example takes advantage of options available with the INFILE statement to read in-stream data lines. With the DELIMITER= option, you can use list input to read data values that are delimited by commas instead of blanks.
   data person;
      infile datalines delimiter=',';
      input name $ dept $;
      datalines;
   John,Sales
   Mary,Acctng
   ;
See Also Statements: “DATALINES4 Statement” on page 1426 “INFILE Statement” on page 1541 System Option: “CARDIMAGE System Option” on page 1803
DATALINES4 Statement
Indicates that data lines that contain semicolons follow.
Valid: in a DATA step
Category: File-handling
Type: Declarative
Aliases: CARDS4, LINES4
Syntax DATALINES4;
Without Arguments
Use the DATALINES4 statement together with an INPUT statement to read data that you enter directly in the program when the data contain semicolons.
Details The DATALINES4 statement is the last statement in the DATA step and immediately precedes the first data line. Follow the data lines with four consecutive semicolons that are located in columns 1 through 4.
Comparisons Use the DATALINES4 statement when data contain semicolons. If your data do not contain semicolons, use the DATALINES statement.
Examples
In this example, SAS reads data lines that contain internal semicolons until it encounters a line of four semicolons. Execution continues with the rest of the program.
   data biblio;
      input number citation $50.;
      datalines4;
   1 KIRK, 1988
   2 LIN ET AL., 1995; BRADY, 1993
   3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
   ;;;;
See Also Statements: “DATALINES Statement” on page 1424
DECLARE Statement, Hash and Hash Iterator Objects
Declares a hash or hash iterator object; creates an instance of and initializes data for a hash or hash iterator object.
Valid: in a DATA step
Category: Action
Type: Executable
Alias: DCL
Syntax
(1) DECLARE object object-reference;
(2) DECLARE object object-reference<(<argument_tag-1: value-1<, ...argument_tag-n: value-n>>)>;
Arguments
object
   specifies the component object. It can be one of the following values:
   hash
      specifies a hash object. The hash object provides a mechanism for quick data storage and retrieval. The hash object stores and retrieves data based on lookup keys.
      See Also: “Using the Hash Object” in SAS Language Reference: Concepts
   hiter
      specifies a hash iterator object. The hash iterator object enables you to retrieve the hash object’s data in forward or reverse key order.
      See Also: “Using the Hash Iterator Object” in SAS Language Reference: Concepts
object-reference
   specifies the object reference name for the hash or hash iterator object.
argument_tag
   specifies the information that is used to create an instance of the hash object.
There are six valid hash object argument tags:
dataset: 'dataset_name'
Specifies the name of a SAS data set to load into the hash object. The name of the SAS data set can be a literal or character variable. The data set name must be enclosed in single or double quotation marks. Macro variables must be enclosed in double quotation marks. You can use SAS data set options when declaring a hash object in the DATASET argument tag. Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform the following operations:
• renaming variables
• selecting a subset of observations based on observation number for processing
• selecting observations using the WHERE option
• dropping or keeping variables from a data set loaded into a hash object, or for an output data set that is specified in an OUTPUT method call
• specifying a password for a data set.
The following syntax is used:
   dcl hash h (dataset: 'x (where = (i > 10))');
For a list of SAS data set options, see “Data Set Options by Category” on page 12.
Note: If the data set contains duplicate keys, the default is to keep the first instance in the hash object; subsequent instances are ignored. To store the last instance in the hash object, or to have an error message written to the SAS log when a duplicate key is found, use the DUPLICATE argument tag.
duplicate: 'option'
determines whether to ignore duplicate keys when loading a data set into the hash object. The default is to store the first key and ignore all subsequent duplicates. Option can be one of the following values:
'replace' | 'r'
   stores the last duplicate key record.
'error' | 'e'
   reports an error to the log if a duplicate key is found.
The following example, which uses the REPLACE option, stores brown for the key 620 and blue for the key 531. If you use the default, green would be stored for 620 and yellow would be stored for 531.
   data table;
      input key data $;
      datalines;
   531 yellow
   620 green
   531 blue
   908 orange
   620 brown
   143 purple
   ;
   run;

   data _null_;
      length key 8 data $ 8;
      if (_n_ = 1) then do;
         declare hash myhash(dataset: "table", duplicate: "r");
         rc = myhash.definekey('key');
         rc = myhash.definedata('data');
         myhash.definedone();
      end;
      rc = myhash.output(dataset:"otable");
   run;
hashexp: n
The hash object’s internal table size, where the size of the hash table is 2ⁿ. The value of HASHEXP is used as a power-of-two exponent to create the hash table size. For example, a value of 4 for HASHEXP equates to a hash table size of 2⁴, or 16. The maximum value for HASHEXP is 20. The hash table size is not equal to the number of items that can be stored. Imagine the hash table as an array of 'buckets.' A hash table size of 16 would have 16 'buckets.' Each bucket can hold an infinite number of items. The efficiency of the hash table lies in the ability of the hashing function to map items to and retrieve items from the buckets. You should specify the hash table size relative to the amount of data in the hash object in order to maximize the efficiency of the hash object lookup routines. Try different HASHEXP values until you get the best result. For example, if the hash object contains one million items, a hash table size of 16 (HASHEXP = 4) would work, but not very efficiently. A hash table size of 512 or 1024 (HASHEXP = 9 or 10) would result in the best performance.
Default: 8, which equates to a hash table size of 2⁸ or 256
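The relationship between HASHEXP and the hash table size can be worked through with a short DATA step. This is only an arithmetic sketch: the data set name HASH_SIZES is hypothetical, and the average-per-bucket figure assumes that items are spread evenly across buckets, which is an illustrative simplification and not a statement about how SAS distributes items internally.

```sas
data hash_sizes;
   items = 1000000;                       /* one million items, as in the text */
   do hashexp = 4, 9, 10;
      buckets = 2**hashexp;               /* hash table size is 2 to the power HASHEXP */
      avg_per_bucket = items / buckets;   /* assumes an even spread across buckets */
      put hashexp= buckets= avg_per_bucket=;
      output;
   end;
run;
```

The log shows 16, 512, and 1024 buckets for HASHEXP values of 4, 9, and 10.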
ordered: 'option'
Specifies whether or how the data is returned in key-value order if you use the hash object with a hash iterator object or if you use the hash object OUTPUT method. option can be one of the following values:
'ascending' | 'a'
   Data is returned in ascending key-value order. Specifying 'ascending' is the same as specifying 'yes'.
'descending' | 'd'
   Data is returned in descending key-value order.
'YES' | 'Y'
   Data is returned in ascending key-value order. Specifying 'yes' is the same as specifying 'ascending'.
'NO' | 'N'
   Data is returned in some undefined order.
Default: NO
The argument can also be enclosed in double quotation marks.
multidata: 'option'
specifies whether multiple data items are allowed for each key. option can be one of the following values:
'YES' | 'Y'
   Multiple data items are allowed for each key.
'NO' | 'N'
   Only one data item is allowed for each key.
Default: NO
See Also: “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
The argument value can also be enclosed in double quotation marks.
suminc: 'variable-name'
maintains a summary count of hash object keys. The SUMINC argument tag is given a DATA step variable, which holds the sum increment, that is, how much to add to the key summary for each reference to the key. The SUMINC value treats a missing value as zero, like the SUM function. For example, a key summary changes using the current value of the DATA step variable.
   dcl hash myhash(suminc: 'count');
See Also: “Maintaining Key Summaries” in SAS Language Reference: Concepts.
See Also: “Initializing Hash Object Data Using a Constructor” and “Declaring and Instantiating a Hash Iterator Object” in SAS Language Reference: Concepts.
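The SUMINC argument tag described above can be sketched with a small DATA step. This is a minimal illustration under stated assumptions: the data set name SUMINC_DEMO and the variables K, COUNT, and TOTAL are hypothetical, and the sketch relies on the hash object ADD, FIND, and SUM methods to create, reference, and retrieve the key summary.

```sas
data suminc_demo (keep=k total);
   length k count total 8;
   /* COUNT supplies the increment for each reference to a key */
   declare hash myhash(suminc: 'count');
   rc = myhash.defineKey('k');
   rc = myhash.defineDone();

   k = 99;
   count = 1;
   rc = myhash.add();      /* key summary starts at the current COUNT value */

   count = 2.5;
   rc = myhash.find();     /* referencing the key adds COUNT to its summary */

   /* Retrieve the key summary for the current key into TOTAL */
   rc = myhash.sum(sum: total);
   put total=;
run;
```

Under this reading, the summary for key 99 is the initial increment plus the increment that was in effect at the FIND call.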
Details
The Basics
To use a DATA step component object in your SAS program, you must declare and create (instantiate) the object. The DATA step component interface provides a mechanism for accessing predefined component objects from within the DATA step. For more information about the predefined DATA step component objects, see “Using DATA Step Component Objects” in SAS Language Reference: Concepts.
(1) Declaring a Hash or Hash Iterator Object
You use the DECLARE statement to declare a hash or hash iterator object.
   declare hash h;
The DECLARE statement tells SAS that the object reference H is a hash object. After you declare the new hash or hash iterator object, use the _NEW_ operator to instantiate the object. For example, in the following line of code, the _NEW_ operator creates the hash object and assigns it to the object reference H:
   h = _new_ hash( );
(2) Using the DECLARE Statement to Instantiate a Hash or Hash Iterator Object
As an alternative to the two-step process of using the DECLARE statement and the _NEW_ operator to declare and instantiate a hash or hash iterator object, you can use the DECLARE statement to declare and instantiate the hash or hash iterator object in one step. For example, in the following line of code, the DECLARE statement declares and instantiates a hash object and assigns it to the object reference H:
   declare hash h( );
The previous line of code is equivalent to using the following code:
   declare hash h;
   h = _new_ hash( );
A constructor is a method that you can use to instantiate a hash object and initialize the hash object data. For example, in the following line of code, the DECLARE statement declares and instantiates a hash object and assigns it to the object reference H. In addition, the hash table size is initialized to a value of 16 (2⁴) using the argument tag, HASHEXP.
   declare hash h(hashexp: 4);
Using SAS Data Set Options When Loading a Hash Object
SAS data set options can be used when declaring a hash object that uses the DATASET argument tag. Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform the following operations:
• renaming variables
• selecting a subset of observations based on observation number for processing
• selecting observations using the WHERE option
• dropping or keeping variables from a data set loaded into a hash object, or for an output data set that is specified in an OUTPUT method call
• specifying a password for a data set.
The following syntax is used:
   dcl hash h(dataset: 'x (where = (i > 10))');
For more examples of using data set options, see Example 4 on page 1432. For a list of data set options, see “Data Set Options by Category” on page 12.
Comparisons You can use the DECLARE statement and the _NEW_ operator, or the DECLARE statement alone to declare and instantiate an instance of a hash or hash iterator object.
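Both forms can appear in the same DATA step, and they apply to hash iterator objects as well as hash objects. The following is a minimal sketch, not a definitive pattern: the data set name ORDERED_PAIRS and the object references H and HI are hypothetical, and the sketch assumes the hash iterator FIRST and NEXT methods for traversal.

```sas
data ordered_pairs (keep=k d);
   length k $15 d $15;
   if _n_ = 1 then do;
      /* Hash object: declared and instantiated in one step */
      declare hash h(ordered: 'a');
      rc = h.defineKey('k');
      rc = h.defineData('k', 'd');
      rc = h.defineDone();
      call missing(k, d);

      /* Hash iterator: declared first, then instantiated with _NEW_ */
      declare hiter hi;
      hi = _new_ hiter('h');
   end;

   /* Load three key and data pairs from the DATA step variables */
   k = 'Labrador'; d = 'Retriever'; rc = h.add();
   k = 'Airedale'; d = 'Terrier';   rc = h.add();
   k = 'Standard'; d = 'Poodle';    rc = h.add();

   /* Walk the hash object in ascending key order */
   rc = hi.first();
   do while (rc = 0);
      put k= d=;
      output;
      rc = hi.next();
   end;
run;
```

Because the hash object is declared with ORDERED: 'a', the iterator returns the Airedale pair first, then Labrador, then Standard.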
Examples
Example 1: Declaring and Instantiating a Hash Object by Using the DECLARE Statement and _NEW_ Operator
This example uses the DECLARE statement to declare a hash object. The _NEW_ operator is used to instantiate the hash object.
   data _null_;
      length k $15;
      length d $15;
      if _N_ = 1 then do;
         /* Declare and instantiate hash object "myhash" */
         declare hash myhash;
         myhash = _new_ hash( );
         /* Define key and data variables */
         rc = myhash.defineKey('k');
         rc = myhash.defineData('d');
         rc = myhash.defineDone( );
         /* avoid uninitialized variable notes */
         call missing(k, d);
      end;
      /* Create constant key and data values */
      rc = myhash.add(key: 'Labrador', data: 'Retriever');
      rc = myhash.add(key: 'Airedale', data: 'Terrier');
      rc = myhash.add(key: 'Standard', data: 'Poodle');
      /* Find data associated with key and write data to log */
      rc = myhash.find(key: 'Airedale');
      if (rc = 0) then
         put d=;
      else
         put 'Key Airedale not found';
   run;
Example 2: Declaring and Instantiating a Hash Object by Using the DECLARE Statement
This example uses the DECLARE statement to declare and instantiate a hash object in one step.
   data _null_;
      length k $15;
      length d $15;
      if _N_ = 1 then do;
         /* Declare and instantiate hash object "myhash" */
         declare hash myhash( );
         rc = myhash.defineKey('k');
         rc = myhash.defineData('d');
         rc = myhash.defineDone( );
         /* avoid uninitialized variable notes */
         call missing(k, d);
      end;
      /* Create constant key and data values */
      rc = myhash.add(key: 'Labrador', data: 'Retriever');
      rc = myhash.add(key: 'Airedale', data: 'Terrier');
      rc = myhash.add(key: 'Standard', data: 'Poodle');
      /* Find data associated with key and write data to log */
      rc = myhash.find(key: 'Airedale');
      if (rc = 0) then
         put d=;
      else
         put 'Key Airedale not found';
   run;
Example 3: Instantiating and Sizing a Hash Object
This example uses the DECLARE statement to declare and instantiate a hash object. The hash table size is set to 16 (2⁴).
   data _null_;
      length k $15;
      length d $15;
      if _N_ = 1 then do;
         /* Declare and instantiate hash object "myhash". */
         /* Set hash table size to 16. */
         declare hash myhash(hashexp: 4);
         rc = myhash.defineKey('k');
         rc = myhash.defineData('d');
         rc = myhash.defineDone( );
         /* avoid uninitialized variable notes */
         call missing(k, d);
      end;
      /* Create constant key and data values */
      rc = myhash.add(key: 'Labrador', data: 'Retriever');
      rc = myhash.add(key: 'Airedale', data: 'Terrier');
      rc = myhash.add(key: 'Standard', data: 'Poodle');
      /* Find data associated with key and write data to log */
      rc = myhash.find(key: 'Airedale');
      if (rc = 0) then
         put d=;
      else
         put 'Key Airedale not found';
   run;
Example 4: Using SAS Data Set Options When Loading a Hash Object
The following examples use various SAS data set options when declaring a hash object:
   data x;
      retain j 999;
      do i = 1 to 20;
         output;
      end;
   run;

   /* Using the WHERE option. */
   data _null_;
      length i 8;
      dcl hash h(dataset: 'x (where =(i > 10))', ordered: 'a');
      h.definekey('i');
      h.definedone();
      h.output(dataset: 'out');
   run;

   /* Using the DROP option. */
   data _null_;
      length i 8;
      dcl hash h(dataset: 'x (drop = j)', ordered: 'a');
      h.definekey(all: 'y');
      h.definedone();
      h.output(dataset: 'out (where =( i < 8))');
   run;

   /* Using the FIRSTOBS option. */
   data _null_;
      length i j 8;
      dcl hash h(dataset: 'x (firstobs=5)', ordered: 'a');
      h.definekey(all: 'y');
      h.definedone();
      h.output(dataset: 'out');
   run;

   /* Using the OBS option. */
   data _null_;
      length i j 8;
      dcl hash h(dataset: 'x (obs=5)', ordered: 'd');
      h.definekey(all: 'y');
      h.definedone();
      h.output(dataset: 'out (rename =(j=k))');
   run;
For a list of SAS data set options, see “Data Set Options by Category” on page 12.
See Also Operators: “_NEW_ Operator, Hash or Hash Iterator Object” on page 2053 Chapter 9, “Hash and Hash Iterator Object Language Elements,” on page 2027 “Using DATA Step Component Objects” in SAS Language Reference: Concepts
DECLARE Statement, Java Object
Declares a Java object; creates an instance of and initializes data for a Java object.
Valid: in a DATA step
Category: Action
Type: Executable
Alias: DCL
Syntax
(1) DECLARE JAVAOBJ object-reference;
(2) DECLARE JAVAOBJ object-reference ("java-class"<, argument-1, ... argument-n>);
Arguments
object-reference
   specifies the object reference name for the Java object.
java-class
   specifies the name of the Java class to be instantiated.
   Requirement: The Java class name must be enclosed in either double or single quotation marks.
   Requirement: If you specify a Java package path, you must use forward slashes (/) and not periods (.) in the path. For example, an incorrect classname is "java.util.Hashtable". The correct classname is "java/util/Hashtable".
argument
   specifies the information that is used to create an instance of the Java object. Valid values for argument depend on the Java object.
   See also: “(2) Using the DECLARE Statement to Instantiate a Java Object” on page 1435
Details
The Basics
To use a DATA step component object in your SAS program, you must declare and create (instantiate) the object. The DATA step component interface provides a mechanism for accessing predefined component objects from within the DATA step. For more information, see “Using DATA Step Component Objects” in SAS Language Reference: Concepts.
(1) Declaring a Java Object
You use the DECLARE statement to declare a Java object.
   declare javaobj j;
The DECLARE statement tells SAS that the object reference J is a Java object. After you declare the new Java object, use the _NEW_ operator to instantiate the object. For example, in the following line of code, the _NEW_ operator creates the Java object and assigns it to the object reference J:
   j = _new_ javaobj("somejavaclass");
(2) Using the DECLARE Statement to Instantiate a Java Object
Instead of the two-step process of using the DECLARE statement and the _NEW_ operator to declare and instantiate a Java object, you can use the DECLARE statement to declare and instantiate the Java object in one step. For example, in the following line of code, the DECLARE statement declares and instantiates a Java object and assigns the Java object to the object reference J:
   declare javaobj j("somejavaclass");
The preceding line of code is equivalent to using the following code:
   declare javaobj j;
   j = _new_ javaobj("somejavaclass");
A constructor is a method that you can use to instantiate a component object and initialize the component object data. For example, in the following line of code, the DECLARE statement declares and instantiates a Java object and assigns the Java object to the object reference J. Note that the only required argument for a Java object constructor is the name of the Java class to be instantiated. All other arguments are constructor arguments for the Java class itself. In the following example, the Java class name, testjavaclass, is the constructor, and the values 100 and .8 are constructor arguments.
   declare javaobj j("testjavaclass", 100, .8);
Comparisons You can use the DECLARE statement and the _NEW_ operator, or the DECLARE statement alone to declare and instantiate an instance of a Java object.
Examples
Example 1: Declaring and Instantiating a Java Object by Using the DECLARE Statement and the _NEW_ Operator
In the following example, a simple Java class is created. The DECLARE statement and the _NEW_ operator are used to create an instance of this class.
   /* Java code */
   import java.util.*;
   import java.lang.*;
   public class simpleclass
   {
      public int i;
      public double d;
   }

   /* DATA step code */
   data _null_;
      declare javaobj myjo;
      myjo = _new_ javaobj("simpleclass");
   run;
Example 2: Using the DECLARE Statement to Create and Instantiate a Java Object
In the following example, a Java class is created for a hash table. The DECLARE statement is used to create and instantiate an instance of this class by specifying the capacity and load factor. In this example, a wrapper class, mhash, is necessary because the DATA step’s only numeric type is equivalent to the Java type DOUBLE.
   /* Java code */
   import java.util.*;
   public class mhash extends Hashtable
   {
      mhash (double size, double load)
      {
         super ((int)size, (float)load);
      }
   }

   /* DATA step code */
   data _null_;
      declare javaobj h("mhash", 100, .8);
   run;
See Also Operator: “_NEW_ Operator, Java Object” on page 2103 Chapter 9, “Hash and Hash Iterator Object Language Elements,” on page 2027 “Using DATA Step Component Objects” in SAS Language Reference: Concepts
DELETE Statement
Stops processing the current observation.
Valid: in a DATA step
Category: Action
Type: Executable
Syntax DELETE;
Without Arguments When DELETE executes, the current observation is not written to a data set, and SAS returns immediately to the beginning of the DATA step for the next iteration.
Details The DELETE statement is often used in a THEN clause of an IF-THEN statement or as part of a conditionally executed DO group.
Comparisons
• Use the DELETE statement when it is easier to specify a condition that excludes observations from the data set or when there is no need to continue processing the DATA step statements for the current observation.
• Use the subsetting IF statement when it is easier to specify a condition for including observations.
• Do not confuse the DROP statement with the DELETE statement. The DROP statement excludes variables from an output data set; the DELETE statement excludes observations.
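The three comparisons above can be sketched side by side. This is an illustrative example only; the data set names LEAVES, KEPT_DELETE, KEPT_IF, and NO_WEIGHT and the data values are hypothetical.

```sas
/* Hypothetical input: one observation has a missing LEAFWT */
data leaves;
   input plant $ leafwt;
   datalines;
A 1.2
B .
C 0.8
;

/* DELETE: state the condition that excludes observations */
data kept_delete;
   set leaves;
   if leafwt = . then delete;
run;

/* Subsetting IF: state the condition that includes observations */
data kept_if;
   set leaves;
   if leafwt ne .;
run;

/* DROP: excludes a variable, not observations */
data no_weight;
   set leaves;
   drop leafwt;
run;
```

KEPT_DELETE and KEPT_IF each contain the two observations with a nonmissing LEAFWT, while NO_WEIGHT keeps all three observations but only the PLANT variable.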
Examples
Example 1: Using the DELETE Statement as Part of an IF-THEN Statement
When the value of LEAFWT is missing, the current observation is deleted:
   if leafwt=. then delete;
Example 2: Using the DELETE Statement to Subset Raw Data
   data topsales;
      infile file-specification;
      input region office product yrsales;
      if yrsales
names the window and group of fields to be displayed. This field is preceded by a period (.).
Tip: If the window has more than one group of fields, give the complete window.group specification; if a window contains a single unnamed group, use only window.
NOINPUT
   specifies that you cannot input values into fields that are displayed in the window.
   Default: If you omit NOINPUT, you can input values into unprotected fields that are displayed in the window.
   Restriction: If you use NOINPUT in all DISPLAY statements in a DATA step, you must include a STOP statement to stop processing the DATA step.
   Tip: The NOINPUT option is useful when you want to allow values to be entered into a window at some times but not others. For example, you can display a window once for entering values and a second time for verifying them.
BLANK
   clears the window.
   Tip: Use the BLANK option when you want to display different groups of fields in a window and you do not want text from the previous group to appear in the current display.
BELL
   produces an audible alarm, beep, or bell sound when the window is displayed if your personal computer is equipped with a speaker device that provides sound.
DELETE
   deletes the display of the window after processing passes from the DISPLAY statement on which the option appears.
Details You must create a window in the same DATA step that you use to display it. Once you display a window, the window remains visible until you display another window over it or until the end of the DATA step. When you display a window that contains fields where you enter values, either enter a value or press ENTER at each unprotected field to cause SAS to proceed to the next display. You cannot skip any fields. While a window is being displayed, use commands and function keys to view other windows, to change the size of the current window, and so on. A DATA step that contains a DISPLAY statement continues execution until the last observation that is read by a SET, MERGE, UPDATE, MODIFY, or INPUT statement has been processed or until a STOP or ABORT statement is executed. You can also issue the END command on the command line of the window to stop the execution of the DATA step. You must create a window before you can display it. See the “WINDOW Statement” on page 1744 for a description of how to create windows. A window that is displayed with the DISPLAY statement does not become part of the SAS log or output file.
Examples
This DATA step creates and displays a window named START. The START window fills the entire screen. Both lines of text are centered.
   data _null_;
      window start
             #5 @28 'WELCOME TO THE SAS SYSTEM'
             #12 @30 'PRESS ENTER TO CONTINUE';
      display start;
      stop;
   run;
Although the START window in this example does not require you to input any values, you must press ENTER to cause the execution to proceed to the STOP statement. If you omit the STOP statement, the DATA step executes endlessly unless you enter END on the command line of the window.
Note: Because this DATA step does not read any observations, SAS cannot detect an end-of-file to cause DATA step execution to cease. If you add the NOINPUT option to the DISPLAY statement, the window displays quickly and is removed.
See Also Statement: “WINDOW Statement” on page 1744
DM Statement
Submits SAS Program Editor, Log, Procedure Output or text editor commands as SAS statements.
Valid: anywhere
Category: Program Control
Syntax
   DM <window> 'command(s)' <window> <CONTINUE>;
Arguments
window
   specifies the active window. For more information, see “Details” on page 1440.
   Default: If you omit the window name, SAS uses the Program Editor window as the default.
'command(s)'
   can be any windowing command or text editor command and must be enclosed in single quotation marks. If you want to issue several commands, separate them with semicolons.
CONTINUE
   causes SAS to execute any SAS statements that follow the DM statement in the Program Editor window and, if a windowing command in the DM statement called a window, makes that window active.
   Tip: Any windows that are activated by the SAS statements (such as the Output window) appear before the window that is to be made active.
   Note: If you specify Log as the active window, for example, and have other SAS statements that follow the DM statement (for example, in an autoexec file), those statements are not submitted to SAS until control returns to the SAS interface.
Details
Execution occurs when the DM statement is submitted to SAS. You can use this statement to modify the windowing environment:
• Change SAS interface features during a SAS session.
• Change SAS interface features at the beginning of each SAS session by placing the DM statement in an autoexec file.
• Perform utility functions in windowing applications, such as saving a file with the FILE command or clearing a window with the CLEAR command.
Window placement affects the outcome of the statement:
• If you name a window before the commands, those commands apply to that window.
• If you name a window after the commands, SAS executes the commands and then makes that window the active window. The active window is opened and contains the cursor.
Examples
Example 1: Using the DM Statement
• dm 'color text cyan; color command red';
• dm log 'clear; pgm; color numbers green' output;
• dm 'caps on';
• dm log 'clear' output;
Example 2: Using the CONTINUE Option with SAS Statements That Do Not Activate a Window
This example causes SAS to display the first window of the SAS/AF application, executes the DATA step, moves the cursor to the first field of the SAS/AF application window, and makes that window active.
   dm 'af c=your-program' continue;
   data temp;
      . . . more SAS statements . . .
   run;
Example 3: Using the CONTINUE Option with SAS Statements That Activate a Window
This example displays the first window of the SAS/AF application and executes the PROC PRINT step, which activates the OUTPUT window. Closing the OUTPUT window moves the cursor to the last active window.
   dm 'af c=your-program' continue;
   proc print data=temp;
   run;
DO Statement
Specifies a group of statements to be executed as a unit.
Valid: in a DATA step
Category: Control
Type: Executable
Syntax
   DO;
      ...more SAS statements...
   END;
Without Arguments Use the DO statement for simple DO group processing.
Details
The DO statement is the simplest form of DO group processing. The statements between the DO and END statements are called a DO group. You can nest DO statements within DO groups.
Note: The memory capabilities of your system can limit the number of nested DO statements you can use. For details, see the SAS documentation about how many levels of nested DO statements your system’s memory can support.
A simple DO statement is often used within IF-THEN/ELSE statements to designate a group of statements to be executed depending on whether the IF condition is true or false.
Comparisons
There are three other forms of the DO statement:
• The iterative DO statement executes statements between DO and END statements repetitively based on the value of an index variable. The iterative DO statement can contain a WHILE or UNTIL clause.
• The DO UNTIL statement executes statements in a DO loop repetitively until a condition is true, checking the condition after each iteration of the DO loop.
• The DO WHILE statement executes statements in a DO loop repetitively while a condition is true, checking the condition before each iteration of the DO loop.
Examples
In this simple DO group, the statements between DO and END are performed only when YEARS is greater than 5. If YEARS is less than or equal to 5, statements in the DO group do not execute, and the program continues with the assignment statement that follows the ELSE statement.
   if years>5 then
      do;
         months=years*12;
         put years= months=;
      end;
   else yrsleft=5-years;
See Also Statements: “DO Statement, Iterative” on page 1443 “DO UNTIL Statement” on page 1446 “DO WHILE Statement” on page 1448
DO Statement, Iterative
Executes statements between the DO and END statements repetitively, based on the value of an index variable.
Valid: in a DATA step
Category: Control
Type: Executable
Syntax
   DO index-variable=specification-1 <, ... specification-n>;
      . . . more SAS statements . . .
   END;
Arguments index-variable
names a variable whose value governs execution of the DO group. The index -variable argument is required. Unless you specify to drop it, the index variable is included in the data set that is being created.
Tip:
CAUTION:
Avoid changing the index variable within the DO group. If you modify the index variable within the iterative DO group, you might cause infinite looping. 4 specification
denotes an expression or a series of expressions in this form start Requirement:
The iterative DO statement requires at least one specification
argument. Tip:
The order of the optional TO and BY clauses can be reversed.
When you use more than one specification, each one is evaluated before its execution.
Tip:
start specifies the initial value of the index variable. Restriction: When it is used with TO stop or BY increment, start must be a
number or an expression that yields a number. Explanation: When it is used without TO stop or BY increment, the value of start
can be a series of items expressed in this form: item-1 ; The items can be either all numeric or all character constants, or they can be variables. Enclose character constants in quotation marks. The DO group is
1444
DO Statement, Iterative
4
Chapter 6
executed once for each value in the list. If a WHILE condition is added, it applies only to the item that it immediately follows.
The DO group is executed first with index-variable equal to start. The value of start is evaluated before the first execution of the loop.
Featured in: Example 1 on page 1445

TO stop
specifies the ending value of the index variable. This argument is optional.
Restriction: Stop must be a number or an expression that yields a number.
Explanation: When both start and stop are present, execution continues (based on the value of increment) until the value of index-variable passes the value of stop. When only start and increment are present, execution continues (based on the value of increment) until a statement directs execution out of the loop, or until a WHILE or UNTIL expression that is specified in the DO statement is satisfied. If neither stop nor increment is specified, the group executes according to the value of start. The value of stop is evaluated before the first execution of the loop.
Tip: Any changes to stop made within the DO group do not affect the number of iterations. To stop iteration of a loop before it finishes processing, change the value of index-variable so that it passes the value of stop, or use a LEAVE statement to go to a statement outside the loop.
Featured in: Example 1 on page 1445

BY increment
specifies a positive or negative number (or an expression that yields a number) to control the incrementing of index-variable. This argument is optional.
Explanation: The value of increment is evaluated before the execution of the loop. Any changes to the increment that are made within the DO group do not affect the number of iterations. If no increment is specified, the index variable is increased by 1. When increment is positive, start must be the lower bound and stop, if present, must be the upper bound for the loop. If increment is negative, start must be the upper bound and stop, if present, must be the lower bound for the loop.
Featured in: Example 1 on page 1445

WHILE(expression) | UNTIL(expression)
evaluates, either before or after execution of the DO group, any SAS expression that you specify. Enclose the expression in parentheses. This argument is optional.
Restriction: A WHILE or UNTIL specification affects only the last item in the clause in which it is located.
Explanation: A WHILE expression is evaluated before each execution of the loop, so that the statements inside the group are executed repetitively while the expression is true. An UNTIL expression is evaluated after each execution of the loop, so that the statements inside the group are executed repetitively until the expression is true.
Featured in: Example 1 on page 1445
See Also: “DO WHILE Statement” on page 1448 and “DO UNTIL Statement” on page 1446 for more information.
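The rule that stop is evaluated once, before the first iteration, can be illustrated with a minimal sketch. The variable names LIMIT and I are illustrative assumptions and do not come from the reference text.

```sas
data _null_;
   limit = 3;
   do i = 1 to limit;
      limit = 10;   /* changed inside the DO group; the bound was already fixed at 3 */
      put i=;
   end;
run;
```

Under the rule stated above, the PUT statement should execute three times rather than ten, because the new value of LIMIT is not consulted again.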
Comparisons
There are three other forms of the DO statement:
• The DO statement, the simplest form of DO-group processing, designates a group of statements to be executed as a unit, usually as a part of IF-THEN/ELSE statements.
• The DO UNTIL statement executes statements in a DO loop repetitively until a condition is true, checking the condition after each iteration of the DO loop.
• The DO WHILE statement executes statements in a DO loop repetitively while a condition is true, checking the condition before each iteration of the DO loop.
Examples

Example 1: Using Various Forms of the Iterative DO Statement
• These iterative DO statements use a list of items for the value of start:
     do month='JAN','FEB','MAR';
     do count=2,3,5,7,11,13,17;
     do i=5;
     do i=var1, var2, var3;
     do i='01JAN2001'd,'25FEB2001'd,'18APR2001'd;
• These iterative DO statements use the start TO stop syntax:
     do i=1 to 10;
     do i=1 to exit;
     do i=1 to x-5;
     do i=1 to k-1, k+1 to n;
     do i=k+1 to n-1;
• These iterative DO statements use the BY increment syntax:
     do i=n to 1 by -1;
     do i=.1 to .9 by .1, 1 to 10 by 1, 20 to 100 by 10;
     do count=2 to 8 by 2;
• These iterative DO statements use WHILE and UNTIL clauses:
     do i=1 to 10 while(x<y);
     do i=2 to 20 by 2 until((x/3)>y);
     do i=10 to 0 by -1 while(month='JAN');
• In this example, the DO loop is executed when I=1 and I=2; the WHILE condition is evaluated when I=3, and the DO loop is executed if the WHILE condition is true.
     DO I=1,2,3 WHILE (condition);
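The last case can be sketched as a complete DATA step. The flag variable READY is a hypothetical name introduced only for this illustration; it is set so that the WHILE condition is false.

```sas
data _null_;
   ready = 0;                          /* hypothetical flag; makes the WHILE condition false */
   do i = 1, 2, 3 while(ready = 1);    /* WHILE applies only to the last item, 3 */
      put i=;
   end;
run;
```

Because the WHILE clause is attached only to the item that it immediately follows, the DO group should run for I=1 and I=2 and be skipped for I=3.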
Example 2: Using the Iterative DO Statement without Infinite Looping
In each of the following examples, the DO group executes ten times. The first example demonstrates the preferred approach.
   /* correct coding */
   do i=1 to 10;
      ...more SAS statements...
   end;

The next example uses the TO and BY arguments.
   do i=1 to n by m;
      ...more SAS statements...
      if i=10 then leave;
   end;
   if i=10 then put 'EXITED LOOP';
Example 3: Stopping the Execution of the DO Loop
In this example, setting the value of the index variable to the current value of EXIT causes the loop to terminate.
   data iterate1;
      input x;
      exit=10;
      do i=1 to exit;
         y=x*normal(0);
            /* if y>25,           */
            /* changing i's value */
            /* stops execution    */
         if y>25 then i=exit;
         output;
      end;
      datalines;
   5
   000
   2500
   ;
See Also
Statements:
“ARRAY Statement” on page 1391
“Array Reference Statement” on page 1396
“DO Statement” on page 1441
“DO UNTIL Statement” on page 1446
“DO WHILE Statement” on page 1448
“GO TO Statement” on page 1529
DO UNTIL Statement
Executes statements in a DO loop repetitively until a condition is true.
Valid: in a DATA step
Category: Control
Type: Executable

Syntax
DO UNTIL (expression);
   ...more SAS statements...
END;
Arguments
(expression)
is any SAS expression, enclosed in parentheses. You must specify at least one expression.
Details
The expression is evaluated at the bottom of the loop after the statements in the DO loop have been executed. If the expression is true, the DO loop does not iterate again.
Note: The DO loop always iterates at least once.
Comparisons
There are three other forms of the DO statement:
• The DO statement, the simplest form of DO-group processing, designates a group of statements to be executed as a unit, usually as a part of IF-THEN/ELSE statements.
• The iterative DO statement executes statements between DO and END statements repetitively based on the value of an index variable.
• The DO WHILE statement executes statements in a DO loop repetitively while a condition is true, checking the condition before each iteration of the DO loop. The DO UNTIL statement evaluates the condition at the bottom of the loop; the DO WHILE statement evaluates the condition at the top of the loop.
Note: The statements in a DO UNTIL loop always execute at least one time, whereas the statements in a DO WHILE loop do not execute even once if the condition is false the first time it is evaluated.
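The difference noted above can be made concrete with a small sketch in which the UNTIL condition is already true, and the WHILE condition already false, before either loop starts. The starting value 10 and the message text are illustrative assumptions.

```sas
data _null_;
   n = 10;
   do until(n >= 5);      /* condition is already true, but it is tested at the bottom */
      put 'DO UNTIL body executed: ' n=;
      n + 1;
   end;

   n = 10;
   do while(n < 5);       /* condition is false at the top, so the body is skipped */
      put 'DO WHILE body executed: ' n=;
      n + 1;
   end;
run;
```

Only the DO UNTIL message should appear in the SAS log, and only once.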
Examples
These statements repeat the loop until N is greater than or equal to 5. The expression N>=5 is evaluated at the bottom of the loop. There are five iterations in all (0, 1, 2, 3, 4).
   n=0;
   do until(n>=5);
      put n=;
      n+1;
   end;
See Also
Statements:
“DO Statement” on page 1441
“DO Statement, Iterative” on page 1443
“DO WHILE Statement” on page 1448
DO WHILE Statement
Executes statements in a DO loop repetitively while a condition is true.
Valid: in a DATA step
Category: Control
Type: Executable

Syntax
DO WHILE (expression);
   ...more SAS statements...
END;
Arguments (expression)
is any SAS expression, enclosed in parentheses. You must specify at least one expression.
Details The expression is evaluated at the top of the loop before the statements in the DO loop are executed. If the expression is true, the DO loop iterates. If the expression is false the first time it is evaluated, the DO loop does not iterate even once.
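As a minimal sketch of top-of-loop evaluation, modeled on the DO UNTIL example elsewhere in this chapter, the following DATA step counts from 0 while N is less than 5. The closing PUT statement is an addition for illustration only.

```sas
data _null_;
   n = 0;
   do while(n < 5);      /* tested before each iteration */
      put n=;
      n + 1;
   end;
   put 'Loop ended with ' n=;
run;
```

The condition is checked before the body runs, so the loop stops as soon as N reaches 5, without executing the body for that value.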
Comparisons
There are three other forms of the DO statement:
• The DO statement, the simplest form of DO-group processing, designates a group of statements to be executed as a unit, usually as a part of IF-THEN/ELSE statements.
• The iterative DO statement executes statements between DO and END statements repetitively based on the value of an index variable.
• The DO UNTIL statement executes statements in a DO loop repetitively until a condition is true, checking the condition after each iteration of the DO loop. The DO WHILE statement evaluates the condition at the top of the loop; the DO UNTIL statement evaluates the condition at the bottom of the loop.
Note: If the expression is false, the statements in a DO WHILE loop do not execute. However, because the DO UNTIL expression is evaluated at the bottom of the loop, the statements in the DO UNTIL loop always execute at least once.
Examples These statements repeat the loop while N is less than 5. The expression N;
Arguments
file-specification
identifies an external file that the DATA step uses to write output from a PUT statement. File-specification can have these forms:

'external-file'
specifies the physical name of an external file, which is enclosed in quotation marks. The physical name is the name by which the operating environment recognizes the file.

fileref
specifies the fileref of an external file.
Requirement: You must have previously associated fileref with an external file in a FILENAME statement or function, or in an appropriate operating environment command. There is only one exception to this rule: when you use the FILEVAR= option, the fileref is simply a placeholder.
See Also: “FILENAME Statement” on page 1470

fileref(file)
specifies a fileref that is previously assigned to an external file that is an aggregate grouping of files. Follow the fileref with the name of a file or member, which is enclosed in parentheses.
Note: A file that is located in an aggregate storage location and has a name that is not a valid SAS name must have its name enclosed in quotation marks.
Requirement: You must previously associate fileref with an external file in a FILENAME statement or function, or in an appropriate operating environment command.
See Also: “FILENAME Statement” on page 1470
Operating Environment Information: Different operating environments call an aggregate grouping of files by different names, such as a directory, a MACLIB, or a partitioned data set. For details, see the SAS documentation for your operating environment.

LOG
is a reserved fileref that directs the output that is produced by any PUT statements to the SAS log. At the beginning of each execution of a DATA step, the fileref that indicates where the PUT statements write is automatically set to LOG. Therefore, the first PUT statement in a DATA step always writes to the SAS log, unless it is preceded by a FILE statement that specifies otherwise.
Tip: Because output lines are by default written to the SAS log, use a FILE LOG statement to restore the default action or to specify additional FILE statement options.

PRINT
is a reserved fileref that directs the output that is produced by any PUT statements to the same file as the output that is produced by SAS procedures.
Interaction: When you write to a file, the value of the N= option must be either 1 or PAGESIZE.
Tip: When PRINT is the fileref, SAS uses carriage-control characters and writes the output with the characteristics of a print file.
See Also: A complete discussion of print files in SAS Language Reference: Concepts
Operating Environment Information: The carriage-control characters that are written to a file can be specific to the operating environment. For details, see the SAS documentation for your operating environment.

Tip: If the file does not exist in the directory that you specify for file-specification, SAS creates the file. If the directory specified in file-specification does not exist, SAS sets the SYSERR macro variable, which can be checked if the ERRORCHECK option is set to STRICT.
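The reserved filerefs LOG and PRINT described above can be combined in one DATA step. The following is a minimal sketch; the message text is illustrative only.

```sas
data _null_;
   put 'Written to the SAS log (the default destination).';
   file print;
   put 'Written to the same file as procedure output.';
   file log;                /* restore the default destination */
   put 'Written to the SAS log again.';
run;
```

Each FILE statement changes the destination only for the PUT statements that execute after it.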
device-type
specifies the type of device or the access method that is used if the fileref points to an input or output device or a location that is not a physical file:
DISK
specifies that the device is a disk drive. Tip: When you assign a fileref to a file on disk, you are not required to specify DISK.
DUMMY
specifies that the output to the file is discarded. Tip: Specifying DUMMY can be useful for testing.
GTERM
indicates that the output device type is a graphics device that will receive graphics data.
PIPE
specifies an unnamed pipe.
Note: Some operating environments do not support pipes.
PLOTTER
specifies an unbuffered graphics output device.
PRINTER
specifies a printer or printer spool file.
TAPE
specifies a tape drive.
TEMP
creates a temporary file that exists only as long as the filename is assigned. The temporary file can be accessed only through the logical name and is available only while the logical name exists. Restriction: Do not specify a physical pathname. If you do, SAS returns an error. Tip: Files manipulated by the TEMP device can have the same attributes and behave identically to DISK files.
TERMINAL
specifies the user’s terminal.
UPRINTER
specifies a Universal Printing printer definition name.
Tip: If you do not specify the printer name in the FILENAME statement, the PRINTERPATH= system option controls which Universal Printer is used and the destination of the output.

Alias: DEVICE=
Requirement: device-type must appear right after the physical path. DEVICE=device-type can appear anywhere in the statement.
Operating Environment Information: Additional specifications might be required when you specify some devices. See the SAS documentation for your operating environment before specifying a value other than DISK. Values in addition to the ones listed here might be available in some operating environments.
Options
BLKSIZE=block-size
specifies the block size of the output file.
Default: Dependent on your operating environment.
Operating Environment Information: For details, see the FILE Statement in the SAS documentation for your operating environment.

COLUMN=variable
specifies a variable that SAS automatically sets to the current column location of the pointer. This variable, like automatic variables, is not written to the data set.
Alias: COL=
See Also: LINE= on page 1460

DELIMITER= delimiter(s)
specifies an alternate delimiter (other than blank) to be used for LIST output where delimiter is
'list-of-delimiting-characters'
specifies one or more characters to write as delimiters.
Requirement: Enclose the list of characters in quotation marks.
character-variable
specifies a character variable whose value becomes the delimiter.
Alias: DLM=
Default: blank space
Restriction: Even though a character string or character variable is accepted, only the first character of the string or variable is used as the output delimiter. The FILE DLM= processing differs from INFILE DELIMITER= processing.
Interaction: Output that contains embedded delimiters requires the delimiter sensitive data (DSD) option.
Tip: DELIMITER= can be used with the colon (:) modifier (modified LIST output).
Tip: The delimiter is case sensitive.
See Also: DLMSTR= on page 1457, DSD (delimiter sensitive data) on page 1458
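As a brief sketch of the DELIMITER= option with LIST output, the following DATA step writes three values to the SAS log separated by a vertical bar. The variable names and values are illustrative assumptions.

```sas
data _null_;
   file log dlm='|';        /* a single character is used as the output delimiter */
   city  = 'Cary';
   count = 12;
   code  = 'NC';
   put city count code;     /* LIST output: values are separated by the delimiter */
run;
```

None of these values contains the delimiter itself, so the DSD option is not needed here.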
DLMSTR= delimiter
specifies a character string as an alternate delimiter (other than a blank) to be used for LIST output, where delimiter is
'delimiting-string'
specifies a character string to write as a delimiter.
Requirement: Enclose the string in quotation marks.
character-variable
specifies a character variable whose value becomes the delimiter.
Default: blank space
Interaction: If you specify more than one DLMSTR= option in the FILE statement, the DLMSTR= option that is specified last will be used. If you specify both the DELIMITER= and DLMSTR= options, the option that is specified last will be used.
Interaction: If you specify RECFM=N, make sure that the LRECL is large enough to hold the largest input item. Otherwise, it might be possible for the delimiter to be split across the record boundary.
See Also: DELIMITER= on page 1456, DLMSOPT= on page 1457, DSD (delimiter sensitive data) on page 1458

DLMSOPT= 'T' | 't'
specifies a parsing option for the DLMSTR= option that removes trailing blanks of the string delimiter.
Requirement: The DLMSOPT=T option has an effect only when used with the DLMSTR= option.
Tip: The DLMSOPT=T option is useful when you use a variable as the delimiter string.
See Also: DLMSTR= on page 1457
DROPOVER
discards data items that exceed the output line length (as specified by the LINESIZE= or LRECL= options in the FILE statement).
Default: FLOWOVER
Explanation: By default, data that exceeds the current line length is written on a new line. When you specify DROPOVER, SAS drops (or ignores) an entire item when there is not enough space in the current line to write it. When an entire item is dropped, the column pointer remains positioned after the last value that is written in the current line. Thus, the PUT statement might write other items in the current output line if they fit in the space that remains or if the column pointer is repositioned. When a data item is dropped, the DATA step continues normal execution (_ERROR_=0). At the end of the DATA step, a message is printed for each file from which data was lost.
Tip: Use DROPOVER when you want the DATA step to continue executing if the PUT statement attempts to write past the current line length, but you do not want the data item that exceeds the line length to be written on a new line.
See Also: FLOWOVER on page 1459 and STOPOVER on page 1463
DSD (delimiter sensitive data)
specifies that data values that contain embedded delimiters, such as tabs or commas, be enclosed in quotation marks. The DSD option enables you to write data values that contain embedded delimiters to LIST output. This option is ignored for other types of output (for example, formatted, column, and named). Any double quotation marks that are included in the data value are repeated. When a variable value contains the delimiter and DSD is used in the FILE statement, the variable value will be enclosed in double quotation marks when the output is generated. For example, the following code
   DATA _NULL_;
      FILE log dsd;
      x='"lions, tigers, and bears"';
      put x ' "Oh, my!"';
   run;

will result in the following output:
   """lions, tigers, and bears""", "Oh, my!"

If a quoted (text) string contains the delimiter and DSD is used in the FILE statement, then the quoted string will not be enclosed in double quotation marks when used in a PUT statement. For example, the following code
   DATA _NULL_;
      FILE log dsd;
      PUT 'lions, tigers, and bears';
   run;

will result in the following output:
   lions, tigers, and bears

Interaction: If you specify DSD, the default delimiter is assumed to be the comma (,). Specify the DELIMITER= or DLMSTR= option if you want to use a different delimiter.
Tip: By default, data values that do not contain the delimiter that you specify are not enclosed in quotation marks. However, you can use the tilde (~) modifier to force any data value, including missing values, to be enclosed in quotation marks, even if it contains no embedded delimiter.
See Also: DELIMITER= on page 1456, DLMSTR= on page 1457
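The tilde (~) modifier mentioned in the Tip above can be sketched as follows. The variable names and values are illustrative assumptions; NAME contains the default DSD delimiter (a comma) and CITY does not.

```sas
data _null_;
   file log dsd;
   name = 'Smith, John';    /* contains the delimiter, so DSD quotes it automatically */
   city = 'Cary';           /* contains no delimiter */
   put name city ~;         /* ~ asks for CITY to be quoted as well */
run;
```

Without the tilde, only NAME would be expected to appear in quotation marks.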
ENCODING= 'encoding-value'
specifies the encoding to use when writing to the output file. The value for ENCODING= indicates that the output file has a different encoding from the current session encoding. When you write data to the output file, SAS transcodes the data from the session encoding to the specified encoding.
Default: SAS uses the current session encoding.
See Also: “Encoding Values in SAS Language Elements” in the SAS National Language Support (NLS): Reference Guide
Featured in: Example 8 on page 1469
FILENAME=variable
defines a character variable, whose name you supply, that SAS sets to the value of the physical name of the file currently open for PUT statement output. The physical name is the name by which the operating environment recognizes the file.
Tip: This variable, like automatic variables, is not written to the data set.
Tip: Use a LENGTH statement to make the variable length long enough to contain the value of the physical filename if it is longer than eight characters (the default length of a character variable).
See Also: FILEVAR= on page 1459
Featured in: Example 4 on page 1468

FILEVAR=variable
defines a variable whose change in value causes the FILE statement to close the current output file and open a new one the next time the FILE statement executes. The next PUT statement that executes writes to the new file that is specified as the value of the FILEVAR= variable.
Restriction: The value of a FILEVAR= variable is expressed as a character string that contains a physical filename.
Interaction: When you use the FILEVAR= option, the file-specification is just a placeholder, not an actual filename or a fileref that has been previously assigned to a file. SAS uses this placeholder for reporting processing information to the SAS log. It must conform to the same rules as a fileref.
Tip: This variable, like automatic variables, is not written to the data set.
Tip: If any of the physical filenames is longer than eight characters (the default length of a character variable), assign the FILEVAR= variable a longer length with another statement, such as a LENGTH statement or an INPUT statement.
See Also: FILENAME= on page 1459
Featured in: Example 5 on page 1468

FLOWOVER
causes data that exceeds the current line length to be written on a new line. When a PUT statement attempts to write beyond the maximum allowed line length (as specified by the LINESIZE= option in the FILE statement), the current output line is written to the file and the data item that exceeds the current line length is written to a new line.
Default: FLOWOVER
Interaction: If the PUT statement contains a trailing @, the pointer is positioned after the data item on the new line, and the next PUT statement writes to that line. This process continues until the end of the input data is reached or until a PUT statement without a trailing @ causes the current line to be written to the file.
See Also: DROPOVER on page 1457 and STOPOVER on page 1463

FOOTNOTES | NOFOOTNOTES
controls whether currently defined footnotes are printed.
Alias: FOOTNOTE | NOFOOTNOTE
Requirement: In order to print footnotes in a DATA step report, you must set the FOOTNOTE option in the FILE statement.
Default: NOFOOTNOTES
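The FILEVAR= option described earlier in this list lends itself to a short sketch. The directory /tmp/, the variable names OUTNAME and REGION, and the placeholder OUTFILE are all hypothetical and would need to be adapted to your operating environment.

```sas
data _null_;
   length outname $ 200 region $ 4;               /* make room for long physical names */
   do region = 'east', 'west';
      outname = cats('/tmp/', region, '.txt');    /* hypothetical location */
      file outfile filevar=outname;               /* OUTFILE is only a placeholder */
      put 'Report for region ' region;
   end;
run;
```

Each time the value of OUTNAME changes, the next execution of the FILE statement should close the current file and open the newly named one.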
HEADER=label
defines a statement label that identifies a group of SAS statements that you want to execute each time SAS begins a new output page.
Restriction: The first statement after the label must be an executable statement. Thereafter you can use any SAS statement.
Restriction: Use the HEADER= option only when you write to print files.
Tip: To prevent the statements in this group from executing with each iteration of the DATA step, use two RETURN statements: one precedes the label and the other appears as the last statement in the group.
Featured in: Example 1 on page 1466

LINE=variable
defines a variable whose value is the current relative line number within the group of lines available to the output pointer. You supply the variable name; SAS automatically assigns the value.
Range: 1 to the value that is specified by the N= option or with the #n line pointer control. If neither is specified, the LINE= variable has a value of 1.
Tip: This variable, like automatic variables, is not written to the data set.
Tip: The value of the LINE= variable is set at the end of PUT statement execution to the number of the next available line.

LINESIZE=line-size
sets the maximum number of columns per line for reports and the maximum record length for data files.
Alias: LS=
Default: The default LINESIZE= value is determined by one of two options:
• the LINESIZE= system option when you write to a file that contains carriage-control characters or to the SAS log.
• the LRECL= option in the FILE statement when you write to a file.
Range: From 64 to the maximum logical record length that is allowed in your operating environment.
Operating Environment Information: The highest value allowed for LINESIZE= is dependent on your operating environment. For details, see the SAS documentation for your operating environment.
Interaction: If a PUT statement tries to write a line that is longer than the value that is specified by the LINESIZE= option, the action that is taken is determined by whether FLOWOVER, DROPOVER, or STOPOVER is in effect. By default (FLOWOVER), SAS writes the line as two or more separate records.
Comparisons: LINESIZE= tells SAS how much of the line to use. LRECL= specifies the physical record length of the file.
See Also: LRECL= on page 1461, DROPOVER on page 1457, FLOWOVER on page 1459, and STOPOVER on page 1463
Featured in: Example 6 on page 1468

LINESLEFT=variable
defines a variable whose value is the number of lines left on the current page. You supply the variable name; SAS assigns the value of the number of lines left on the current page to that variable. The value of the LINESLEFT= variable is set at the end of PUT statement execution.
Alias: LL=
Tip: This variable, like automatic variables, is not written to the data set.
Featured in: Example 2 on page 1467
LRECL=logical-record-length
specifies the logical record length of the output file.
Operating Environment Information: Values for logical-record-length are dependent on the operating environment. For details, see the SAS documentation for your operating environment.
Default: If you omit the LRECL= option, SAS chooses a value based on the operating environment's file characteristics.
Comparisons: LINESIZE= tells SAS how much of the line to use; LRECL= specifies the physical line length of the file.
Interaction: Alternatively, you can specify a global logical record length by using the LRECL= system option. See “LRECL= System Option” on page 1884.
See Also: LINESIZE= on page 1460, PAD on page 1462, and PAGESIZE= on page 1462

MOD
writes the output lines after any existing lines in the file.
Default: OLD
Restriction: MOD is not accepted under all operating environments.
Operating Environment Information: For more information, see the SAS documentation for your operating environment.
Restriction: Do not use the MOD option with any ODS destination other than the Listing destination. Otherwise, you might receive unexpected output.
See Also: OLD on page 1462

N=available-lines
specifies the number of lines that you want available to the output pointer in the current iteration of the DATA step. Available-lines can be expressed as a number (n) or as the keyword PAGESIZE or PS.
n
specifies the number of lines that are available to the output pointer. The system can move back and forth between the number of lines that are specified while composing them before moving on to the next set.
PAGESIZE
specifies that the entire page is available to the output pointer.
Alias: PS
Restriction: N=PAGESIZE is valid only when output is printed.
Restriction: If the current output file is a file that is to be printed, available-lines must have a value of either 1 or PAGESIZE.
Interactions: There are two ways to control the number of lines available to the output pointer:
• the N= option
• the #n line pointer control in a PUT statement.
Interaction: If you omit the N= option and no # pointer controls are used, one line is available; that is, by default, N=1. If N= is not used but there are # pointer controls, N= is assigned the highest value that is specified for a # pointer control in any PUT statement in the current DATA step.
Tip: Setting N=PAGESIZE enables you to compose a page of multiple columns one column at a time.
Featured in: Example 3 on page 1467
ODS <=(ODS-suboptions)>
specifies to use the Output Delivery System to format the output from a DATA step. It defines the structure of the data component and holds the results of the DATA step and binds that component to a table definition to produce an output object. ODS sends this object to all open ODS destinations, each of which formats the output appropriately. For information about the ODS-suboptions, see the “FILE Statement for ODS”. For general information about the Output Delivery System, see SAS Output Delivery System: User's Guide.
Default: If you omit the ODS suboptions, the DATA step uses a default table definition (base.datastep.table) that is stored in the SASHELP.TMPLMST template store. This definition defines two generic columns: one for character variables, and one for numeric variables. ODS associates each variable in the DATA step with one of these columns and displays the variables in the order in which they are defined in the DATA step. Without suboptions, the default table definition uses the variable's label as its column heading. If no label exists, the definition uses the variable's name as the column heading.
Requirement: The ODS option is valid only when you use the fileref PRINT in the FILE statement.
Restriction: You cannot use _FILE_=, FILEVAR=, HEADER=, and PAD with the ODS option.
Interaction: The DELIMITER= and DSD options have no effect on the ODS option. The FOOTNOTES | NOFOOTNOTES, LINESIZE, PAGESIZE, and TITLES | NOTITLES options have an effect only on the LISTING destination.

OLD
replaces the previous contents of the file.
Default: OLD
Restriction: OLD is not accepted under all operating environments.
Operating Environment Information: For details, see the SAS documentation for your operating environment.
See Also: MOD on page 1461
PAD | NOPAD
controls whether records written to an external file are padded with blanks to the length that is specified in the LRECL= option.
Default: NOPAD is the default when writing to a variable-length file; PAD is the default when writing to a fixed-length file.
Tip: PAD provides a quick way to create fixed-length records in a variable-length file.
See Also: LRECL= on page 1461

PAGESIZE=value
sets the number of lines per page for your reports.
Alias: PS=
Default: the value of the PAGESIZE= system option.
Range: The value can range from 15 to 32767.
Interaction: If any TITLE statements are currently defined, the lines they occupy are included in counting the number of lines for each page.
Tip: After the value of the PAGESIZE= option is reached, the output pointer advances to line 1 of a new page.
Tip: If you specify FILE LOG, the number of lines that are output on the first page is reduced by the number of lines in the SAS startup notes. For example, if PAGESIZE=20 and there are nine lines of SAS startup notes, only 11 lines are available for output on the first page.
See Also: “PAGESIZE= System Option” on page 1900
PRINT | NOPRINT
controls whether carriage-control characters are placed in the output lines.
Operating Environment Information: The carriage-control characters that are written to a file can be specific to the operating environment. For details, see the SAS documentation for your operating environment.
Restriction: When you write to a file, the value of the N= option must be either 1 or PAGESIZE.
Tip: The PRINT option is not necessary if you are using fileref PRINT.
Tip: If you specify FILE PRINT in an interactive SAS session, then the Output window interprets the form-feed control characters as page breaks, and blank lines that are output before the form feed are removed from the output. Writing the results from the Output window to a flat file produces a file without page break characters. If a file needs to contain the form-feed characters, then the FILE statement should include a physical file location and the PRINT option.

RECFM=record-format
specifies the record format of the output file.
Range: Values are dependent on the operating environment.
Operating Environment Information: For details, see the SAS documentation for your operating environment.

STOPOVER
stops processing the DATA step immediately if a PUT statement attempts to write a data item that exceeds the current line length. In such a case, SAS discards the data item that exceeds the current line length, writes the portion of the line that was built before the error occurred, and issues an error message.
Default: FLOWOVER
See Also: FLOWOVER on page 1459 and DROPOVER on page 1457
TITLES | NOTITLES controls the printing of the current title lines on the pages of files. When NOTITLES is omitted, or when TITLES is specified, SAS prints any titles that are currently defined. Alias: TITLE | NOTITLE Default: TITLES
_FILE_=variable
names a character variable that references the current output buffer of this FILE statement. You can use the variable in the same way as any other variable, even as the target of an assignment. The variable is automatically retained and initialized to blanks. Like automatic variables, the _FILE_= variable is not written to the data set.
Restriction: variable cannot be a previously defined variable. Make sure that the _FILE_= specification is the first occurrence of this variable in the DATA step. Do not set or change the length of the _FILE_= variable with the LENGTH or ATTRIB statements. However, you can attach a format to this variable with the ATTRIB or FORMAT statement.
Interaction: The maximum length of this character variable is the logical record length (LRECL) for the specified FILE statement. However, SAS does not open the file to determine the LRECL until just before the execution phase. Therefore, the designated size for this variable during the compilation phase is 32,767.
Tip: Modification of this variable directly modifies the FILE statement's current output buffer. Any subsequent PUT statement for this FILE statement outputs the contents of the modified buffer. The _FILE_= variable accesses only the current output buffer of the specified FILE statement even if you use the N= option to specify multiple output buffers.
Tip: To access the contents of the output buffer in another statement without using the _FILE_= option, use the automatic variable _FILE_.
Main Discussion: “Updating the _FILE_ Variable” on page 1465
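As a hedged sketch of how the _FILE_= variable exposes the output buffer, the following DATA step holds a line with a trailing @, rewrites the buffer in uppercase, and then releases it. The variable name OUTBUF and the message text are illustrative assumptions.

```sas
data _null_;
   file log _file_=outbuf;      /* OUTBUF must not be defined before this statement */
   put 'status: ok' @;          /* trailing @ holds the line in the output buffer */
   outbuf = upcase(outbuf);     /* modifying the variable modifies the buffer */
   put;                         /* release the (modified) line */
run;
```

The line that reaches the log should reflect the modified buffer rather than the text originally written by the first PUT statement.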
Operating Environment Options Operating Environment Information: For descriptions of operating-environment-specific options in the FILE statement, see the SAS documentation for your operating environment. 4
Details
Overview
By default, PUT statement output is written to the SAS log. Use the FILE statement to route this output to either the same external file to which procedure output is written or to a different external file. You can indicate whether carriage-control characters should be added to the file. See the PRINT | NOPRINT option on page 1463. You can use the FILE statement in conditional (IF-THEN) processing because it is executable. You can also use multiple FILE statements to write to more than one external file in a single DATA step.
Operating Environment Information: Using the FILE statement requires operating-environment-specific information. See the SAS documentation for your operating environment before you use this statement.
You can now use the Output Delivery System with the FILE statement to write DATA step results. This functionality is briefly discussed here. For details, see the “FILE Statement for ODS” in SAS Output Delivery System: User’s Guide.
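As a minimal sketch of the conditional use of the FILE statement described in the overview, the following DATA step selects the output file for each observation with IF-THEN/ELSE. The filerefs GIRLS and BOYS and the use of the TEMP device are illustrative assumptions, not part of the original text. Sashelp.Class is a sample data set that ships with SAS.

```sas
/* Hypothetical filerefs; the TEMP device avoids physical pathnames. */
filename girls temp;
filename boys  temp;

data _null_;
   set sashelp.class;
   /* FILE is executable, so it can be chosen conditionally. */
   if sex = 'F' then file girls;
   else file boys;
   /* PUT writes to the file named by the most recently executed FILE statement. */
   put name age height;
run;
```

Each observation is written to exactly one of the two files, which also illustrates writing to more than one external file in a single DATA step.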
Updating an External File in Place
You can use the FILE statement with the INFILE and PUT statements to update an external file in place, updating either an entire record or only selected fields within a record. Follow these guidelines:
• Always place the INFILE statement first.
• Specify the same fileref or physical filename in the INFILE and FILE statements.
• Use options that are common to both the INFILE and FILE statements in the INFILE statement. (Any such options that are used in the FILE statement are ignored.)
• Use the SHAREBUFFERS option in the INFILE statement to allow the INFILE and FILE statements to use the same buffer, which saves CPU time and enables you to update individual fields instead of entire records.
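The guidelines above can be sketched as follows. This is only an illustration under stated assumptions: the fileref PHONES, the physical name, and the record layout (a state code in columns 1-2 and a phone number in columns 5-16) are hypothetical.

```sas
/* Hypothetical fileref and physical name. */
filename phones 'your-external-file';

data _null_;
   /* INFILE comes first and carries the shared options. */
   infile phones sharebuffers;
   /* FILE names the same file, so PUT updates the record that was just read. */
   file phones;
   input state $ 1-2 phone $ 5-16;
   /* Replace the area code for selected exchanges (assumed layout). */
   if state = 'NC' and substr(phone, 5, 3) = '333' then
      phone = '910-' || substr(phone, 5, 8);
   /* Only columns 5-16 are rewritten; the rest of the record is unchanged. */
   put phone 5-16;
run;
```

Because SHAREBUFFERS makes the input and output buffers the same, the PUT statement needs to write only the field that changes.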
Accessing the Contents of the Output Buffer In addition to the _FILE_= variable, you can use the automatic _FILE_ variable to reference the contents of the current output buffer for the most recent execution of the FILE statement. This character variable is
automatically retained and initialized to blanks. Like other automatic variables, _FILE_ is not written to the data set. When you specify the _FILE_= option in a FILE statement, this variable is also indirectly referenced by the automatic _FILE_ variable. If the automatic _FILE_ variable is present and you omit _FILE_= in a particular FILE statement, then SAS creates an internal _FILE_= variable for that FILE statement. Otherwise, SAS does not create the _FILE_= variable for a particular FILE. During execution and at the point of reference, the maximum length of this character variable is the maximum length of the current _FILE_= variable. However, because _FILE_ merely references other variables whose lengths are not known until before the execution phase, the designated length is 32,767 during the compilation phase. For example, if you assign _FILE_ to a new variable whose length is undefined, the default length of the new variable is 32,767. You cannot use the LENGTH statement and the ATTRIB statement to set or override the length of _FILE_. You can use the FORMAT statement and the ATTRIB statement to assign a format to _FILE_.
Updating the _FILE_ Variable
Like other SAS variables, you can update the _FILE_ variable. The following two methods are available:
• Use _FILE_ in an assignment statement.
• Use a PUT statement.
You can update the _FILE_ variable by using an assignment statement that has the following form:
_FILE_ = <’string-in-quotation-marks’ | character-expression>;
The assignment statement updates the contents of the current output buffer and sets the buffer length to the length of ’string-in-quotation-marks’ or character-expression. However, using an assignment statement does not affect the current column pointer of the PUT statement. The next PUT statement for this FILE statement begins to update the buffer at column 1 or at the last known location when you use the trailing @ in the PUT statement. In the following example, the assignment statement updates the contents of the current output buffer. The column pointer of the PUT statement is not affected: file print; _file_ = ’_FILE_’; put ’This is PUT’;
SAS creates the following output: This is PUT In this example, file print; _file_ = ’This is from FILE, sir.’; put @14 ’both’;
SAS creates the following output: This is from both, sir. You can also update the _FILE_ variable by using a PUT statement. The PUT statement updates the _FILE_ variable because the PUT statement formats data in the output buffer and _FILE_ points to that buffer. However, by default SAS clears the output buffers after a PUT statement executes and outputs the current record (or N= block of records). Therefore, if you want to examine or further modify the contents of _FILE_ before it is output, include a trailing @ or @@ in any PUT statement (when N=1). For other values of N=, use a trailing @ or @@ in any PUT statement where the last line pointer location is on the last record of the record block. In the following example, when N=1 file ABC; put ’Something’ @;
Y = _file_||’ is here’; file ABC; put ’Nothing’ ; Y = _file_||’ is here’;
Y is first assigned Something is here. Y is then assigned is here, because the second PUT statement has no trailing @ and SAS clears the output buffer after it writes the record. Any modification of _FILE_ directly modifies the current output buffer for the current FILE statement. The execution of any subsequent PUT statements for this FILE statement will output the contents of the modified buffer. _FILE_ only accesses the contents of the current output buffer for a FILE statement, even when you use the N= option to specify multiple buffers. You can access all the N= buffers, but you must use a PUT statement with the # line pointer control to make the desired buffer the current output buffer.
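The following sketch combines the two update methods: a PUT statement with a trailing @ builds the record, an assignment statement modifies the held buffer, and a null PUT statement releases it. The choice of Sashelp.Class, the OBS= subset, and the UPCASE function are assumptions made for illustration only.

```sas
data _null_;
   set sashelp.class(obs=3);
   file print;
   /* The trailing @ holds the record so that _FILE_ can be examined. */
   put name $8. +1 sex $1. @;
   /* The assignment replaces the contents of the current output buffer. */
   _file_ = upcase(_file_);
   /* A null PUT writes the modified buffer and releases the line. */
   put;
run;
```

Without the trailing @, the first PUT statement would write the record and clear the buffer before the assignment statement could change it.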
Comparisons
• The FILE statement specifies the output file for PUT statements. The INFILE statement specifies the input file for INPUT statements.
• Both the FILE and INFILE statements allow you to use options that provide SAS with additional information about the external file being used.
• In the Program Editor, Log, and Output windows, the FILE command specifies an external file and writes the contents of the window to the file.
Examples
Example 1: Executing Statements When Beginning a New Page
This DATA step illustrates how to use the HEADER= option:
• Write a report. Use DATA _NULL_ to write a report rather than create a data set.
   data _null_;
      set sprint;
      by dept;
• Route output to the SAS output window. Point to the header information. The PRINT fileref routes output to the same location as procedure output. HEADER= points to the label that precedes the statements that create the header for each page:
      file print header=newpage;
• Start a new page for each department:
      if first.dept then put _page_;
      put @22 salesrep @34 salesamt;
• Write a header on each page. These statements execute each time a new page is begun. RETURN is necessary before the label and as the final statement in a labeled group:
      return;
      newpage:
         put @20 'Sales for 1989' /
             @20 dept=;
      return;
   run;
Example 2: Determining New Page by Lines Left on the Current Page
This DATA step demonstrates using the LINESLEFT= option to determine where the page break should occur, according to the number of lines left on the current page.
• Write a report. Use DATA _NULL_ to write a report rather than create a data set:
   data _null_;
      set info;
• Route output to the standard SAS output window. The PRINT fileref routes output to the same location as procedure output. LINESLEFT indicates that the variable REMAIN contains the number of lines left on the current page:
      file print linesleft=remain pagesize=20;
      put @5 name @30 phone @35 bldg @37 room;
• Begin a new page when there are fewer than seven lines left on the current page. Under this condition, PUT _PAGE_ begins a new page and positions the pointer at line 1:
      if remain<7 then put _page_;
   run;

FILENAME Statement
Syntax
(1) FILENAME fileref <device-type> ’external-file’ <ENCODING=’encoding-value’> <operating-environment-options>;
(2) FILENAME fileref <device-type> <operating-environment-options>;
(3) FILENAME fileref CLEAR | _ALL_ CLEAR;
(4) FILENAME fileref LIST | _ALL_ LIST;
Arguments fileref is any SAS name that you use when you assign a new fileref. When you disassociate a currently assigned fileref or when you list file attributes with the FILENAME statement, specify a fileref that was previously assigned with a FILENAME statement or an operating environment-level command. Tip: The association between a fileref and an external file lasts only for the duration of the SAS session or until you change it or discontinue it by using another FILENAME statement. Change the fileref for a file as often as you want. ’external-file’ is the physical name of an external file. The physical name is the name that is recognized by the operating environment. Operating Environment Information: For details about specifying the physical names of external files, see the SAS documentation for your operating environment. 4
Tip: Specify external-file when you assign a fileref to an external file. Tip: You can associate a fileref with a single file or with an aggregate file storage location.
ENCODING= ’encoding-value’ specifies the encoding to use when SAS is reading from or writing to an external file. The value for ENCODING= indicates that the external file has a different encoding from the current session encoding. When you read data from an external file, SAS transcodes the data from the specified encoding to the session encoding. When you write data to an external file, SAS transcodes the data from the session encoding to the specified encoding. For valid encoding values, see “Encoding Values in SAS Language Elements” in SAS National Language Support (NLS): Reference Guide. Default: SAS assumes that an external file is in the same encoding as the session encoding. Featured in: Example 5 on page 1475 and Example 6 on page 1476 device-type specifies the type of device or the access method that is used if the fileref points to an input or output device or location that is not a physical file: DISK
specifies that the device is a disk drive. Tip: When you assign a fileref to a file on disk, you are not required to specify DISK.
DUMMY
specifies that the output to the file is discarded. Tip: Specifying DUMMY can be useful for testing.
GTERM
indicates that the output device type is a graphics device that will receive graphics data.
PIPE
specifies an unnamed pipe. Note: Some operating environments do not support pipes.
PLOTTER
specifies an unbuffered graphics output device.
PRINTER
specifies a printer or printer spool file.
TAPE
specifies a tape drive.
TEMP
creates a temporary file that exists only as long as the filename is assigned. The temporary file can be accessed only through the logical name and is available only while the logical name exists. Restriction: Do not specify a physical pathname. If you do, SAS returns an error. Tip: Files manipulated by the TEMP device can have the same attributes and behave identically to DISK files.
TERMINAL
specifies the user’s terminal.
UPRINTER
specifies a Universal Printing printer definition name. Tip: If you do not specify the printer name in the FILENAME statement, the PRINTERPATH= system option controls which Universal Printer is used and the destination of the output.
Operating Environment Information: Additional specifications might be required when you specify some devices. See the SAS documentation for your operating
environment before specifying a value other than DISK. Values in addition to the ones listed here might be available in some operating environments.
CLEAR disassociates one or more currently assigned filerefs. Tip: Specify fileref to disassociate a single fileref. Specify _ALL_ to disassociate all currently assigned filerefs.
_ALL_ specifies that the CLEAR or LIST argument applies to all currently assigned filerefs.
LIST writes the attributes of one or more files to the SAS log. Interaction: Specify fileref to list the attributes of a single file. Specify _ALL_ to list the attributes of all files that have filerefs in your current session.
Options RECFM=record-format specifies the record format of the external file. Operating Environment Information: Values for record-format are dependent on the operating environment. For details, see the SAS documentation for your operating environment. 4
Operating Environment Options Operating environment options specify details, such as file attributes and processing attributes, that are specific to your operating environment. Operating Environment Information: For a list of valid specifications, see the SAS documentation for your operating environment. 4
Details Operating Environment Information Operating Environment Information: Using the FILENAME statement requires operating environment-specific information. See the SAS documentation for your operating environment before using this statement. Note also that commands are available in some operating environments that associate a fileref with a file and that break that association. 4
Definitions external file is a file that is created and maintained in the operating environment from which you need to read data, SAS programming statements, or autocall macros, or to which you want to write output. An external file can be a single file or an aggregate storage location that contains many individual external files. See Example 3 on page 1474. Operating Environment Information: Different operating environments call an aggregate grouping of files by different names, such as a directory, a MACLIB, or a partitioned data set. For details about specifying external files, see the SAS documentation for your operating environment. 4
fileref (a file reference name) is a shorthand reference to an external file. After you associate a fileref with an external file, you can use it as a shorthand reference for that file in SAS programming statements (such as INFILE, FILE, and %INCLUDE) and in other commands and statements in SAS software that access external files.
Reading Delimited Data from an External File
Whenever a text file originates outside the local encoding environment, it might be necessary to specify the ENCODING= option in either EBCDIC or ASCII environments. For example, when you read an EBCDIC text file on an ASCII platform, it is recommended that you specify the ENCODING= option in the FILENAME statement. However, if you use the DSD and DLM= options in the INFILE statement, the ENCODING= option is a requirement because these options require certain characters in the session encoding (such as quotation marks, commas, and blanks). The use of encoding-specific informats should be reserved for true binary files; that is, files that contain both character and noncharacter fields.
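As a hedged sketch of reading delimited data that is not in the session encoding, the following statements pair ENCODING= in the FILENAME statement with DSD and DLM= in the INFILE statement. The fileref EXTCSV, the physical name, the UTF-8 encoding, the variable names, and the presence of a header row are all assumptions for illustration.

```sas
/* Hypothetical comma-delimited file that is stored in UTF-8. */
filename extcsv 'your-delimited-file' encoding="utf-8";

data work.cars;
   /* DSD and DLM= are coded in the INFILE statement.            */
   /* FIRSTOBS=2 skips a header row that the file is assumed to have. */
   infile extcsv dsd dlm=',' firstobs=2;
   length Make $ 20 Model $ 40;
   input Make $ Model $ Year;
run;
```

SAS transcodes each record to the session encoding before it scans for delimiters and quotation marks, which is why the encoding must be known when delimited input is read.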
(1) Associating a Fileref with an External File
Use this form of the FILENAME statement to associate a fileref with an external file on disk:
FILENAME fileref ’external-file’ <operating-environment-options>;
To associate a fileref with a file other than a disk file, you might need to specify a device type, depending on your operating environment, as shown in this form:
FILENAME fileref <device-type> <operating-environment-options>;
The association between a fileref and an external file lasts only for the duration of the SAS session or until you change it or discontinue it with another FILENAME statement. Change the fileref for a file as often as you want.
To specify a character-set encoding, use the following form:
FILENAME fileref ’external-file’ ENCODING=’encoding-value’ <operating-environment-options>;
(2) Associating a Fileref with a Terminal, Printer, Universal Printer, or Plotter
To associate a fileref with an output device, use this form:
FILENAME fileref device-type <operating-environment-options>;
(3) Disassociating a Fileref from an External File
To disassociate a fileref from a file, use a FILENAME statement, specifying the fileref and the CLEAR option.
(4) Writing File Attributes to the SAS Log
Use a FILENAME statement to write the attributes of one or more external files to the SAS log. Specify fileref to list the attributes of one file; use _ALL_ to list the attributes of all the files that have been assigned filerefs in your current SAS session.
FILENAME fileref LIST | _ALL_ LIST;
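The assigning, listing, and clearing forms of the FILENAME statement can be sketched together as follows. The fileref SCRATCH is a hypothetical name, and the TEMP device is used only so that the sketch needs no physical pathname.

```sas
/* Assign a fileref; TEMP takes no physical pathname. */
filename scratch temp;

data _null_;
   file scratch;
   put 'temporary line';
run;

/* Write the attributes of one fileref to the SAS log. */
filename scratch list;

/* Write the attributes of every assigned fileref to the SAS log. */
filename _all_ list;

/* Disassociate the fileref; the temporary file goes away with it. */
filename scratch clear;
```

The attributes that LIST reports depend on the operating environment.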
Comparisons The FILENAME statement assigns a fileref to an external file. The LIBNAME statement assigns a libref to a SAS data set or to a DBMS file that can be accessed like a SAS data set.
Examples
Example 1: Specifying a Fileref or a Physical Filename
You can specify an external file either by associating a fileref with the file and then specifying the fileref or by specifying the physical filename in quotation marks: filename sales ’your-input-file’; data jansales; /* specifying a fileref */ infile sales; input salesrep $20. +6 jansales febsales marsales; run; data jansales; /* physical filename in quotation marks */ infile ’your-input-file’; input salesrep $20. +6 jansales febsales marsales; run;
Example 2: Using a FILENAME and a LIBNAME Statement This example reads data from a file that has been associated with the fileref GREEN and creates a permanent SAS data set stored in a SAS library that has been associated with the libref SAVE. filename green ’your-input-file’; libname save ’SAS-library’; data save.vegetable; infile green; input lettuce cabbage broccoli; run;
Example 3: Associating a Fileref with an Aggregate Storage Location If you associate a fileref with an aggregate storage location, use the fileref, followed in parentheses by an individual filename, to read from or write to any of the individual external files that are stored there. Operating Environment Information: Some operating environments allow you to read from but not write to members of aggregate storage locations. For details, see the SAS documentation for your operating environment. 4 In this example, each DATA step reads from an external file (REGION1 and REGION2, respectively) that is stored in the same aggregate storage location and that is referenced by the fileref SALES. filename sales ’aggregate-storage-location’; data total1; infile sales(region1); input machine $ jansales febsales marsales; totsale=jansales+febsales+marsales;
run; data total2; infile sales(region2); input machine $ jansales febsales marsales; totsale=jansales+febsales+marsales; run;
Example 4: Routing PUT Statement Output
In this example, the FILENAME statement associates the fileref OUT with a printer that is specified with an operating environment-dependent option. The FILE statement directs PUT statement output to that printer. filename out printer operating-environment-option; data sales; file out print; input salesrep $20. +6 jansales febsales marsales; put _infile_; datalines; Jones, E. A. 124357 155321 167895 Lee, C. R. 111245 127564 143255 Desmond, R. T. 97631 101345 117865 ;
You can use the FILENAME and FILE statements to route PUT statement output to several devices during the same session. To route PUT statement output to your display monitor, use the TERMINAL option in the FILENAME statement, as shown here: filename show terminal; data sales; file show; input salesrep $20. +6 jansales febsales marsales; put _infile_; datalines; Jones, E. A. 124357 155321 167895 Lee, C. R. 111245 127564 143255 Desmond, R. T. 97631 101345 117865 ;
Example 5: Specifying an Encoding When Reading an External File
This example creates a SAS data set from an external file. The external file is in UTF-8 character-set encoding, and the current SAS session is in the Wlatin1 encoding. By default, SAS assumes that an external file is in the same encoding as the session encoding, which causes the character data to be written to the new SAS data set incorrectly. To tell SAS what encoding to use when reading the external file, specify the ENCODING= option. When you tell SAS that the external file is in UTF-8, SAS then transcodes the external file from UTF-8 to the current session encoding when writing to the new SAS data set. Therefore, the data is written to the new data set correctly in Wlatin1. libname myfiles
’SAS-library’;
filename extfile ’external-file’ encoding="utf-8";
data myfiles.unicode; infile extfile; input Make $ Model $ Year; run;
Example 6: Specifying an Encoding When Writing to an External File This example creates an external file from a SAS data set. The current session encoding is Wlatin1, but the external file’s encoding needs to be UTF-8. By default, SAS writes the external file using the current session encoding. To tell SAS what encoding to use when writing data to the external file, specify the ENCODING= option. When you tell SAS that the external file is to be in UTF-8 encoding, SAS then transcodes the data from Wlatin1 to the specified UTF-8 encoding when writing to the external file. libname myfiles
’SAS-library’;
filename outfile ’external-file’ encoding="utf-8"; data _null_; set myfiles.cars; file outfile; put Make Model Year; run;
See Also Statements: “FILE Statement” on page 1454 “%INCLUDE Statement” on page 1534 “INFILE Statement” on page 1541 “FILENAME Statement, CATALOG Access Method” on page 1476 “FILENAME Statement, EMAIL (SMTP) Access Method” on page 1482 “FILENAME Statement, FTP Access Method” on page 1492 “FILENAME Statement, SOCKET Access Method” on page 1509 “FILENAME Statement, SFTP Access Method” on page 1503 “FILENAME Statement, URL Access Method” on page 1513 “LIBNAME Statement” on page 1606 SAS Windowing Interface Commands: FILE and INCLUDE
FILENAME Statement, CATALOG Access Method Enables you to reference a SAS catalog as an external file. Valid:
anywhere
Category:
Data Access
Syntax FILENAME fileref CATALOG ’catalog’ <catalog-options>;
Arguments fileref is a valid fileref. CATALOG specifies the access method that enables you to reference a SAS catalog as an external file. You can then use any SAS commands, statements, or procedures that can access external files to access a SAS catalog. Tip: This access method makes it possible for you to invoke an autocall macro directly from a SAS catalog. Tip: With this access method you can read any type of catalog entry, but you can write only to entries of type LOG, OUTPUT, SOURCE, and CATAMS. Tip: If you want to access an entire catalog (instead of a single entry), you must specify its two-level name in the catalog parameter. Alias: LIBRARY ’catalog’ is a valid two-, three-, or four-part SAS catalog name, where the parts represent library.catalog.entry.entrytype. Default: The default entry type is CATAMS. Restriction: The CATAMS entry type is used only by the CATALOG access method. The CPORT and CIMPORT procedures do not support this entry type.
Catalog Options Catalog-options can be any of the following: LRECL=lrecl where lrecl is the maximum record length for the data in bytes. Default: For input, the actual LRECL value of the file is the default. For output, the default is 132. Interaction: Alternatively, you can specify a global logical record length by using the LRECL= system option. See “LRECL= System Option” on page 1884. RECFM=recfm where recfm is one of four record formats:
F
is fixed-record format. Data is transferred in image (binary) mode.
P
is print format.
S
is stream-record format. Data is transferred in image (binary) mode. Interaction: The amount of data that is read is controlled by the value of the NBYTE= variable in the INFILE statement. The NBYTE= option specifies a variable that is equal to the amount of data to be read. This amount must be less than or equal to LRECL. See Also: The NBYTE= option on page 1548 in the INFILE statement.
V
is variable-record format (the default). In this format, records have varying lengths, and they are separated by newlines. Data is transferred in image (binary) mode.
Default: V
DESC=description where description is a text description of the catalog. MOD specifies to append to the file. Default: If you omit MOD, the file is replaced.
Details The CATALOG access method in the FILENAME statement enables you to reference a SAS catalog as an external file. You can then use any SAS commands, statements, or procedures that can access external files to access a SAS catalog. As an example, the catalog access method makes it possible for you to invoke an autocall macro directly from a SAS catalog. See Example 5 on page 1479. With the CATALOG access method you can read any type of catalog entry, but you can write to only entries of type LOG, OUTPUT, SOURCE, and CATAMS. If you want to access an entire catalog (instead of a single entry), you must specify its two-level name in the catalog argument.
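As a rough sketch of the MOD catalog option, the following statements write one line to a SOURCE entry, reassign the fileref with MOD, append a second line, and then read the entry back. The entry name WORK.DEMO.NOTES.SOURCE and the fileref NOTES are hypothetical, and placing MOD after the catalog name follows the syntax form shown for this access method.

```sas
/* Hypothetical entry in the WORK library. */
filename notes catalog 'work.demo.notes.source';

data _null_;
   file notes;
   put '/* first line */';
run;

/* Reassign the fileref with MOD so that the next step appends. */
filename notes catalog 'work.demo.notes.source' mod;

data _null_;
   file notes;
   put '/* appended line */';
run;

/* Read the entry back and echo each record to the SAS log. */
data _null_;
   infile notes;
   input;
   put _infile_;
run;
```

SOURCE is one of the entry types that this access method can write. If MOD is omitted in the second assignment, the entry is replaced instead of extended.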
Examples
Example 1: Using %INCLUDE with a Catalog Entry This example submits the source program that is contained in SASUSER.PROFILE.SASINP.SOURCE: filename fileref1 catalog ’sasuser.profile.sasinp.source’; %include fileref1;
Example 2: Using %INCLUDE with Several Entries in a Single Catalog
This example submits the source code from three entries in the catalog MYLIB.INCLUDE. When no entry type is specified, the default is CATAMS.
filename dir catalog 'mylib.include';
%include dir(mem1);
%include dir(mem2);
%include dir(mem3);
Example 3: Reading and Writing a CATAMS Entry
This example uses a DATA step to write data to a CATAMS entry, and another DATA step to read it back in: filename mydata catalog ’sasuser.data.update.catams’; /* write data to catalog entry update.catams */ data _null_; file mydata; do i=1 to 10; put i; end; run;
/* read data from catalog entry update.catams */ data _null_; infile mydata; input; put _INFILE_; run;
Example 4: Writing to a SOURCE Entry
This example writes code to a catalog SOURCE entry and then submits it for processing: filename incit catalog ’sasuser.profile.sasinp.source’; data _null_; file incit; put ’proc options; run;’; run; %include incit;
Example 5: Executing an Autocall Macro from a SAS Catalog
If you store an autocall macro in a SOURCE entry in a SAS catalog, you can point to that entry and invoke the macro in a SAS job. Use these steps:
1 Store the source code for the macro in a SOURCE entry in a SAS catalog. The name of the entry is the macro name.
2 Use a LIBNAME statement to assign a libref to that SAS library.
3 Use a FILENAME statement with the CATALOG specification to assign a fileref to the catalog: libref.catalog.
4 Use the SASAUTOS= option and specify the fileref so that the system knows where to locate the macro. Also set MAUTOSOURCE to activate the autocall facility.
5 Invoke the macro as usual: %macro-name.
This example points to a SAS catalog named MYSAS.MYCAT. It then invokes a macro named REPORTS, which is stored as a SAS catalog entry named MYSAS.MYCAT.REPORTS.SOURCE: libname mysas ’SAS-library’; filename mymacros catalog ’mysas.mycat’; options sasautos=mymacros mautosource; %reports
See Also Statements: “FILENAME Statement” on page 1470 “FILENAME Statement, EMAIL (SMTP) Access Method” on page 1482 “FILENAME Statement, FTP Access Method” on page 1492 “FILENAME Statement, SOCKET Access Method” on page 1509 “FILENAME Statement, SFTP Access Method” on page 1503 “FILENAME Statement, URL Access Method” on page 1513
FILENAME, CLIPBOARD Access Method Enables you to read text data from and write text data to the clipboard on the host computer. Valid:
anywhere
Category:
Data Access
Syntax FILENAME fileref CLIPBRD <BUFFER=paste-buffer-name>;
Arguments
fileref
is a valid fileref. CLIPBRD
specifies the access method that enables you to read data from or write data to the clipboard on the host computer. BUFFER=paste-buffer-name
creates and names the paste buffer. You can create any number of paste buffers by naming them with the BUFFER= argument in the STORE command.
Details The FILENAME statement, CLIPBOARD Access Method enables you to share data within SAS and between SAS and applications other than SAS.
Comparisons The STORE command copies marked text in the current window and stores the copy in a paste buffer. You can also copy data to the clipboard by using the Explorer pop-up menu item Copy Contents to Clipboard.
Examples Example 1: Using ODS to Write a Data Set as HTML to the Clipboard
This example uses the Sashelp.Air data set as the input file. The ODS is used to write the data set in HTML format to the clipboard. filename _temp_ clipbrd; ods noresults; ods listing close; ods html file=_temp_ rs=none style=minimal; proc print data=Sashelp.’Air’N noobs; run; ods html close; ods results; ods listing; filename _temp_;
Example 2: Using the DATA Step to Write a Data Set As Comma-separated Values to the Clipboard
This example uses the Sashelp.Air data set as the input file. The data is written in the DATA step as comma-separated values to the clipboard.
filename _temp1_ temp;
filename _temp2_ clipbrd;
proc contents data=Sashelp."Air"N out=info noprint;
proc sort data=info;
   by npos;
run;
data _null_;
   set info end=eof;
   file _temp1_ dsd;
   put name @@;
   if _n_=1 then do;
      call execute("data _null_; set Sashelp.""Air""N; file _temp1_ dsd mod; put");
   end;
   call execute(trim(name));
   if eof then call execute('; run;');
run;
data _null_;
   infile _temp1_;
   file _temp2_;
   input;
   put _infile_;
run;
filename _temp1_ clear;
filename _temp2_ clear;
Example 3: Using the DATA Step to Write Text to the Clipboard
This example writes three lines to the clipboard.
filename clippy clipbrd;
data _null_;
file clippy; put ’Line 1’; put ’Line 2’; put ’Line 3’; run;
Example 4: Using the DATA Step to Retrieve Text from the Clipboard
This example writes three lines to the clipboard and then retrieves them. filename clippy clipbrd; data _null_; file clippy; put ’Line 1’; put ’Line 2’; put ’Line 3’; run; data _null_; infile clippy; input; put _infile_; run;
See Also Command: The STORE command in the Base SAS Help and Documentation.
FILENAME Statement, EMAIL (SMTP) Access Method Enables you to send electronic mail programmatically from SAS using the SMTP (Simple Mail Transfer Protocol) e-mail interface. Valid:
Anywhere
Category:
Data Access
Syntax FILENAME fileref EMAIL <’address’> <email-options>;
Arguments
fileref
is a valid file reference. The fileref is a name that is temporarily assigned to an external file or to a device type. Note that the fileref cannot exceed eight characters.
EMAIL
specifies the EMAIL device type, which provides the access method that enables you to send electronic mail programmatically from SAS. In order to use SAS to send a message to an SMTP server, you must enable SMTP e-mail. For more information, see “The SMTP E-Mail Interface” in SAS Language Reference: Concepts. ’address’
is the e-mail address to which you want to send the message. You must enclose the address in quotation marks. Specifying an address as a FILENAME statement argument is optional if you specify the TO= e-mail option or the PUT statement !EM_TO! directive, which will override an address specification.
E-mail Options
You can use any of the following e-mail options in the FILENAME statement to specify attributes for the electronic message.
Note: You can also specify these options in the FILE statement. E-mail options that you specify in the FILE statement override any corresponding e-mail options that you specified in the FILENAME statement.
ATTACH='filename.ext' | ATTACH=('filename.ext' attachment-options)
specifies the physical name of the file or files to be attached to the message and any options to modify attachment specifications. The physical name is the name that is recognized by the operating environment. Enclose the physical name in quotation marks. To attach more than one file, enclose the group of files in parentheses, enclose each file in quotation marks, and separate each with a space. Here are examples:
attach="/u/userid/opinion.txt"
attach=('C:\Status\June2001.txt' 'C:\Status\July2001.txt')
attach="user.misc.pds(member)"
The attachment-options include the following:
CONTENT_TYPE='content/type'
specifies the content type for the attached file. You must enclose the value in quotation marks. If you do not specify a content type, SAS tries to determine the correct content type based on the filename. For example, if you do not specify a content type, a filename of home.html is sent with a content type of text/html.
Aliases: CT= and TYPE=
Default: If SAS cannot determine a content type based on the filename and extension, the default value is text/plain.
ENCODING='encoding-value'
specifies the text encoding of the attachment that is read into SAS. You must enclose the value in quotation marks.
See Also: “Encoding Values in SAS Language Elements” in the SAS National Language Support (NLS): Reference Guide
EXTENSION='extension'
specifies a different file extension to be used for the specified attachment. You must enclose the value in quotation marks. This extension is used by the recipient’s e-mail program for selecting the appropriate utility to use for displaying the attachment. For example, the following results in the attachment home.html being received as index.htm:
attach=("home.html" name="index" ext="htm")
Note: If you specify extension="", the specified attachment will have no file extension.
Alias: EXT=
NAME='filename'
specifies a different name to be used for the specified attachment. You must enclose the value in quotation marks. For example, the following results in the attachment home.html being received as index.html:
attach=("home.html" name="index")
OUTENCODING='encoding-value'
specifies the resulting text encoding for the attachment to be sent. You must enclose the value in quotation marks.
Restriction: Do not specify EBCDIC encoding values, because the SMTP e-mail interface does not support EBCDIC.
See Also: “Encoding Values in SAS Language Elements” in the SAS National Language Support (NLS): Reference Guide
BCC='bcc-address'
specifies the recipient or recipients that you want to receive a blind copy of the electronic mail. Individuals that are listed in the BCC field will receive a copy of the e-mail. The BCC field does not appear in the e-mail header, so these e-mail addresses cannot be viewed by other recipients. If a BCC address contains more than one word, then enclose it in quotation marks. To specify more than one address, you must enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate each address with a space. To specify a real name as well as an address, enclose the address in angle brackets (< >). Here are examples:
bcc="[email protected]"
bcc=("[email protected]" "[email protected]")
bcc="Joe Smith "
CC='cc-address'
specifies the recipient or recipients to receive a copy of the e-mail message. You must enclose an address in quotation marks. To specify more than one address, enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate each address with a space. To specify a real name as well as an address, enclose the address in angle brackets (< >). Here are examples:
cc='[email protected]'
cc=("[email protected]" "[email protected]")
cc="Joe Smith "
CONTENT_TYPE='content/type'
specifies the content type for the message body. If you do not specify a content type, SAS tries to determine the correct content type. You must enclose the value in quotation marks.
Aliases: CT= and TYPE=
Default: text/plain
ENCODING='encoding-value'
specifies the text encoding to use for the message body. For valid encoding values, see “Encoding Values in SAS Language Elements” in the SAS National Language Support (NLS): Reference Guide.
FROM='from-address'
specifies the e-mail address of the author of the message that is being sent. The default value for FROM= is the e-mail address of the user who is running SAS. Specify this option, for example, when the person who is sending the message from SAS is not the author. You must enclose an address in quotation marks. You can specify only one e-mail address. To specify the author’s real name along with the address, enclose the address in angle brackets (< >). Here are examples:
from='[email protected]'
from="Brad Martin "
Requirement: The FROM option is required if the EMAILFROM system option is set.
LRECL=lrecl
where lrecl is the logical record length of the data.
Default: 256
Interaction: Alternatively, you can specify a global logical record length by using the “LRECL= System Option” on page 1884.
IMPORTANCE='LOW' | 'NORMAL' | 'HIGH'
specifies the priority of the e-mail message. You must enclose the value in quotation marks. You can specify the priority in the language that matches your session encoding. However, SAS will translate the priority into English because the actual message header must contain English in accordance with the RFC-2076 specification (Common Internet Message Headers). Here are examples:
filename inventory email '[email protected]' importance='high';
filename inventory email '[email protected]' importance='hoch';
Default: NORMAL
REPLYTO='replyto-address'
specifies the e-mail address or addresses that will receive replies. You must enclose an address in quotation marks. To specify more than one address, enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate each address with a space. To specify a real name along with an address, enclose the address in angle brackets (< >). Here are examples:
replyto='[email protected]'
replyto=('[email protected]' '[email protected]')
replyto="Hiroshi Mori "
SUBJECT=subject
specifies the subject of the message. If the subject contains special characters or more than one word (that is, it contains at least one blank space), you must enclose the text in quotation marks. Here are examples:
subject=Sales
subject="June Sales Report"
Note: If you do not enclose a one-word subject in quotation marks, it is converted to uppercase.
TO='to-address'
specifies the primary recipient or recipients of the e-mail message. You must enclose the address in quotation marks. To specify more than one address, enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate each address with a space. To specify a real name as well as an address, enclose the address in angle brackets (< >). Here are examples:
to='[email protected]'
to=("[email protected]" "[email protected]")
to="Joe Smith "
Tip: Specifying TO= overrides the 'address' argument.
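The e-mail options described above can be combined in a single FILENAME statement. The following is a minimal sketch, not a definitive pattern: every address, the attachment path, and the fileref name are hypothetical placeholders, and it assumes that SMTP e-mail has already been enabled for the session.

```sas
/* Sketch only: all addresses and the file path are hypothetical. */
filename report email
   to=("first.recipient@example.com" "second.recipient@example.com")
   cc="Team Lead <lead@example.com>"          /* real name plus address   */
   from="Report Owner <owner@example.com>"    /* only one FROM address    */
   replyto="owner@example.com"
   subject="June Sales Report"                /* quoted: more than a word */
   importance='HIGH'
   attach=("/tmp/june_sales.csv" content_type="text/csv"
           name="june_sales" ext="csv");      /* attachment-options       */

data _null_;
   file report;
   put 'The June sales figures are attached.';
run;
```

Because TO= is specified, no 'address' argument is needed after the EMAIL keyword.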
PUT Statement Syntax for EMAIL (SMTP) Access Method
In the DATA step, after using the FILE statement to define your e-mail fileref as the output destination, use PUT statements to define the body of the message. For example,
filename mymail email '[email protected]' subject='Sending Email';
data _null_;
   file mymail;
   put 'Hi';
   put 'This message is sent from SAS...';
run;
You can also use PUT statements to specify e-mail directives that override the attributes of your message (the e-mail options like TO=, CC=, SUBJECT=, CONTENT_TYPE=, ATTACH=), or to perform actions such as send, end abnormally, or start a new message. Specify only one directive in each PUT statement; each PUT statement can contain only the text that is associated with the directive that it specifies. The directives that change the attributes of a message are as follows:
'!EM_ATTACH! 'filename.ext' | ATTACH=('filename.ext' attachment-options)'
replaces the physical name of the file or files to be attached to the message and any options to modify attachment specifications. The physical name is the name that is recognized by the operating environment. The directive must be enclosed in quotation marks, and the physical name must be enclosed in quotation marks. To attach more than one file, enclose the group of files in parentheses, enclose each file in quotation marks, and separate each with a space. Here are examples:
put '!em_attach! /u/userid/opinion.txt';
put '!em_attach! ("C:\Status\June2001.txt" "C:\Status\July2001.txt")';
put '!em_attach! user.misc.pds(member)';
The attachment-options include the following:
CONTENT_TYPE='content/type'
specifies the content type for the attached file. You must enclose the value in quotation marks. If you do not specify a content type, SAS tries to determine the correct content type based on the filename. For example, if you do not specify a content type, a filename of home.html is sent with a content type of text/html.
Aliases: CT= and TYPE=
Default: If SAS cannot determine a content type based on the filename and extension, the default value is text/plain.
ENCODING='encoding-value'
specifies the text encoding to use for the attachment as it is read into SAS. You must enclose the value in quotation marks. For valid encoding values, see “Encoding Values in SAS Language Elements” in the SAS National Language Support (NLS): Reference Guide.
EXTENSION='extension'
specifies a different file extension to be used for the specified attachment. You must enclose the value in quotation marks. This extension is used by the recipient’s e-mail program for selecting the appropriate utility to use for displaying the attachment. For example, the following results in the attachment home.html being received as index.htm:
put '!em_attach! ("home.html" name="index" ext="htm")';
Alias: EXT=
Default: TXT
NAME='filename'
specifies a different name to be used for the specified attachment. You must enclose the value in quotation marks. For example, the following results in the attachment home.html being received as index.html:
put '!em_attach! ("home.html" name="index")';
OUTENCODING='encoding-value'
specifies the resulting text encoding for the attachment to be sent. You must enclose the value in quotation marks.
Restriction: Do not specify EBCDIC encoding values, because the SMTP e-mail interface does not support EBCDIC.
See Also: “Encoding Values in SAS Language Elements” in the SAS National Language Support (NLS): Reference Guide
'!EM_BCC! bcc-address'
replaces the current blind-copied recipient address or addresses with the addresses that you specify. These recipients are not visible to the recipients in the !EM_TO! or !EM_CC! addresses. If you want to specify more than one address, then you must enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate each address with a space. To specify real names along with addresses, enclose the address in angle brackets (< >). Here are examples:
put '!em_bcc! [email protected]';
put '!em_bcc! ("[email protected]" "[email protected]")';
put '!em_bcc! Joe Smith ';
'!EM_CC! cc-address'
replaces the current copied recipient address or addresses. The directive must be enclosed in quotation marks. To specify more than one address, enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate each address with a space. To specify real names along with addresses, enclose the address in angle brackets (< >). Here are examples:
put '!em_cc! [email protected]';
put '!em_cc! ("[email protected]" "[email protected]")';
put '!em_cc! Joe Smith ';
'!EM_FROM! from-address'
replaces the current address of the author of the message being sent, which could be either the default or the one specified by the FROM= e-mail option. The directive must be enclosed in quotation marks. You can specify only one e-mail address. To specify the author’s real name along with the address, enclose the address in angle brackets (< >). Here are examples:
put '!em_from! [email protected]';
put '!em_from! Brad Martin ';
'!EM_IMPORTANCE! LOW | NORMAL | HIGH'
specifies the priority of the e-mail message. The directive must be enclosed in quotation marks. You can specify the priority in the language that matches your session encoding. However, SAS will translate the priority into English because the actual message header must contain English in accordance with the RFC-2076 specification (Common Internet Message Headers). Here are examples:
put '!em_importance! high';
put '!em_importance! haut';
Default: NORMAL
'!EM_REPLYTO! replyto-address'
replaces the current address or addresses that will receive replies. The directive must be enclosed in quotation marks. To specify more than one address, enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate each address with a space. To specify a real name along with an address, enclose the address in angle brackets (< >). Here are examples:
put '!em_replyto! [email protected]';
put '!em_replyto! ("[email protected]" "[email protected]")';
put '!em_replyto! Hiroshi Mori ';
'!EM_SUBJECT! subject'
replaces the current subject of the message. The directive must be enclosed in quotation marks. If the subject contains special characters or more than one word (that is, it contains at least one blank space), you must enclose the text in quotation marks. Here are examples:
put '!em_subject! Sales';
put '!em_subject! "June Sales Report"';
'!EM_TO! to-address'
replaces the current primary recipient address or addresses. The directive must be enclosed in quotation marks. To specify more than one address, enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate each address with a space. To specify a real name along with an address, enclose the address in angle brackets (< >). Here are examples:
put '!em_to! [email protected]';
put '!em_to! ("[email protected]" "[email protected]")';
put '!em_to! Joe Smith ';
Tip: Specifying !EM_TO! overrides the 'address' argument and the TO= e-mail option.
Here are the directives that perform actions:
'!EM_SEND!'
sends the message with the current attributes. By default, SAS sends a message when the fileref is closed. The fileref closes when the next FILE statement is encountered or the DATA step ends. If you use this directive, SAS sends the message when it encounters the directive, and again at the end of the DATA step. This directive is useful for writing DATA step programs that conditionally send messages or use a loop to send multiple messages.
'!EM_ABORT!'
abnormally ends the current message. You can use this directive to stop SAS from automatically sending the message at the end of the DATA step. By default, SAS sends a message for each FILE statement.
'!EM_NEWMSG!'
clears all attributes of the current message that were set using PUT statement directives.
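The action directives can be combined to send a message only when a condition holds. The following is a minimal sketch under stated assumptions: the recipient address is a hypothetical placeholder, and n_errors is a hard-coded illustrative value that a real program would derive from its own data. When the condition is true, !EM_SEND! sends the message explicitly; the final !EM_ABORT! then suppresses the automatic send at the end of the DATA step, so at most one message goes out.

```sas
/* Sketch only: the address and the n_errors value are hypothetical. */
filename alert email "admin@example.com" subject="Nightly job status";

data _null_;
   file alert;
   n_errors = 3;                /* illustrative; normally computed */
   if n_errors > 0 then do;
      put 'The nightly job finished with errors. Count: ' n_errors;
      put '!EM_SEND!';          /* send now, with current attributes */
      put '!EM_NEWMSG!';        /* clear directive-set attributes    */
   end;
   put '!EM_ABORT!';            /* prevent the automatic send at RUN */
run;
```

Each directive sits in its own PUT statement, as the rules above require.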
Details
You can send electronic mail programmatically from SAS using the EMAIL (SMTP) access method. To send e-mail to an SMTP server, you first specify the SMTP e-mail interface with the EMAILSYS system option, use the FILENAME statement to specify the EMAIL device type, and then submit SAS statements in a DATA step or in SCL code. The e-mail access method has several advantages:
- You can use the logic of the DATA step or SCL to subset e-mail distribution based on a large data set of e-mail addresses.
- You can automatically send e-mail upon completion of a SAS program that you submitted for batch processing.
- You can direct output through e-mail based on the results of processing.
In general, DATA step or SCL code that sends e-mail has the following components:
- a FILENAME statement with the EMAIL device-type keyword
- e-mail options specified in the FILENAME or FILE statement that indicate e-mail recipients, subject, attached file or files, and so on
- PUT statements that define the body of the message
- PUT statements that specify e-mail directives (of the form !EM_directive!) that override the e-mail options (for example, TO=, CC=, SUBJECT=, ATTACH=) or perform actions such as send, end abnormally, or start a new message.
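The sequence described above (select the SMTP interface, define the fileref, write the message) might be sketched as follows. This is an illustration under assumptions rather than a required setup: the server name, port, and address are hypothetical, and the EMAILHOST= and EMAILPORT= system options are not described in this section, so check the values that apply at your site.

```sas
/* Sketch only: server, port, and address are hypothetical. */
options emailsys=smtp               /* select the SMTP e-mail interface */
        emailhost=smtp.example.com  /* assumed SMTP server name         */
        emailport=25;               /* assumed SMTP port                */

filename notify email "analyst@example.com" subject="Batch job finished";

data _null_;
   file notify;
   put 'The batch program completed.';
run;
```

Placing the DATA step at the end of a batch program gives the completion notice mentioned in the list of advantages.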
Examples
Example 1: Sending E-mail with an Attachment Using a DATA Step
In order to share a copy of your SAS configuration file with another user, you could send it by submitting the following program. The e-mail options are specified in the FILENAME statement:
filename mymail email "[email protected]"
   subject="My SAS Configuration File"
   attach="/u/sas/sasv8.cfg";
data _null_;
   file mymail;
   put 'Jim,';
   put 'This is my SAS configuration file.';
   put 'I think you might like the';
   put 'new options I added.';
run;
The following program sends a message and two file attachments to multiple recipients. For this example, the e-mail options are specified in the FILE statement instead of the FILENAME statement.
filename outbox email "[email protected]";
data _null_;
   file outbox
      /* TO= overrides the value in the FILENAME statement */
      to=("[email protected]" "[email protected]")
      cc=("[email protected]" "[email protected]")
      subject="My SAS Output"
      attach=("C:\sas\results.out" "C:\sas\code.sas")
      ;
   put 'Folks,';
   put 'Attached is my output from the SAS';
   put 'program I ran last night.';
   put 'It worked great!';
run;
Example 2: Using Conditional Logic in a DATA Step
You can use conditional logic in a DATA step in order to send multiple messages and control which recipients get which message. For example, in order to send customized reports to members of two different departments, the following program produces an e-mail message and attachments that are dependent on the department to which the recipient belongs. In the program, the following occurs:
1. In the first PUT statement, the !EM_TO! directive assigns the TO attribute.
2. The second PUT statement assigns the SUBJECT attribute using the !EM_SUBJECT! directive.
3. The !EM_SEND! directive sends the message.
4. The !EM_NEWMSG! directive clears the message attributes. You must use it to clear message attributes between recipients.
5. The !EM_ABORT! directive abnormally ends the message before the RUN statement causes it to be sent again. The !EM_ABORT! directive prevents the message from being automatically sent at the end of the DATA step.
filename reports email "[email protected]";
data _null_;
   file reports;
   length name dept $ 21;
   input name dept;
   put '!EM_TO! ' name;
   put '!EM_SUBJECT! Report for ' dept;
   put name ',';
   put 'Here is the latest report for ' dept '.' ;
   if dept='marketing' then
      put '!EM_ATTACH! c:\mktrept.txt';
   else   /* ATTACH the appropriate report */
      put '!EM_ATTACH! c:\devrept.txt';
   put '!EM_SEND!';
   put '!EM_NEWMSG!';
   put '!EM_ABORT!';
   datalines;
Susan marketing
Peter marketing
Alma development
Andre development
;
run;
Example 3: Sending Procedure Output in E-mail
You can use e-mail to send procedure output. This example illustrates how to send ODS HTML in the body of an e-mail message. Note that ODS HTML procedure output must be sent with the RECORD_SEPARATOR (RS) option set to NONE.
filename outbox email
   to='[email protected]'
   type='text/html'
   subject='Temperature Conversions';
data temperatures;
   do centigrade = -40 to 100 by 10;
      fahrenheit = centigrade*9/5+32;
      output;
   end;
run;
ods html
   body=outbox /* Mail it! */
   rs=none;
title 'Centigrade to Fahrenheit Conversion Table';
proc print;
   id centigrade;
   var fahrenheit;
run;
ods html close;
Example 4: Creating and E-mailing an Image
The following example illustrates how to create a GIF image and send it from SAS in an e-mail message:
filename gsasfile email
   to='[email protected]'
   type='image/gif'
   subject="SAS/GRAPH Output";
goptions dev=gif gsfname=gsasfile;
proc gtestit pic=1;
run;
See Also
Statements:
“FILENAME Statement” on page 1470
“FILENAME Statement, CATALOG Access Method” on page 1476
“FILENAME Statement, FTP Access Method” on page 1492
“FILENAME Statement, SOCKET Access Method” on page 1509
“FILENAME Statement, SFTP Access Method” on page 1503
“FILENAME Statement, URL Access Method” on page 1513
The SMTP E-Mail Interface in SAS Language Reference: Concepts
FILENAME Statement, FTP Access Method
Enables you to access remote files by using the FTP protocol.
Valid: Anywhere
Category: Data Access
Syntax
FILENAME fileref FTP 'external-file' <ftp-options>;
Arguments
fileref
is a valid fileref.
Tip: The association between a fileref and an external file lasts only for the duration of the SAS session or until you change it or discontinue it with another FILENAME statement. You can change the fileref for a file as often as you want.
FTP
specifies the access method that enables you to use File Transfer Protocol (FTP) to read from or write to a file from any host computer that you can connect to on a network with an FTP server running.
Tip: Use FILENAME with FTP when you want to connect to the host computer, to log in to the FTP server, to make records in the specified file available for reading or writing, and to disconnect from the host computer.
'external-file'
specifies the physical name of an external file that you want to read from or write to. The physical name is the name that is recognized by the operating environment. If the file has an IBM 370 format and a record format of FB or FBA, and if the ENCODING= option is specified, then you must also specify the LRECL= option. If the length of a record is shorter than the value of LRECL, then SAS pads the record with blanks until the record length is equal to the value of LRECL.
Operating Environment Information: For details about specifying the physical names of external files, see the SAS documentation for your operating environment.
Tip: If you are not transferring a file but performing a task such as retrieving a directory listing, then you do not need to specify a filename. Instead, put empty quotation marks in the statement. See Example 1 on page 1499.
Tip: You can associate a fileref with a single file or with an aggregate file storage location.
Tip: If you use the DIR option, specify the directory in this argument.
ftp-options
specifies details that are specific to your operating environment such as file attributes and processing attributes.
Operating Environment Information: For more information about some of these FTP options, see the SAS documentation for your operating environment.
FTP Options
AUTHDOMAIN="auth-domain"
specifies the name of an authentication domain metadata object in order to connect to the FTP server. The authentication domain references credentials (user ID and password) without your having to explicitly specify the credentials. The auth-domain name is case sensitive, and it must be enclosed in double quotation marks. An administrator creates authentication domain definitions while creating a user definition with the User Manager in SAS Management Console. The authentication domain is associated with one or more login metadata objects that provide access to the FTP server and is resolved by the BASE engine calling the SAS Metadata Server and returning the authentication credentials.
Requirement: The authentication domain and the associated login definition must be stored in a metadata repository, and the metadata server must be running in order to resolve the metadata object specification.
Interaction: If you specify AUTHDOMAIN=, you do not need to specify USER= and PASS=.
See also: For more information about creating and using authentication domains, see the discussion on credential management in the SAS Intelligence Platform: Security Administration Guide.
BINARY
is fixed-record format. Thus, all records are of size LRECL with no line delimiters. Data is transferred in image (binary) mode. The BINARY option overrides the value of RECFM= in the FILENAME FTP statement, if specified, and forces a binary transfer.
Alias: RECFM=F
Interaction: If you specify the BINARY option and the S370V or S370VS option, then SAS ignores the BINARY option.
BLOCKSIZE=blocksize
where blocksize is the size of the data buffer in bytes.
Default: 32768
CD='directory'
issues a command that changes the working directory for the file transfer to the directory that you specify.
Interaction: The CD and DIR options are mutually exclusive. If both are specified, FTP ignores the CD option and SAS writes an informational note to the log.
DEBUG
writes to the SAS log informational messages that are sent to and received from the FTP server.
DIR
enables you to access directory files or PDS/PDSE members. Specify the directory name in the external-file argument. You must use valid directory syntax for the specified host.
Tip: If you want FTP to append a file extension of DATA to the member name that is specified in the FILE or INFILE statement, then use the FILEEXT option in conjunction with the DIR option. The FILEEXT option is ignored if you specify a file extension in the FILE or INFILE statement.
Tip: If you want FTP to create the directory, then use the NEW option in conjunction with the DIR option. The NEW option will be ignored if the directory exists.
Tip: If the NEW option is omitted and you specify an invalid directory, then a new directory will not be created and you will receive an error message.
Tip: The maximum number of directory or z/OS PDSE members that can be open simultaneously is limited by the number of sockets that can be open simultaneously on an FTP server. The number of sockets that can be open simultaneously is proportional to the number of connections that are set up during the installation of the FTP server. You might want to limit the number of sockets that are open simultaneously to avoid performance degradation.
Interaction: The CD and DIR options are mutually exclusive. If both are specified, FTP ignores the CD option and SAS writes an informational note to the log.
Featured in: Example 10 on page 1502
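As an illustration of the DIR option, the following sketch assigns a fileref to a remote directory and then names a member in the INFILE statement. The host, user ID, directory path, member name, and variable names are all hypothetical, and the sketch assumes an interactive session in which PROMPT can ask for the password.

```sas
/* Sketch only: host, user, directory, and member are hypothetical. */
filename srcdir ftp '/home/myuser/data' dir
   host='ftp.example.com'
   user='myuser'
   prompt;                      /* prompt for the password */

data sales;
   infile srcdir(sales) truncover;   /* member within the directory */
   input region $ amount;
run;
```

Adding the FILEEXT option to the FILENAME statement would make SAS look for a member named sales.DATA instead, as the first tip above describes.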
ENCODING=encoding-value
specifies the encoding to use when reading from or writing to the external file. The value for ENCODING= indicates that the external file has a different encoding from the current session encoding. When you read data from an external file, SAS transcodes the data from the specified encoding to the session encoding. When you write data to an external file, SAS transcodes the data from the session encoding to the specified encoding.
Default: SAS assumes that an external file is in the same encoding as the session encoding.
Tip: The data is transferred in image or binary format and is in local data format. Thus, you must use appropriate SAS informats to read the data correctly.
See Also: “Encoding Values in SAS Language Elements” in the SAS National Language Support (NLS): Reference Guide
FILEEXT
specifies that the member type of DATA is automatically appended to the member name on the FILE or INFILE statement when you use the DIR option.
Tip: The FILEEXT option is ignored if you specify a file extension in the FILE or INFILE statement.
See Also: LOWCASE_MEMNAME option on page 1495
Featured in: Example 10 on page 1502
HOST='host'
where host is the network name of the remote host with the FTP server running. You can specify either the name of the host (for example, server.pc.mydomain.com) or the IP address of the computer (for example, 2001:db8::).
HOSTRESPONSELEN='size'
where size is the length of the FTP server response message.
Default: 2048 bytes
Range: 2048 to 16384 bytes
Restriction: If you specify a size that is less than 2048 or is greater than 16384, the size will be set to 2048.
LIST
issues the LIST command to the FTP server. LIST returns the contents of the working directory as records that contain all of the file attributes that are listed for each file.
Tip: The file attributes that are returned will vary, depending on the FTP server that is being accessed.
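A directory listing can be read like any other external file. The following sketch, with a hypothetical host, user ID, and directory, uses empty quotation marks for the external file because no file is transferred, and copies each listing record to the SAS log.

```sas
/* Sketch only: host, user, and directory are hypothetical. */
filename dirlist ftp '' list
   cd='/pub/reports'
   host='ftp.example.com'
   user='myuser'
   prompt;

data _null_;
   infile dirlist truncover;
   input;               /* read one listing record             */
   put _infile_;        /* write the record to the SAS log     */
run;
```

Because the layout of each record depends on the FTP server, parsing the attributes into variables would need server-specific INPUT logic.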
LOWCASE_MEMNAME
enables autocall macro retrieval of lowercase directory or member names from FTP servers.
Restriction: SAS autocall macro retrieval always searches for uppercase directory member names. Mixed case directory or member names are not supported.
Interaction: If you access files off FTP servers by using the %INCLUDE, FILE, INFILE, or other DATA step I/O statements, case sensitivity will be preserved.
See Also: FILEEXT option on page 1495
LRECL=lrecl
where lrecl is the logical record length of the data.
Default: 256
Interaction: Alternatively, you can specify a global logical record length by using the “LRECL= System Option” on page 1884.
LS
issues the LS command to the FTP server. LS returns the contents of the working directory as records with no file attributes.
Tip: The file attributes that are returned will vary, depending on the FTP server that is being accessed.
Tip: To return a listing of a subset of files, use the LSFILE= option in addition to LS.
LSFILE='character-string'
in combination with the LS option, specifies a character string that enables you to request a listing of a subset of files from the working directory. Enclose the character string in quotation marks.
Restriction: LSFILE= can be used only if LS is specified.
Tip: You can specify a wildcard as part of 'character-string'.
Tip: The file attributes that are returned will vary, depending on the FTP server that is being accessed.
Example: This statement lists all of the files that start with sales and end with sas:
filename myfile ftp '' ls lsfile='sales*.sas' other-ftp-options;
MGET
transfers multiple files, similar to the FTP command MGET.
Tip: The whole transfer is treated as one file. However, as the transfer of each new file is started, the EOV= variable is set to 1.
Tip: Specify MPROMPT to prompt the user before each file is sent.
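The EOV= behavior described in the first tip can be used to detect file boundaries during an MGET transfer. The following is a sketch under assumptions: the host, user ID, directory, and wildcard pattern are hypothetical, and newfile is an illustrative variable name. SAS sets the EOV= variable to 1 when a new file starts but does not reset it, so the program sets it back to 0 itself.

```sas
/* Sketch only: host, user, directory, and pattern are hypothetical. */
filename logs ftp 'app*.log' mget
   cd='/var/logs'
   host='ftp.example.com'
   user='myuser'
   prompt;

data _null_;
   infile logs truncover eov=newfile;
   input;
   if _n_ = 1 or newfile = 1 then do;
      put '--- start of a new file ---';
      newfile = 0;          /* reset the flag for the next boundary */
   end;
   put _infile_;
run;
```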
MPROMPT
specifies whether to prompt for confirmation that a file is to be read, if necessary, when the user executes the MGET option.
Restriction: The MPROMPT option is not available on z/OS for batch processing.
NEW
specifies that you want FTP to create the directory when you use the DIR option.
Tip: The NEW option will be ignored if the directory exists.
Restriction: The NEW option is not available under z/OS.
PASS='password'
where password is the password to use with the user name specified in the USER= option.
Tip: You can specify the PROMPT option instead of the PASS option, which tells the system to prompt you for the password.
Tip: If the user name is anonymous, then the remote host might require that you specify your e-mail address as the password.
Tip: To use an encoded password, use the PWENCODE procedure in order to disguise the text string, and then enter the encoded password for the PASS= option. For more information, see the “PWENCODE Procedure” in the Base SAS Procedures Guide.
Featured in: Example 6 on page 1501
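The encoded-password tip might be applied in two steps, sketched below. The password, host, user ID, and file name are hypothetical. PROC PWENCODE writes the encoded string to the SAS log; the value shown for PASS= is a placeholder that you would replace with the string copied from your own log, not a real encoding.

```sas
/* Step 1: encode the password. The encoded string appears in the log. */
proc pwencode in='MyPassw0rd';      /* hypothetical password */
run;

/* Step 2: paste the encoded string from the log into PASS=.          */
/* The value below is a placeholder, not actual PWENCODE output.      */
filename remote ftp 'report.txt'
   host='ftp.example.com'           /* hypothetical host     */
   user='myuser'                    /* hypothetical user ID  */
   pass='paste-encoded-string-here';
```

Encoding disguises the password in program source, but anyone who can read the encoded string can still use it to log in, so the program file should remain protected.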
PORT=portno
where portno is the port that the FTP daemon monitors on the respective host. The portno can be any number between 0 and 65535 that uniquely identifies a service.
Tip: In the Internet community, there is a list of predefined port numbers for specific services. For example, the default port for FTP is 21. A partial list of port numbers is usually available in the /etc/services file on any UNIX computer.
PROMPT
specifies to prompt for the user login password, if necessary.
Restriction: The PROMPT option is not available for batch processing under z/OS.
Interaction: If PROMPT is specified without USER=, then the user is prompted for an ID, as well as a password.
Tip: You can use the SAVEUSER option on page 1498 to save the user ID and password after the user ID and password prompt is successfully executed.
RCMD='command'
where command is the FTP 'SITE' or 'service' command to send to the FTP server. FTP servers use SITE commands to provide services that are specific to a system and are essential to file transfer but not common enough to be included in the protocol. For example, rcmd='site rdw' preserves the record descriptor word (RDW) of a z/OS variable blocked data set as a part of the data. See S370V and S370VS below.
Interaction: Some FTP service commands might not run at a particular client site depending on the security permissions and the availability of the commands.
Tip: If you transfer a file with the FTP access method and then cannot read the file, you might need to change the FTP server’s UMASK setting. If the FTP server supports a SITE UMASK setting, you can change the permissions of the file as shown in the following example:
filename in ftp '/mydir/accounting/file2.dat'
   host="xxx.fyi.xxx.com" user="john"
   rcmd='site umask 022' prompt;
data _null_;
   file in;
   put a $80;
run;
Tip: You can specify multiple FTP service commands if you separate them by semicolons. Some examples are as follows:
rcmd='ascii;site umask 002'
rcmd='stat;site chmod 0400 ~mydir/abc.txt'
RECFM=recfm where recfm is one of three record formats:
F
is fixed-record format. Thus, all records are of size LRECL with no line delimiters. Data is transferred in image (binary) mode.
Alias: BINARY
The BINARY option overrides the value of RECFM= in the FILENAME FTP statement, if specified, and forces a binary transfer.
S
is stream-record format. Data is transferred in image (binary) mode.
Interaction: The amount of data that is read is controlled by the current LRECL value or by the value of the NBYTE= variable in the INFILE statement. The NBYTE= option specifies a variable that is equal to the amount of data to be read. This amount must be less than or equal to LRECL.
See Also: The NBYTE= option on page 1548 in the INFILE statement.
V
is variable-record format (the default). In this format, records have varying lengths, and they are transferred in text (stream) mode.
Interaction: Any record larger than LRECL is truncated.
Tip: If you are using files with the IBM 370 Variable format or the IBM 370 Spanned Variable format, then you might want to use the S370V or S370VS options instead of the RECFM= option. See S370V and S370VS below.
Default: V
Interaction: If you specify the RECFM= option and the S370V or S370VS option, then SAS ignores the RECFM= option.
RHELP issues the HELP command to the FTP server. The results of this command are returned as records.
RSTAT issues the RSTAT command to the FTP server. The results of this command are returned as records.
SAVEUSER saves the user ID and password after the user ID and password prompt is successfully executed.
Interaction: The user ID and password are saved only for the duration of the SAS session or until you change the association between the fileref and the external file, or discontinue it with another FILENAME statement.
S370V indicates that the file being read is in IBM 370 variable format.
Interaction: If you specify this option and the RECFM= option, then SAS ignores the RECFM= option.
Tip: The data is transferred in image or binary format and is in local data format. Thus, you must use appropriate SAS informats to read the data correctly on non-EBCDIC hosts.
Tip: Use the rcmd='site rdw' option when you transfer a z/OS data set with a variable-record format to another z/OS data set with a variable-record format to preserve the record descriptor word (RDW) of each record. By default, most FTP servers remove the RDW that exists in each record before it is transferred. Typically, the 'SITE RDW' command is not necessary when you transfer a data set with a z/OS variable-record format to ASCII, or when you transfer an ASCII file to a z/OS variable-record format.
S370VS indicates that the file that is being read is in IBM 370 variable-spanned format.
Interaction: If you specify this option and the RECFM= option, then SAS ignores the RECFM= option.
Tip: The data is transferred in image or binary format and is in local data format. Thus, you must use appropriate SAS informats to read the data correctly on non-EBCDIC hosts.
Tip: Use the rcmd='site rdw' option when you transfer a z/OS data set with a variable-record format to another z/OS data set with a variable-record format to preserve the record descriptor word (RDW) of each record. By default, most FTP servers remove the RDW that exists in each record before it is transferred. Typically, the 'SITE RDW' command is not necessary when you transfer a data set with a z/OS variable-record format to ASCII, or when you transfer an ASCII file to a z/OS variable-record format.
USER='username' where username is used to log in to the FTP server.
Restriction: The FTP access method does not support FTP proxy servers that require user ID authentication.
Interaction: If PROMPT is specified, but USER= is not, then the user is prompted for an ID.
Tip: You can specify a proxy server and credentials for an FTP server when using the FTP access method. The user ID and password that you need to log in to the FTP server are sent via the proxy server by using the user="userid@ftpservername" pass="password" host="proxy.server.xxx.com" syntax. Both anonymous and user ID validation are supported.
TERMSTR=’eol-char’ where eol-char is the line delimiter to use when RECFM=V. There are three valid values: CRLF
carriage return (CR) followed by line feed (LF).
LF
line feed only (the default).
NULL
NULL character (0x00).
Default: LF Restriction: Use this option only when RECFM=V.
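Several of the options described above can be combined in one FILENAME statement. The following is a minimal sketch rather than one of the original examples: the host name ftp.example.com, the port number 2121, the directory /pub/reports, and the file name report.txt are hypothetical placeholders, and the sketch assumes that the server accepts the guest user ID after a password prompt.

```sas
/* Hypothetical server: ftp.example.com, FTP daemon on port 2121 */
filename rpt ftp 'report.txt' cd='/pub/reports'
   host='ftp.example.com' port=2121
   user='guest' prompt
   recfm=v lrecl=256;

data _null_;
   infile rpt truncover;    /* TRUNCOVER keeps short records from flowing over */
   input line $char256.;    /* records longer than LRECL would be truncated */
   put line;                /* echo each record to the SAS log */
run;
```

Because PORT= is given explicitly, the default FTP port of 21 is not used. If the daemon listens on the default port, the PORT= option can be omitted.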
Comparisons
As with the FTP get and put commands, the FTP access method lets you download and upload files; however, this method directly reads files into your SAS session without first storing them on your system.
Examples
Example 1: Retrieving a Directory Listing
This example retrieves a directory listing from a host named mvshost1 for user smythe, and prompts smythe for a password:
   filename dir ftp '' ls user='smythe'
      host='mvshost1.mvs.sas.com' prompt;
   data _null_;
      infile dir;
      input;
      put _INFILE_;
   run;
Note: The quotation marks are empty because no file is being transferred. Because quotation marks are required by the syntax, however, you must include them.
Example 2: Reading a File from a Remote Host
This example reads a file called sales in the directory /u/kudzu/mydata from the remote UNIX host hp720:
   filename myfile ftp 'sales' cd='/u/kudzu/mydata'
      user='guest' host='hp720.hp.sas.com'
      recfm=v prompt;
   data mydata / view=mydata;   /* Create a view */
      infile myfile;
      input x $10. y 4.;
   run;
   proc print data=mydata;      /* Print the data */
   run;
Example 3: Creating a File on a Remote Host
This example creates a file called test.dat in a directory called c:\remote for the user bbailey on the host winnt.pc:
   filename create ftp 'c:\remote\test.dat'
      host='winnt.pc'
      user='bbailey' prompt recfm=v;
   data _null_;
      file create;
      do i=1 to 10;
         put i=;
      end;
   run;
Example 4: Reading an S370V-Format File on z/OS
This example reads an S370V-format file from a z/OS system. See RCMD= on page 1497 for more information about RCMD='site rdw'.
   filename viewdata ftp 'sluggo.stat.data'
      user='sluggo' host='zoshost1'
      s370v prompt rcmd='site rdw';
   data mydata / view=mydata;   /* Create a view */
      infile viewdata;
      input x $ebcdic8.;
   run;
   proc print data=mydata;      /* Print the data */
   run;
Example 5: Anonymously Logging In to FTP
This example shows how to log in to FTP anonymously, if the host accepts anonymous logins.
Note: Some anonymous FTP servers require a password. If required, your e-mail address is usually used. See PASS= on page 1496 under “FTP Options.”
   filename anon ftp '' ls host='130.96.6.1'
      user='anonymous';
   data _null_;
      infile anon;
      input;
      list;
   run;
Note: The quotation marks following the argument FTP are empty. A filename is needed only when transferring a file, not when routing a command. The quotation marks, however, are required.
Example 6: Using an Encoded Password
This example shows you how to use an encoded password in the FILENAME statement. In a separate SAS session, use the PWENCODE procedure to encode your password and make note of the output.
   proc pwencode in="MyPass1";
   run;
The following output appears in the SAS log:
   {sas001}TXlQYXNzMQ==
You can now use the entire encoded password string in your batch program.
   filename myfile ftp 'sales' cd='/u/kudzu/mydata'
      user='tjbarry' host='hp720.hp.mycompany.com'
      pass="{sas001}TXlQYXNzMQ==";
Example 7: Importing a Transport Data Set
This example uses the CIMPORT procedure to import a transport data set from a host named mvshost1 for user calvin. The new data set will reside locally in the SASUSER library. Note that user and password can be SAS macro variables. If you specify a fully qualified data set name, then enclose it in single quotation marks inside double quotation marks. Otherwise, the system will append the profile prefix to the name that you specify.
   %let user=calvin;
   %let pw=xxxxx;
   filename inp ftp "'calvin.mat1.cpo'" user="&user"
      pass="&pw" rcmd='binary' host='mvshost1';
   proc cimport library=sasuser infile=inp;
   run;
Example 8: Transporting a SAS Library
This example uses the CPORT procedure to transport a SAS library to a host named mvshost1 for user calvin. It will create a new sequential file on the host called userid.mat64.cpo with the recfm of fb, lrecl of 80, and blocksize of 8000.
   filename inp ftp 'mat64.cpo' user='calvin' pass="xxxx"
      host='mvshost1' lrecl=80 recfm=f blocksize=8000
      rcmd='site blocksize=8000 recfm=fb lrecl=80';
   proc cport library=mylib file=inp;
   run;
Example 9: Creating a Transport Library with Transport Engine
This example creates a new SAS library on host mvshost1. The FILENAME statement assigns a fileref to the new data set. Note the use of the RCMD= option to specify important file attributes. The LIBNAME statement uses a libref that is the same as the fileref and assigns it to the XPORT engine. The PROC COPY step copies all data sets from the SAS library that are referenced by MYLIB to the XPORT engine. Output from the PROC CONTENTS step confirms that the copy was successful:
   filename inp ftp 'mat65.cpo' user='calvin' pass="xxxx"
      host='mvshost1' lrecl=80 recfm=f blocksize=8000
      rcmd='site blocksize=8000 recfm=fb lrecl=80';
   libname mylib 'SAS-library';
   libname inp xport;
   proc copy in=mylib out=inp mt=data;
   run;
   proc contents data=inp._all_;
   run;
Note: For more information about the XPORT engine, see “The Transport Engine” in SAS Language Reference: Concepts and “XPORT Engine Limitations” in Moving and Accessing SAS Files.
Example 10: Reading and Writing from Directories
This example reads the file ftpmem1 from a directory on a UNIX host, and writes the file ftpout1 to a different directory on another UNIX host.
   filename indir ftp '/usr/proj2/dir1' DIR
      host="host1.mycompany.com"
      user="xxxx" prompt;
   filename outdir ftp '/usr/proj2/dir2' DIR FILEEXT
      host="host2.mycompany.com"
      user="xxxx" prompt;
   data _null_;
      infile indir(ftpmem1) truncover;
      input;
      file outdir(ftpout1);
      put _infile_;
   run;
The file ftpout1 is written to /usr/proj2/dir2/ftpout1.DATA. Note that a member type of DATA is appended to the ftpout1 file because the FILEEXT option was specified in the output file’s FILENAME statement. For more information, see FILEEXT on page 1495.
Note: The DIR option is not needed for some ODS destinations.
The following example writes an output file and transfers it to an ODS-specified destination. The DIR option is not needed.
   filename output ftp "~user/ftpdir/"
      host="host.fyi.company.com"
      user="userid" pass="userpass" recfm=s debug;
   ods listing close;
   ods html body='body.html' path=output;
   proc print data=sashelp.class;
   run;
   ods html close;
   ods listing;
To export multiple graph files to a remote directory location, the DIR option must be specified on the FILENAME statement. Accordingly, when creating external graph files with the ODS HTML destination, two FILENAME statements are needed: one for the HTML files, and one for the graph files. The following example illustrates the need for two FILENAME statements.
   filename output1 ftp "~user/dir" fileext
      host="host.unx.company.com"
      user="userid" pass="userpass" recfm=s debug;
   filename output2 ftp "~user/dir" dir fileext
      host="host.unx.company.com"
      user="userid" pass="userpass" recfm=s debug;
   ods listing close;
   ods html body='body.html' path=output1 gpath=output2
      frame='frames.html' contents='contents.html';
   proc gtestit;
   run;
   quit;
   ods html close;
   ods listing;
Example 11: Using a Proxy Server
This example uses a proxy server with the FTP access method. The user ID and password are sent via the proxy server.
   filename test ftp ' ' ls
      host='proxy.server.xxx.com'
      user='userid@ftpservername'
      pass='xxxxxx'
      cd='pubsdir/';
   data _null_;
      infile test truncover;
      input a $256.;
      put a=;
   run;
See Also
Statements:
“FILENAME Statement” on page 1470
“FILENAME Statement, CATALOG Access Method” on page 1476
“FILENAME Statement, EMAIL (SMTP) Access Method” on page 1482
“FILENAME Statement, SOCKET Access Method” on page 1509
“FILENAME Statement, SFTP Access Method” on page 1503
“FILENAME Statement, URL Access Method” on page 1513
“LIBNAME Statement” on page 1606
FILENAME Statement, SFTP Access Method
Enables you to access remote files by using the SFTP protocol.
Valid: anywhere
Category: Data Access
Syntax
FILENAME fileref SFTP 'external-file' <sftp-options>;
Arguments
fileref
is a valid fileref.
Tip: The association between a fileref and an external file lasts only for the duration of the SAS session or until you change it or discontinue it with another FILENAME statement. You can change the fileref for a file as often as you want.
SFTP
specifies the access method that enables you to use Secure File Transfer Protocol (SFTP) to read from or write to a file from any host computer that you can connect to on a network with an OpenSSH SSHD server running.
'external-file'
specifies the physical name of an external file that you want to read from or write to. The physical name is the name that is recognized by the operating environment.
Operating Environment Information: For details about specifying the physical names of external files, see the SAS documentation for your operating environment.
Tip: If you are not transferring a file but performing a task such as retrieving a directory listing, then you do not need to specify an external filename. Instead, put empty quotation marks in the statement.
Tip: You can associate a fileref with a single file or with an aggregate file storage location.
sftp-options
specifies details that are specific to your operating environment such as file attributes and processing attributes.
Operating Environment Information: For more information on some of these SFTP options, see the SAS documentation for your operating environment.
SFTP Options
BATCHFILE='path' specifies the fully qualified pathname and the filename of the batch file that contains the SFTP commands. These commands are submitted when the SFTP access method is executed. After the batch file processing ends, the SFTP connection is closed.
Requirement: The path must be enclosed in quotation marks.
Tip: After the batch file processing ends, the SFTP connection is closed and the filename assignment is no longer available. If subsequent DATA step processing requires the FILENAME SFTP statement, then another FILENAME SFTP statement is required.
Featured in: Example 5 on page 1508
CD='directory' issues a command that changes the working directory for the file transfer to the directory that you specify.
DEBUG writes informational messages to the SAS log.
DIR enables you to access directory files. Specify the directory name in the external-file argument. You must use valid directory syntax for the specified host.
Interaction: The CD and DIR options are mutually exclusive. If both are specified, SFTP ignores the CD option and SAS writes an informational note to the log.
Tip: If you want SFTP to create the directory, then use the NEW option in conjunction with the DIR option. The NEW option will be ignored if the directory exists.
Tip: If the NEW option is omitted and you specify an invalid directory, then a new directory will not be created and you will receive an error message.
HOST='host' where host is the network name of the remote host with the OpenSSH SSHD server running. You can specify either the name of the host (for example, server.pc.mydomain.com) or the IP address of the computer (for example, 2001:db8::).
LRECL=lrecl where lrecl is the logical record length of the data.
Default: 256
Interaction: Alternatively, you can specify a global logical record length by using the LRECL= system option. See “LRECL= System Option” on page 1884.
LS issues the LS command to the SFTP server. LS returns the contents of the working directory as records with no file attributes.
Restriction: The LS option will not display files with leading periods, for example .xAuthority.
Interaction: The LS and LSA options are mutually exclusive. If you specify both options, the LSA option takes precedence.
Tip: To return a listing of a subset of files, use the LSFILE= option in addition to LS.
LSA issues the LS command to the SFTP server. LSA returns all the contents of the working directory as records with no file attributes.
Interaction: The LS and LSA options are mutually exclusive. If you specify both options, the LSA option takes precedence.
Interaction: To omit files with leading periods, for example .xAuthority, from the listing, use the LS option.
Tip: To return a listing of a subset of files, use the LSFILE= option in addition to LSA.
LSFILE='character-string' in combination with the LS or LSA option, specifies a character string that enables you to request a listing of a subset of files from the working directory. Enclose the character string in quotation marks.
Restriction: LSFILE= can be used only if LS or LSA is specified.
Tip: You can specify a wildcard as part of 'character-string'.
Example: This statement lists all of the files that start with sales and end with sas:
   filename myfile sftp '' ls lsfile='sales*.sas' other-sftp-options;
MGET transfers multiple files, similar to the SFTP command MGET.
Tip: The whole transfer is treated as one file. However, as the transfer of each new file is started, the EOV= variable is set to 1.
NEW specifies that you want SFTP to create the directory when you use the DIR option.
Restriction: The NEW option is not available under z/OS.
Tip: The NEW option will be ignored if the directory exists.
OPTIONS= specifies SFTP configuration options such as port numbers.
PATH specifies the location of the SFTP executable if it is not installed in the PATH or $PATH search path.
Tip: It is recommended that the OpenSSH “SFTP” executable or PUTTY “PSFTP” executable be installed in a directory that is accessible via the PATH or $PATH search path.
RECFM=recfm where recfm is one of two record formats:
F
is fixed-record format. Thus, all records are of size LRECL with no line delimiters.
V
is variable-record format (the default). In this format, records have varying lengths, and they are separated by newlines. Data is transferred in image (binary) mode.
Default: V
USER='username' specifies the user name.
Requirement: The username is required by the PUTTY client on the Windows host.
Tip: The username is not typically required on LINUX or UNIX hosts when using public key authentication.
Tip: Public key authentication using an SSH agent is the recommended way to connect to a remote SSHD server.
WAIT_MILLISECONDS=milliseconds specifies the SFTP response time in milliseconds.
Default: 1,500 milliseconds
Tip: If you receive a timeout message in the log, use the WAIT_MILLISECONDS option to increase the response time.
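The listing options described above return directory contents as records, so a DATA step can read them like any other external file. The following is a minimal sketch under stated assumptions: the host unixhost1 and the directory /users/xxxx are placeholders borrowed from the examples in this section, public key authentication is assumed to be configured already, and the data set name WORK.REMOTE_FILES and the variable name FNAME are hypothetical.

```sas
/* Placeholder host and directory; no external file is named because */
/* only a directory listing is requested.                            */
filename flist sftp '' ls lsfile='sales*.sas'
   cd='/users/xxxx' host="unixhost1"
   wait_milliseconds=3000;   /* allow more than the 1,500 ms default */

data work.remote_files;
   infile flist truncover;
   input fname $char256.;    /* one listing record per observation */
run;

proc print data=work.remote_files;
run;
```

Because LS rather than LSA is specified, files with leading periods are not expected in the listing.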
Details
The Basics
The Secure File Transfer Protocol (SFTP) provides a secure connection and file transfers between two hosts (client and server) over a network. Both commands and data are encrypted. The client machine initiates a connection with the remote host (OpenSSH SSHD server).
With the SFTP access method, you can read from or write to any host computer that you can connect to on a network with an OpenSSH SSHD server running. The client and server applications can reside on the same computer or on different computers that are connected by a network.
Specific implementation details are dependent on the OpenSSH SSHD server version and how that site is configured. The SFTP access method relies on default send and reply messages to OpenSSH commands. Custom installs of OpenSSH that modify these messages will disable the SFTP access method.
You must have the applicable client software installed to use the SFTP access method. The SFTP access method supports only the following SSH clients.
- OpenSSH (UNIX)
- PUTTY (Windows)
Note: Password validation is not supported for the SFTP access method.
Note: Public key authentication using an SSH agent is the recommended way to connect to a remote SSHD server.
Note: If you have trouble running the SFTP access method, try to manually validate SFTP client access to an OpenSSH SSHD server without involving the SAS system. This check confirms that your SSH/SSHD configuration and key authentication are set up correctly.
SFTP Access Methods and SFTP Prompts
The SFTP access method supports only the following prompts. Changing the prompt will disable the SFTP access method.
- For OpenSSH:
     sftp>
     sftp >
- For PUTTY:
     psftp>
Comparisons
As with the SFTP get and put commands, the SFTP access method lets you download and upload files; however, this method directly reads files into your SAS session without first storing them on your system.
Examples
Example 1: Connecting to an SSHD Server at a Standard Port
This example reads a file called test.dat using the SFTP access method after connecting to the SSHD server at a standard port:
   filename myfile sftp '/users/xxxx/test.dat' host="unixhost1";
   data _null_;
      infile myfile truncover;
      input a $25.;
   run;
Example 2: Connecting to an SSHD Server at a Nonstandard Port
This example reads a file called test.dat using the SFTP access method after connecting to the SSHD server at port 4117:
   filename myfile sftp '/users/xxxx/test.dat' host="unixhost1"
      options="-oPort=4117";
   data _null_;
      infile myfile truncover;
      input a $25.;
   run;
Example 3: Connecting a Windows PUTTY Client to an SSHD Server
This example writes a file called test.dat using the SFTP access method after connecting a Windows PUTTY client to the SSHD server with a user ID of userid:
   filename outfile sftp '/users/xxxx/test.dat'
      host="unixhost1" user="userid";
   data _null_;
      file outfile;
      do i=1 to 10;
         put i=;
      end;
   run;
Example 4: Reading Files from a Directory on the Remote Host
This example reads the files test.dat and test2.dat from a directory on the remote host.
   filename infile sftp '/users/xxxx/' host="unixhost1" dir;
   data _null_;
      infile infile(test.dat) truncover;
      input a $25.;
      infile infile(test2.dat) truncover;
      input b $25.;
   run;
Example 5: Using a Batch File
In this example, when the INFILE statement is processed, the batch file associated with the FILENAME SFTP statement, sftpcmds, is executed.
   filename process sftp ' ' host="unixhost1" user="userid"
      batchfile="c:/stfpdir/sftpcmds.bat";
   data _null_;
      infile process;
   run;
See Also
Statements:
“FILENAME Statement” on page 1470
“FILENAME Statement, CATALOG Access Method” on page 1476
“FILENAME Statement, EMAIL (SMTP) Access Method” on page 1482
“FILENAME Statement, FTP Access Method” on page 1492
“FILENAME Statement, SOCKET Access Method” on page 1509
“FILENAME Statement, URL Access Method” on page 1513
“LIBNAME Statement” on page 1606
Barrett, Daniel J., Richard E. Silverman, and Robert G. Byrnes. 2005. SSH, The Secure Shell: A Definitive Guide. Sebastopol, CA: O’Reilly.
FILENAME Statement, SOCKET Access Method
Enables you to read from or write to a TCP/IP socket.
Valid: anywhere
Category: Data Access
Syntax
(1) FILENAME fileref SOCKET 'hostname:portno' <tcpip-options>;
(2) FILENAME fileref SOCKET ':portno' SERVER <tcpip-options>;
Arguments
fileref
is a valid fileref.
Tip: The association between a fileref and an external file lasts only for the duration of the SAS session or until you change it or discontinue it with another FILENAME statement. You can change the fileref for a file as often as you want.
SOCKET
specifies the access method that enables you to read from or write to a Transmission Control Protocol/Internet Protocol (TCP/IP) socket.
'hostname:portno'
is the name or IP address of the host and the TCP/IP port number to connect to.
Tip: Use this specification for client access to the socket.
':portno'
is the port number to create for listening.
Tip: Use this specification for server mode.
Tip: If you specify :0, the system will choose a number.
SERVER
sets the TCP/IP socket to be a listening socket, thereby enabling the system to act as a server that is waiting for a connection.
Tip: The system accepts all connections serially; only one connection is active at any one time.
See Also: The RECONN= option description on page 1510 under TCPIP-Options.
TCPIP-Options
BLOCKSIZE=blocksize where blocksize is the size of the socket data buffer in bytes.
Default: 8192
ENCODING=encoding-value specifies the encoding to use when reading from or writing to the socket. The value for ENCODING= indicates that the socket has a different encoding from the current session encoding. When you read data from a socket, SAS transcodes the data from the specified encoding to the session encoding. When you write data to a socket, SAS transcodes the data from the session encoding to the specified encoding. For valid encoding values, see “Encoding Values for SAS Language Elements” in SAS National Language Support (NLS): Reference Guide.
LRECL=lrecl where lrecl is the logical record length.
Default: 256
Interaction: Alternatively, you can specify a global logical record length by using the LRECL= system option. See “LRECL= System Option” on page 1884.
RECFM=recfm where recfm is one of three record formats:
F
is fixed record format. Thus, all records are of size LRECL with no line delimiters. Data are transferred in image (binary) mode.
S
is stream record format.
Tip: Data are transferred in image (binary) mode.
Interactions: The amount of data that is read is controlled by the current LRECL value or the value of the NBYTE= variable in the INFILE statement. The NBYTE= option specifies a variable equal to the amount of data to be read. This amount must be less than or equal to LRECL.
See Also: The NBYTE= option on page 1548 in the INFILE statement.
V
is variable record format (the default).
Tip: In this format, records have varying lengths, and they are transferred in text (stream) mode.
Tip: Any record larger than LRECL is truncated.
Default: V
RECONN=conn-limit where conn-limit is the maximum number of connections that the server will accept.
Explanation: Because only one connection can be active at a time, a connection must be disconnected before the server can accept another connection. When a new connection is accepted, the EOV= variable is set to 1. The server will continue to accept connections, one at a time, until conn-limit has been reached.
TERMSTR='eol-char' where eol-char is the line delimiter to use when RECFM=V. There are three valid values:
CRLF
carriage return (CR) followed by line feed (LF).
LF
line feed only (the default).
NULL
NULL character (0x00).
Default: LF
Restriction: Use this option only when RECFM=V.
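The interaction between RECFM=S, LRECL=, and the NBYTE= option of the INFILE statement can be sketched as follows. This is an illustrative assumption rather than one of the original examples: the host name and port are reused from Example 1 as placeholders, the fileref PEER, the data set WORK.CHUNKS, and the variables NB and CHUNK are hypothetical, and the peer is assumed to send a stream of bytes with no record delimiters.

```sas
/* Stream mode: no record delimiters; NBYTE= controls how much is read */
filename peer socket 'hp720.unx.sas.com:5000' recfm=s lrecl=1024;

data work.chunks;
   length chunk $ 64;
   nb = 64;                /* bytes requested per read; must be <= LRECL */
   infile peer nbyte=nb;
   input;                  /* load the next block into the input buffer */
   chunk = _infile_;       /* copy the block into a data set variable */
run;
```

Changing the value of NB before an INPUT statement executes changes the amount of data requested by that read, which can be useful when a length field precedes each message.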
Details
The Basics
A TCP/IP socket is a communication link between two applications. The server application creates the socket and waits for a connection. The client application connects to the socket. With the SOCKET access method, you can use SAS to communicate with another application over a socket in either client or server mode. The client and server applications can reside on the same computer or on different computers that are connected by a network.
As an example, you can develop an application using Microsoft Visual Basic that communicates with a SAS session that uses the TCP/IP sockets. Note that Visual Basic does not provide inherent TCP/IP support. You can obtain a custom control (VBX) from SAS Technical Support (free of charge) that allows a Visual Basic application to communicate through the sockets.
(1) Using the SOCKET Access Method in Client Mode
In client mode, a local SAS application can use the SOCKET access method to communicate with a remote application that acts as a server (and waits for a connection). Before you can connect to a server, you must know:
- the network name or IP address of the host computer running the server.
- the port number that the remote application is listening to for new connections.
The remote application can be another SAS application, but it doesn’t need to be. When the local SAS application connects to the remote application through the TCP/IP socket, the two applications can communicate by reading from and writing to the socket as if it were an external file. If at any time the remote side of the socket is disconnected, the local side will also automatically terminate.
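Client mode as described above can be sketched with a DATA step that reads from the socket as if it were an external file. This is a minimal sketch under stated assumptions: the host appserver.example.com and the port 5000 are hypothetical, the remote application is assumed to send line-feed-terminated text lines and then disconnect, and the data set WORK.LINES and the variable LINE are illustrative names.

```sas
/* Hypothetical server that sends text lines and then disconnects */
filename srv socket 'appserver.example.com:5000';

data work.lines;
   infile srv truncover;    /* default RECFM=V: one text line per record */
   input line $char200.;    /* lines longer than 200 bytes are cut off here */
run;
```

The step ends when the remote side closes the connection, because the local side of the socket then terminates as well.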
(2) Using the SOCKET Access Method in Server Mode
When the local SAS application is in server mode, it remains in a wait state until a remote application connects to it. To use the SOCKET access method in server mode, you need to know only the port number that you want the server to listen to for a connection. Typically, servers use well-known ports to listen for connections. These port numbers are reserved by the system for specific server applications. For more information about how well-known ports are defined on your system, refer to the documentation for your TCP/IP software or ask your system administrator.
If the server application does not use a well-known port, then the system assigns a port number when it establishes the socket from the local application. However, because any client application that wants to connect to the server must know the port number, you should try to use a well-known port.
While a local SAS server application is waiting for a connection, SAS is in a wait state. Each time a new connection is established, the EOV= variable in the DATA step is set to 1. Because the server accepts only one connection at a time, no new connections can be established until the current connection is closed. The connection closes automatically when the remote client application disconnects. The SOCKET access method continues to accept new connections until it reaches the limit set in the RECONN option.
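The connection handling described above can be sketched with a server-mode DATA step that tags each record with a connection counter. The port 5000 is taken from the example that follows, while the limit of two connections, the data set WORK.RECEIVED, and the variables NEWCONN, CONN_NO, and LINE are hypothetical. The sketch assumes that the EOV= variable must be reset by the program after each boundary, as with concatenated files; whether the flag is also raised for the first connection is not stated here, so the counter should be read as an illustration only.

```sas
/* Listen on port 5000 and accept at most two connections, one at a time */
filename srv socket ':5000' server reconn=2;

data work.received;
   infile srv eov=newconn truncover;
   input line $char80.;
   if newconn = 1 then
      do;
         conn_no + 1;       /* sum statement: counter is retained across records */
         newconn = 0;       /* reset so that the next connection can be detected */
      end;
run;
```

SAS stays in a wait state while this step runs, so the client applications must connect and disconnect before the step can finish.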
Examples
Example 1: Communicating between Two SAS Applications Over a TCP/IP Socket
This example shows how two SAS applications can talk over a TCP/IP socket. The local application is in server mode; the remote application is the client that connects to the server. This example assumes that the server host name is hp720.unx.sas.com, that the well-known port number is 5000, and that the server allows a maximum of three connections before closing the socket.
Here is the program for the server application:
   filename local socket ':5000' server reconn=3;
      /* The server is using a reserved */
      /* port number of 5000.           */
   data tcpip;
      infile local eov=v;
      input x $10.;
      if v=1 then
         do;   /* new connection when v=1 */
            put 'new connection received';
         end;
      output;
   run;
Here is the program for the remote client application:
   filename remote socket 'hp720.unx.sas.com:5000';
   data _null_;
      file remote;
      do i=1 to 10;
         put i;
      end;
   run;
See Also
Statements:
“FILENAME Statement” on page 1470
“FILENAME Statement, CATALOG Access Method” on page 1476
“FILENAME Statement, EMAIL (SMTP) Access Method” on page 1482
“FILENAME Statement, FTP Access Method” on page 1492
“FILENAME Statement, URL Access Method” on page 1513
FILENAME Statement, URL Access Method
Enables you to access remote files by using the URL access method.
Valid: anywhere
Category: Data Access
Syntax
FILENAME fileref URL 'external-file' <url-options>;
Arguments
fileref
is a valid fileref.
Tip: The association between a fileref and an external file lasts only for the duration of the SAS session or until you change it or discontinue it with another FILENAME statement. You can change the fileref for a file as often as you want.
URL
specifies the access method that enables you to read a file from any host computer that you can connect to on a network with a URL server running.
Alias: HTTP
'external-file'
specifies the name of the file that you want to read from on a URL server. The Secure Socket Layer (SSL) protocol, https, can also be used to access the files. The file must be specified in one of these formats:
   http://hostname/file
   https://hostname/file
   http://hostname:portno/file
   https://hostname:portno/file
Operating Environment Information: For details about specifying the physical names of external files, see the SAS documentation for your operating environment.
url-options
can be any of the following:
AUTHDOMAIN="auth-domain" specifies the name of an authentication domain metadata object in order to connect to the proxy or Web server. The authentication domain references credentials (user ID and password) without your having to explicitly specify the credentials. The auth-domain name is case sensitive, and it must be enclosed in double quotation marks. An administrator creates authentication domain definitions while creating a user definition with the User Manager in SAS Management Console. The authentication domain is associated with one or more login metadata objects that provide access to the proxy or Web server and is resolved by the BASE engine calling the SAS Metadata Server and returning the authentication credentials.
1514
FILENAME Statement, URL Access Method
4
Chapter 6
Requirement: The authentication domain and the associated login definition
must be stored in a metadata repository, and the metadata server must be running in order to resolve the metadata object specification. Interaction: If you specify AUTHDOMAIN=, you do not need to specify USER=
and PASS=. See also: For more information about creating and using authentication domains,
see the discussion on credential management in the SAS Intelligence Platform: Security Administration Guide. BLOCKSIZE=blocksize where blocksize is the size of the URL data buffer in bytes. Default: 8K
   DEBUG
      writes debugging information to the SAS log.
      Tip: The result of the HELP command is returned as records.
   HEADERS=fileref
      specifies the fileref to which the header information is written when a file is opened by using the URL access method. The header information is the same information that is written to the SAS log.
      Requirement: The fileref must be defined in a previous FILENAME statement.
      Interaction: If you specify the HEADERS= option without specifying the DEBUG option, the DEBUG option is automatically turned on.
      Interaction: By default, log information is overwritten. To append the log information, you must specify the MOD option in the FILENAME statement that creates the fileref.
   LRECL=lrecl
      where lrecl is the logical record length of the data.
      Default: 256
      Interaction: Alternatively, you can specify a global logical record length by using the LRECL= system option. See “LRECL= System Option” on page 1884.
   PASS='password'
      where password is the password to use with the user name that is specified in the USER option.
      Tip: You can specify the PROMPT option instead of the PASS option, which tells the system to prompt you for the password.
      Tip: To use an encoded password, use the PWENCODE procedure to disguise the text string, and then enter the encoded password for the PASS= option. For more information, see the PWENCODE procedure in the Base SAS Procedures Guide.
   PPASS='password'
      where password is the password to use with the user name that is specified in the PUSER option. The PPASS option is used to access the proxy server.
      Tip: You can specify the PROMPT option instead of the PPASS option, which tells the system to prompt you for the password.
      Tip: To use an encoded password, use the PWENCODE procedure to disguise the text string, and then enter the encoded password for the PPASS= option. For more information, see the PWENCODE procedure in the Base SAS Procedures Guide.
   PROMPT
      specifies to prompt for the user login password if necessary.
      Tip: If you specify PROMPT, you do not need to specify PASS= or PPASS=.
   PROXY=url
      specifies the Uniform Resource Locator (URL) for the proxy server in one of these forms:
         http://hostname/
         http://hostname:portno/
   PUSER='username'
      where username is used to log on to the URL proxy server.
      Tip: If you specify puser='*', then the user is prompted for an ID.
      Interaction: If you specify the PUSER option, the USER option goes to the Web server regardless of whether you specify a proxy server.
      Interaction: If PROMPT is specified, but PUSER is not, the user is prompted for an ID as well as a password.
   RECFM=recfm
      where recfm is one of three record formats:
      F
         is fixed-record format. Thus, all records are of size LRECL with no line delimiters. Data is transferred in image (binary) mode.
      S
         is stream-record format. Data is transferred in image (binary) mode.
         Alias: N
         Tip: The amount of data that is read is controlled by the current LRECL value or the value of the NBYTE= variable in the INFILE statement. The NBYTE= option specifies a variable that is equal to the amount of data to be read. This amount must be less than or equal to LRECL.
         See Also: The NBYTE= option on page 1548 in the INFILE statement.
      V
         is variable-record format (the default). In this format, records have varying lengths, and they are transferred in text (stream) mode.
         Tip: Any record larger than LRECL is truncated.
      Default: V
   TERMSTR='eol-char'
      where eol-char is the line delimiter to use when RECFM=V. There are four valid values:
      CR
         carriage return (CR).
      CRLF
         carriage return (CR) followed by line feed (LF).
      LF
         line feed only (the default).
      NULL
         NULL character (0x00).
      Default: LF
      Restriction: Use this option only when RECFM=V.
   USER='username'
      where username is used to log on to the URL server.
      Tip: If you specify user='*', then the user is prompted for an ID.
      Interaction: If you specify the USER option but do not specify the PUSER option, where the USER option goes depends on whether you specify a proxy server. If you do not specify a proxy server, USER goes to the Web server. If you do specify a proxy server, USER goes to the proxy server. If you specify the PUSER option, the USER option goes to the Web server regardless of whether you specify a proxy server.
      Interaction: If PROMPT is specified, but USER or PUSER is not, the user is prompted for an ID as well as a password.
Details
The Secure Sockets Layer (SSL) protocol is used when the URL begins with “https” instead of “http”. The SSL protocol provides network security and privacy. Developed by Netscape Communications, SSL uses encryption algorithms that include RC2, RC4, DES, tripleDES, IDEA, and MD5. Not limited to providing only encryption services, SSL can also perform client and server authentication and use message authentication codes. SSL is supported by both Netscape Navigator and Internet Explorer. Many Web sites use the protocol to protect confidential user information such as credit card numbers. The SSL protocol is application independent, enabling protocols such as HTTP, FTP, and Telnet to be layered transparently above it. SSL is optimized for HTTP.
Operating Environment Information: Using the FILENAME statement requires information that is specific to your operating environment. The URL access method is fully documented here, but for more information about how to specify filenames, see the SAS documentation for your operating environment.
Examples
Example 1: Accessing a File at a Web Site
This example accesses document test.dat at site www.a.com:

filename foo url 'http://www.a.com/test.dat'
   proxy='http://www.gt.sas.com';
Example 2: Specifying a User ID and a Password
This example accesses document file1.html at site www.b.com using the SSL protocol and requires a user ID and password:

filename foo url 'https://www.b.com/file1.html'
   user='jones' prompt;
Example 3: Reading the First 15 Records from a URL File
This example reads the first 15 records from a URL file and writes them to the SAS log with a PUT statement:

filename foo url 'http://support.sas.com/techsup/service_intro.html';
data _null_;
   infile foo length=len;
   input record $varying200. len;
   put record $varying200. len;
   if _n_=15 then stop;
run;
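The HEADERS= option that is described under url-options can be combined with an ordinary DATA step so that the header information is kept apart from the data. The following is a minimal sketch, not a tested program: the host www.example.com, the file name, and the filerefs HDRS and SRC are hypothetical, and the sketch assumes that the TEMP device type is available for the header fileref.

```sas
/* Hypothetical fileref that receives the header information.    */
/* HEADERS= requires a fileref that is already defined.          */
filename hdrs temp;

/* Hypothetical URL; replace it with a file that you can reach.  */
filename src url 'http://www.example.com/data.txt'
   headers=hdrs lrecl=1024;

/* Read the remote file and echo each record to the SAS log.     */
data _null_;
   infile src;
   input;
   put _infile_;
run;

/* List the header information that was written to HDRS.         */
data _null_;
   infile hdrs;
   input;
   put _infile_;
run;
```

Because specifying HEADERS= turns on the DEBUG option, the same header information is also expected to appear in the SAS log.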
See Also
Statements:
   “FILENAME Statement” on page 1470
   “FILENAME Statement, CATALOG Access Method” on page 1476
   “FILENAME Statement, EMAIL (SMTP) Access Method” on page 1482
   “FILENAME Statement, FTP Access Method” on page 1492
   “FILENAME Statement, SOCKET Access Method” on page 1509
   “FILENAME Statement, SFTP Access Method” on page 1503
“Secure Sockets Layer (SSL)” in Encryption in SAS
FILENAME Statement, WebDAV Access Method
Enables you to access remote files by using the WebDAV protocol.
Valid: anywhere
Category: Data Access
Restriction: Access to WebDAV servers is not supported on Open VMS.
Syntax
FILENAME fileref WEBDAV 'external-file' <webdav-options>;
Arguments
fileref
   is a valid fileref.
   Tip: The association between a fileref and an external file lasts only for the duration of the SAS session or until you change it or discontinue it with another FILENAME statement. You can change the fileref for a file as often as you want.
WEBDAV
   specifies the access method that enables you to use WebDAV (Web Distributed Authoring and Versioning) to read from or write to a file from any host machine that you can connect to on a network with a WebDAV server running.
'external-file'
   specifies the name of the file that you want to read from or write to a WebDAV server. The external file must be in one of these forms:
      http://hostname/path-to-the-file
      https://hostname/path-to-the-file
      http://hostname:port/path-to-the-file
      https://hostname:port/path-to-the-file
   Requirement: When using the HTTPS communication protocol, you must use the SSL (Secure Sockets Layer) protocol that provides secure network communications. For more information, see Encryption in SAS.
   Operating Environment Information: For details about specifying the physical names of external files, see the SAS documentation for your operating environment.
WebDAV Options
webdav-options
   can be any of the following:
   DEBUG
      writes debugging information to the SAS log.
   DIR
      enables you to access directory files. Specify the directory name in the external-file argument. You must use valid directory syntax for the specified host.
      Tip: See the FILEEXT option on page 1518 for information about specifying filename extensions.
   ENCODING='encoding-value'
      specifies the encoding to use when SAS is reading from or writing to an external file. The value for ENCODING= indicates that the external file has a different encoding from the current session encoding. When you read data from an external file, SAS transcodes the data from the specified encoding to the session encoding. When you write data to an external file, SAS transcodes the data from the session encoding to the specified encoding.
      Default: SAS assumes that an external file is in the same encoding as the session encoding.
      See Also: “Encoding Values in SAS Language Elements” in the SAS National Language Support (NLS): Reference Guide
   FILEEXT
      specifies that a file extension is automatically appended to the filename when you use the DIR option.
      Interaction: The autocall macro facility always passes the extension .SAS to the file access method as the extension to use when opening files in the autocall library. The DATA step always passes the extension .DATA. If you define a fileref for an autocall macro library and the files in that library have a file extension of .SAS, use the FILEEXT option. If the files in that library do not have an extension, do not use the FILEEXT option. For example, if you define a fileref for an input file in the DATA step and the file X has an extension of .DATA, you would use the FILEEXT option to read the file X.DATA. If you use the INFILE or FILE statement, enclose the member name and extension in quotation marks to preserve case.
      Tip: The FILEEXT option will be ignored if you specify a file extension in the FILE or INFILE statement.
      See Also: LOWCASE_MEMNAME option on page 1519
   LOCALCACHE="directory name"
      specifies a directory where a temporary subdirectory is created to hold local copies of the server files. Each fileref has its own unique subdirectory. If a directory is not specified, then the subdirectories are created in the SAS Work directory. SAS deletes the temporary files when the SAS program completes.
      Default: SAS Work directory
   LOCKDURATION=n
      specifies the number of minutes that the files that are written through the WebDAV fileref are locked. SAS unlocks the files when the SAS program successfully finishes executing. If the SAS program fails, then the locks expire after the time allotted.
      Default: 30 minutes
   LOWCASE_MEMNAME
      enables autocall macro retrieval of lowercase directory or member names from WebDAV servers.
      Restriction: SAS autocall macro retrieval always searches for uppercase directory member names. Mixed-case directory or member names are not supported.
      See Also: FILEEXT option on page 1518
   LRECL=lrecl
      where lrecl is the logical record length of the data.
      Default: 256
      Interaction: Alternatively, you can specify a global logical record length by using the LRECL= system option. See “LRECL= System Option” on page 1884.
   MOD
      places the file in update mode and appends updates to the bottom of the file.
   PASS='password'
      where password is the password to use with the user name that is specified in the USER option. The password is case sensitive and it must be enclosed in single or double quotation marks.
      Alias: PASSWORD=, PW=, PWD=
      Tip: To use an encoded password, use the PWENCODE procedure to disguise the text string, and then enter the encoded password for the PASS= option. For more information, see “The PWENCODE Procedure” in the Base SAS Procedures Guide.
   PROXY=url
      specifies the Uniform Resource Locator (URL) for the proxy server in one of these forms:
         http://hostname/
         http://hostname:port/
   RECFM=recfm
      where recfm is one of two record formats:
      S
         is stream-record format. Data is transferred in image (binary) mode.
         Tip: The amount of data that is read is controlled by the current LRECL value or the value of the NBYTE= variable in the INFILE statement. The NBYTE= option specifies a variable that is equal to the amount of data to be read. This amount must be less than or equal to LRECL.
         See Also: The NBYTE= option on page 1548 in the INFILE statement.
      V
         is variable-record format (the default). In this format, records have varying lengths, and they are transferred in text (stream) mode.
         Tip: Any record larger than LRECL is truncated.
      Default: V
   USER='username'
      where username is used to log on to the WebDAV server. The user ID is case sensitive and it must be enclosed in single or double quotation marks.
      Alias: UID=
Details
When you access a WebDAV server to update a file, the file is pulled from the WebDAV server to your local disk storage for processing. When this processing is complete, the file is pushed back to the WebDAV server for storage. The file is removed from the local disk storage when it is pushed back.
The Secure Sockets Layer (SSL) protocol is used when the URL begins with “https” instead of “http”. The SSL protocol provides network security and privacy. Developed by Netscape Communications, SSL uses encryption algorithms that include RC2, RC4, DES, tripleDES, IDEA, and MD5. Not limited to providing only encryption services, SSL can also perform client and server authentication and use message authentication codes. SSL is supported by both Netscape Navigator and Internet Explorer. Many Web sites use the protocol to protect confidential user information such as credit card numbers. The SSL protocol is application independent, which enables protocols such as HTTP, FTP, and Telnet to be layered transparently above it. SSL is optimized for HTTP.
Note: WebDAV servers have defined levels of permissions at both the directory and file level. The WebDAV access method honors those permissions. For example, if a file is available as read-only, the user will not be able to modify it.
Operating Environment Information: Using the FILENAME statement requires information that is specific to your operating environment. The WebDAV access method is fully documented here, but for more information about how to specify filenames, see the SAS documentation for your operating environment.
Examples
Example 1: Accessing a File at a Web Site
This example accesses the file rawFile.txt at site www.mycompany.com.

filename foo webdav 'https://www.mycompany.com/production/files/rawFile.txt'
   user='wong' pass='jd75ld';
data _null_;
   infile foo;
   input a $80.;
run;
Example 2: Using a Proxy Server
This example accesses the file acctgfile.dat by using the proxy server otherwebsvr.com:80.

filename foo webdav 'https://webserver.com/webdav/acctgfile.dat'
   user='sanchez' pass='239sk349exz'
   proxy='http://otherwebsvr.com:80';
data _null_;
   infile foo;
   input a $80.;
run;
Example 3: Writing to a New Member of a Directory
This example writes the file SHOES to the directory TESTING.

filename writeit webdav "https://webserver.com:8443/webdav/testing/"
   dir user="webuser" pass=XXXXXXXXX;
data _null_;
   file writeit(shoes);
   set sashelp.shoes;
   put region $25. product $14.;
run;
Example 4: Reading from a Member of a Directory
This example reads the file SHOES from the directory TESTING1.

filename readit webdav "https://webserver.com:8443/webdav/testing1/"
   dir user="webuser" pass=XXXXXXXXX;
data shoes;
   length region $25 product $14;
   infile readit(shoes);
   input region $25. product $14.;
run;
Example 5: Using a WebDAV Location as an Autocall Macro Library
By default, the autocall macro facility expects uppercase filenames. This example accesses the file MYTEST in the autocall macro library WRITEIT.

filename writeit webdav "https://webserver.com/webdav/macrolib"
   dir fileext user="webuser" pass=XXXXXXXXX;
options SASAUTOS=(writeit);
   /* expects a file called MYTEST.SAS */
%MYTEST;
Example 6: Accessing a Lowercased Autocall Macro Member
The following example accesses the file testmem.sas in the autocall macro library LIST. The LOWCASE_MEMNAME option is used to access the file, which is in lowercase.

filename list webdav "https://t1234.na.fyi.com:8443/accounting/"
   dir fileext user="xxxxx" pass="xxxxx" LOWCASE_MEMNAME;
options sasautos=(list);
%testmem;
Example 7: Using a %INCLUDE Statement and Macro Invocation to Access a Lowercased Autocall Macro Member
The following example accesses the file testmem.sas in the autocall macro library MYTEST. Because the file is accessed by using the %INCLUDE statement, case sensitivity is preserved.

filename mytest webdav "https://t1234.na.fyi.com:8443/payroll/"
   dir user="xxxxxx" pass="xxxxx";
%include mytest(testmem.sas) /source2;
%testmem;

If the filename were in uppercase, the references to the filename in the %INCLUDE statement and in the macro call would also need to be in uppercase:

%include mytest(TESTMEM.SAS) /source2;
%TESTMEM;
Example 8: Accessing a File with a Mixed-Case Name
The following example accesses the file fileNOext from the production directory. Because the file is quoted in the INFILE statement, case sensitivity is preserved and the file extension is ignored.

filename test webdav "https://t1234.na.fyi.com:8443/production"
   dir user="xxxxxx" pass="xxxxx";
data _null_;
   infile test('fileNOext');
   input;
   list;
run;
Example 9: Using the FILEEXT Option to Automatically Attach a File Extension
The following example accesses the member testmem in the sales directory. The FILEEXT option automatically adds .DATA as the file extension. The member name that is read is testmem.DATA.

filename listing webdav "https://t1234.na.fyi.com:8443/sales"
   dir fileext user="xxxxxx" pass="xxxxx";
data _null_;
   infile listing(testmem);
   input;
   list;
run;
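The MOD and LOCKDURATION= options that are described under WebDAV Options can be used together when a program appends to a file on the server. The following is a minimal sketch under stated assumptions: the server name webserver.example.com, the path, the credentials, and the fileref APPLOG are all hypothetical, and the five-minute lock is an arbitrary choice.

```sas
/* Hypothetical server, path, and credentials.                    */
/* MOD appends to the file instead of replacing it.               */
/* LOCKDURATION=5 requests a 5-minute lock instead of the         */
/* default of 30 minutes.                                         */
filename applog webdav 'https://webserver.example.com/webdav/logs/run.log'
   user='webuser' pass='xxxxx'
   mod lockduration=5;

/* Append one line to the remote file. */
data _null_;
   file applog;
   put 'Nightly load finished';
run;
```

A shorter lock duration limits how long other users are blocked if the program fails, because SAS releases the lock itself only when the program finishes successfully.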
See Also
Statements:
   “FILENAME Statement” on page 1470
   “FILENAME Statement, CATALOG Access Method” on page 1476
   “FILENAME Statement, EMAIL (SMTP) Access Method” on page 1482
   “FILENAME Statement, FTP Access Method” on page 1492
   “FILENAME Statement, SOCKET Access Method” on page 1509
   “FILENAME Statement, URL Access Method” on page 1513
   “LIBNAME Statement for WebDAV Server Access” on page 1615
FOOTNOTE Statement
Writes up to 10 lines of text at the bottom of the procedure or DATA step output.
Valid: anywhere
Category: Output Control
Requirement: You must specify the FOOTNOTE option if you use a FILE statement.
See: FOOTNOTE Statement in the documentation for your operating environment.

Syntax
FOOTNOTE<n> <ods-format-options> <'text' | "text">;

Without Arguments
Using FOOTNOTE without arguments cancels all existing footnotes.

Arguments
n
   specifies the relative line to be occupied by the footnote.
   Tip: For footnotes, lines are pushed up from the bottom. The FOOTNOTE statement with the highest number appears on the bottom line.
   Range: n can range from 1 to 10.
   Default: If you omit n, SAS assumes a value of 1.
ods-format-options
   specifies formatting options for the ODS HTML, RTF, and PRINTER(PDF) destinations.
   BOLD
      specifies that the footnote text is bold font weight.
      ODS Destinations: HTML, RTF, PRINTER
   COLOR=color
      specifies the footnote text color.
      Alias: C
      ODS Destinations: HTML, RTF, PRINTER
      Featured in: Example 3 on page 1729
   BCOLOR=color
      specifies the background color of the footnote block.
      ODS Destinations: HTML, RTF, PRINTER
   FONT=font-face
      specifies the font to use. If you supply multiple fonts, then the destination device uses the first one that is installed on your system.
      Alias: F
      ODS Destinations: HTML, RTF, PRINTER
   HEIGHT=size
      specifies the point size.
      Alias: H
      ODS Destinations: HTML, RTF, PRINTER
      Featured in: Example 3 on page 1729
   ITALIC
      specifies that the footnote text is in italic style.
      ODS Destinations: HTML, RTF, PRINTER
   JUSTIFY= CENTER | LEFT | RIGHT
      specifies justification.
      CENTER
         specifies center justification.
         Alias: C
      LEFT
         specifies left justification.
         Alias: L
      RIGHT
         specifies right justification.
         Alias: R
      Alias: J
      ODS Destinations: HTML, RTF, PRINTER
      Featured in: Example 3 on page 1729
   LINK='url'
      specifies a hyperlink.
      Tip: The visual properties for LINK= always come from the current style.
      ODS Destinations: HTML, RTF, PRINTER
   UNDERLIN= 0 | 1 | 2 | 3
      specifies whether the subsequent text is underlined. 0 indicates no underlining. 1, 2, and 3 indicate underlining.
      Alias: U
      Tip: ODS generates the same type of underline for values 1, 2, and 3. However, SAS/GRAPH uses values 1, 2, and 3 to generate increasingly thicker underlines.
      ODS Destinations: HTML, RTF, PRINTER
   Note: The defaults for how ODS renders the FOOTNOTE statement come from style elements that relate to system footnotes in the current style. The FOOTNOTE statement syntax with ods-format-options is a way to override the settings that are provided by the current style. The current style varies according to the ODS destination. For more information on how to determine the current style, see “What Are Style Definitions, Style Elements, and Style Attributes?” and “Concepts: Style Definitions and the TEMPLATE Procedure” in the SAS Output Delivery System: User’s Guide.
   Tip: You can apply these options to a letter, a word, or several words by placing the option before each letter or word of the text. For example, this code makes the footnote “Red, White, and Blue” appear in different colors.

footnote color=red "Red," color=white "White, and" color=blue "Blue";
'text' | "text"
   specifies the text of the footnote in single or double quotation marks.
   Tip: For compatibility with previous releases, SAS accepts some text without quotation marks. When you write new programs or update existing programs, always enclose text in quotation marks.
   Tip: If you use an automatic macro variable in the footnote text, you must enclose the footnote text in double quotation marks. The SAS macro facility will resolve the macro variable only if the text is in double quotation marks.
   Tip: If you use two single quotation marks ('') or two double quotation marks ("") together (with no space in between them) as the string of text, SAS will output a single quotation mark (') or double quotation mark ("), respectively.
Details
A FOOTNOTE statement takes effect when the step or RUN group with which it is associated executes. After you specify a footnote for a line, SAS repeats the same footnote on all pages until you cancel or redefine the footnote for that line. When a FOOTNOTE statement is specified for a given line, it cancels the previous FOOTNOTE statement for that line and for all footnote lines with higher numbers.
Operating Environment Information: The maximum footnote length that is allowed depends on the operating environment and the value of the LINESIZE= system option. Refer to the SAS documentation for your operating environment for more information.

Comparisons
You can also create footnotes with the FOOTNOTES window. For more information, refer to the online Help for the window. You can modify footnotes with the Output Delivery System. See Example 3 on page 1729.
Examples
These examples of a FOOTNOTE statement result in the same footnote:
   • footnote8 "Managers' Meeting";
   • footnote8 'Managers'' Meeting';
These are examples of FOOTNOTE statements that use some of the formatting options for the ODS HTML, RTF, and PRINTER(PDF) destinations. For the complete example, see Example 3 on page 1729.

footnote j=left height=20pt
   color=red "Prepared "
   c='#FF9900' "on";
footnote2 j=center color=blue
   height=24pt "&SYSDATE9";
footnote3 link='http://support.sas.com' "SAS";
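The cancellation rule stated in the Details section (redefining a footnote line cancels that line and all higher-numbered lines) can be illustrated with a short sketch. The footnote text is invented for illustration, and the sketch assumes that the SASHELP.CLASS sample data set is available.

```sas
/* Define three footnote lines (illustrative text). */
footnote1 'Source: SASHELP.CLASS';
footnote2 'Preliminary figures';
footnote3 'Internal use only';

proc print data=sashelp.class(obs=3);
run;   /* all three footnotes are in effect here */

/* Redefining line 2 cancels the old FOOTNOTE2 and also FOOTNOTE3. */
footnote2 'Final figures';

proc print data=sashelp.class(obs=3);
run;   /* only FOOTNOTE1 and the new FOOTNOTE2 are in effect */

/* FOOTNOTE without arguments cancels all existing footnotes. */
footnote;
```

Ending the program with a FOOTNOTE statement that has no arguments keeps the footnotes from carrying over into later steps in the same session.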
See Also
Statement:
   “TITLE Statement” on page 1725
“The TEMPLATE Procedure” in the SAS Output Delivery System: User’s Guide
FORMAT Statement
Associates formats with variables.
Valid: in a DATA step or PROC step
Category: Information
Type: Declarative
Syntax
FORMAT variable-1 <. . . variable-n> <format> <DEFAULT=default-format>;
FORMAT variable-1 <. . . variable-n> format <DEFAULT=default-format>;
FORMAT variable-1 <. . . variable-n> format variable-1 <. . . variable-n> format;
Arguments
variable
   names one or more variables for SAS to associate with a format. You must specify at least one variable.
   Tip: To disassociate a format from a variable, use the variable in a FORMAT statement without specifying a format in a DATA step or in PROC DATASETS. In a DATA step, place this FORMAT statement after the SET statement. See Example 3 on page 1529. You can also use PROC DATASETS.
format
   specifies the format that is listed for writing the values of the variables.
   Tip: Formats that are associated with variables by using a FORMAT statement behave like formats that are used with a colon modifier in a subsequent PUT statement. For details on using a colon modifier, see “PUT Statement, List” on page 1678.
   See also: “Formats by Category” on page 99
DEFAULT=default-format
   specifies a temporary default format for displaying the values of variables that are not listed in the FORMAT statement. These default formats apply only to the current DATA step; they are not permanently associated with variables in the output data set.
   A DEFAULT= format specification applies to
      • variables that are not named in a FORMAT or ATTRIB statement
      • variables that are not permanently associated with a format within a SAS data set
      • variables that are not written with the explicit use of a format.
   Default: If you omit DEFAULT=, SAS uses BESTw. as the default numeric format and $w. as the default character format.
   Restriction: Use this option only in a DATA step.
   Tip: A DEFAULT= specification can occur anywhere in a FORMAT statement. It can specify either a numeric default, a character default, or both.
   Featured in: Example 1 on page 1527
Details
The FORMAT statement can use standard SAS formats or user-written formats that have been previously defined in PROC FORMAT. A single FORMAT statement can associate the same format with several variables, or it can associate different formats with different variables. If a variable appears in multiple FORMAT statements, SAS uses the format that is assigned last.
You use a FORMAT statement in the DATA step to permanently associate a format with a variable. SAS changes the descriptor information of the SAS data set that contains the variable. You can use a FORMAT statement in some PROC steps, but the rules are different. For more information, see Base SAS Procedures Guide.
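The rule that the format assigned last is the one that SAS uses can be checked with a small sketch. The data set name WORK.FMTDEMO and the variable AMOUNT are hypothetical names chosen for illustration.

```sas
/* Two FORMAT statements name the same variable. */
data work.fmtdemo;
   amount=1234.5;
   format amount dollar10.2;
   format amount comma10.1;   /* assigned last, so this format is stored */
run;

/* The Format column for AMOUNT shows which format was kept. */
proc contents data=work.fmtdemo;
run;
```

Because the association is stored in the descriptor information of the data set, later steps that read WORK.FMTDEMO use the stored format unless they override it.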
Comparisons
Both the ATTRIB and FORMAT statements can associate formats with variables, and both statements can change the format that is associated with a variable. You can use the FORMAT statement in PROC DATASETS to change or remove the format that is associated with a variable. You can also associate, change, or disassociate formats and variables in existing SAS data sets through the windowing environment.
Examples
Example 1: Assigning Formats and Defaults
This example uses a FORMAT statement to assign formats and default formats for numeric and character variables. The default formats are not associated with variables in the data set but affect how the PUT statement writes the variables in the current DATA step.

data tstfmt;
   format W $char3.
          Y 10.3
          default=8.2 $char8.;
   W='Good morning.';
   X=12.1;
   Y=13.2;
   Z='Howdy-doody';
   put W/X/Y/Z;
run;
proc contents data=tstfmt;
run;
proc print data=tstfmt;
run;

The following output shows a partial listing from PROC CONTENTS, as well as the report that PROC PRINT generates.
Output 6.5 Partial Listing from PROC CONTENTS and the PROC PRINT Report

                    The SAS System                       3

                  CONTENTS PROCEDURE

   -----Alphabetic List of Variables and Attributes-----

   #    Variable    Type    Len    Pos    Format
   ----------------------------------------------
   1    W           Char      3     16    $CHAR3.
   3    X           Num       8      8
   2    Y           Num       8      0    10.3
   4    Z           Char     11     19

Output 6.6 PROC PRINT Report

                    The SAS System                       4

   OBS     W           Y       X         Z
     1    Goo     13.200    12.1    Howdy-doody

The default formats apply to variables X and Z while the assigned formats apply to the variables W and Y. The PUT statement produces this result:

----+----1----+----2
Goo
12.10
13.200
Howdy-do
Example 2: Associating Multiple Variables with a Single Format
This example uses the FORMAT statement to assign a single format to multiple variables.

data report;
   input Item $ 1-6 Material $ 8-14 Investment 16-22 Profit 24-31;
   format Item Material $upcase9. Investment Profit dollar15.2;
   datalines;
shirts cotton  2256354 83952175
ties   silk    498678  2349615
suits  silk    9482146 69839563
belts  leather 7693    14893
shoes  leather 7936712 22964
;
run;
options pageno=1 nodate ls=80 ps=64;
proc print data=report;
   title 'Profit Summary: Kellam Manufacturing Company';
run;
Output 6.7 Results from Associating Multiple Variables with a Single Format

              Profit Summary: Kellam Manufacturing Company              1

   Obs    Item      Material        Investment            Profit
     1    SHIRTS    COTTON       $2,256,354.00    $83,952,175.00
     2    TIES      SILK           $498,678.00     $2,349,615.00
     3    SUITS     SILK         $9,482,146.00    $69,839,563.00
     4    BELTS     LEATHER          $7,693.00        $14,893.00
     5    SHOES     LEATHER      $7,936,712.00        $22,964.00
Example 3: Removing a Format
This example disassociates an existing format from a variable in a SAS data set. The order of the FORMAT and the SET statements is important.

data rtest;
   set rtest;
   format x;
run;
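The Comparisons section notes that the FORMAT statement in PROC DATASETS can also remove a format. The following sketch shows that route, which changes only the descriptor information instead of rewriting the data set. The data set WORK.RTEST2 and its contents are hypothetical and are created here only so that the sketch is self-contained.

```sas
/* Hypothetical data set with a format attached to X. */
data work.rtest2;
   x=1234.5;
   format x dollar10.2;
run;

/* Naming X without a format removes the format from X. */
proc datasets library=work nolist;
   modify rtest2;
   format x;
quit;

/* The Format column for X should now be blank. */
proc contents data=work.rtest2;
run;
```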
See Also
Statement:
   “ATTRIB Statement” on page 1400
“The DATASETS Procedure” in Base SAS Procedures Guide
GO TO Statement
Directs program execution immediately to the statement label that is specified and, if followed by a RETURN statement, returns execution to the beginning of the DATA step.
Valid: in a DATA step
Category: Control
Type: Executable
Alias: GOTO

Syntax
GO TO label;
Arguments
label
specifies a statement label that identifies the GO TO destination. The destination must be within the same DATA step. You must specify the label argument.
Comparisons

The GO TO statement and the LINK statement are similar. However, a GO TO statement is often used without a RETURN statement, whereas a LINK statement is usually used with an explicit RETURN statement. The action of a subsequent RETURN statement differs between the two. A RETURN statement after a LINK statement returns execution to the statement that follows the LINK statement. A RETURN statement after a GO TO statement returns execution to the beginning of the DATA step, unless a LINK statement precedes the GO TO statement. In that case, execution continues with the first statement after the LINK statement.

GO TO statements can often be replaced by DO-END and IF-THEN/ELSE programming logic.
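The replacement of GO TO by IF-THEN/ELSE logic can be sketched as follows. This is a minimal illustration, not an example from the manual: the data set names WORK.WITH_GOTO and WORK.WITHOUT_GOTO, the label TALLY, and the variables X and TOTAL are hypothetical. Both DATA steps are intended to produce the same observations.

```sas
/* GO TO version: values of 5 or less jump over the PUT statement */
data work.with_goto;
   input x;
   if x <= 5 then go to tally;
   put 'Large value: ' x=;
   tally: total + x;        /* sum statement at the GO TO destination */
   datalines;
3
12
4
;
run;

/* Same logic without GO TO, using IF-THEN */
data work.without_goto;
   input x;
   if x > 5 then put 'Large value: ' x=;
   total + x;
   datalines;
3
12
4
;
run;
```

The second form keeps the flow of control top to bottom, which is usually easier to read and maintain.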
Examples Use the GO TO statement as shown here.
- In this example, if the condition is true, the GO TO statement instructs SAS to jump to a label called ADD and to continue execution from there. If the condition is false, SAS executes the PUT statement and the statement that is associated with the GO TO label:

   data info;
      input x;
      if 1

) from the numeric data. The second INPUT statement parses the value in the buffer.
   data _null_;
      length city number $16. minutes charge 8;
      infile phonbill firstobs=2;
      input @;
      _infile_ = compress(_infile_, '');
      input city number minutes charge;
      put city= number= minutes= charge=;
   run;
The program writes the following lines to the SAS log:

city=Jackson number=415-555-2384 minutes=25 charge=2.45
city=Jefferson number=813-555-2356 minutes=15 charge=1.62
city=Joliet number=913-555-3223 minutes=65 charge=10.32
Example 10: Accessing the Input Buffers of Multiple Files
This example uses both the _INFILE_ automatic variable and the _INFILE_= option to read multiple files and access the input buffers for each of them. The following code creates four files: three data files and one file that contains the names of all the data files. The second DATA step reads the filenames file, opens each data file, and writes the contents to the log. Because the PUT statement needs _INFILE_ for the filenames file and the data file, one of the _INFILE_ variables is referenced with fname.

   data _null_;
      do i = 1 to 3;
         fname = 'external-data-file' || put(i,1.) || '.dat';
         file datfiles filevar=fname;
         do j = 1 to 5;
            put i j;
         end;
         file 'external-filenames-file';
         put fname;
      end;
   run;

   data _null_;
      infile 'external-filenames-file' _infile_=fname;
      input;
      infile datfiles filevar=fname end=eof;
      do while(^eof);
         input;
         put fname _infile_;
      end;
   run;
The program writes the following lines to the SAS log:

NOTE: The infile 'external-filenames-file' is:
      File Name=external-filenames-file,
      RECFM=V, LRECL=256
NOTE: The infile DATFILES is:
      File Name=external-data-file1.dat,
      RECFM=V, LRECL=256
external-data-file1.dat 1 1
external-data-file1.dat 1 2
external-data-file1.dat 1 3
external-data-file1.dat 1 4
external-data-file1.dat 1 5
NOTE: The infile DATFILES is:
      File Name=external-data-file2.dat,
      RECFM=V, LRECL=256
external-data-file2.dat 2 1
external-data-file2.dat 2 2
external-data-file2.dat 2 3
external-data-file2.dat 2 4
external-data-file2.dat 2 5
NOTE: The infile DATFILES is:
      File Name=external-data-file3.dat,
      RECFM=V, LRECL=256
external-data-file3.dat 3 1
external-data-file3.dat 3 2
external-data-file3.dat 3 3
external-data-file3.dat 3 4
external-data-file3.dat 3 5
Example 11: Specifying an Encoding When Reading an External File
This example creates a SAS data set from an external file. The external file is encoded in UTF-8, and the current SAS session encoding is Wlatin1. By default, SAS assumes that the external file uses the session encoding, which causes the character data to be written to the new SAS data set incorrectly. To tell SAS which encoding to use when reading the external file, specify the ENCODING= option. When you tell SAS that the external file is in UTF-8, SAS transcodes the data from UTF-8 to the current session encoding when writing to the new SAS data set. Therefore, the data is written to the new data set correctly in Wlatin1.

   libname myfiles 'SAS-library';
   filename extfile 'external-file';

   data myfiles.unicode;
      infile extfile encoding="utf-8";
      input Make $ Model $ Year;
   run;
See Also Statements: “FILENAME Statement” on page 1470 “INPUT Statement” on page 1567 “PUT Statement” on page 1656
INFORMAT Statement

Associates informats with variables.

Valid: in a DATA step or PROC step
Category: Information
Type: Declarative
Syntax

INFORMAT variable-1 <…variable-n> <informat>;
INFORMAT <variable-1> <…variable-n> <DEFAULT=default-informat>;
INFORMAT variable-1 <…variable-n> informat <DEFAULT=default-informat>;
Arguments

variable
specifies one or more variables to associate with an informat. You must specify at least one variable when specifying an informat or when including no other arguments. Specifying a variable is optional when using a DEFAULT= informat specification.
Tip: To disassociate an informat from a variable, use the variable’s name in an INFORMAT statement without specifying an informat. Place the INFORMAT statement after the SET statement. See Example 3 on page 1567.

informat
specifies the informat for reading the values of the variables that are listed in the INFORMAT statement.
Tip: If an informat is associated with a variable by using the INFORMAT statement, and that same informat is not associated with that same variable in the INPUT statement, then that informat will behave like informats that you specify with a colon (:) modifier in an INPUT statement. SAS reads the variables by using list input with an informat. For example, you can use the : modifier with an informat to read character values that are longer than eight bytes, or numeric values that contain nonstandard values. For details, see “INPUT Statement, List” on page 1589.
See Also: “Informats by Category” on page 1232
Featured in: Example 2 on page 1566

DEFAULT= default-informat
specifies a temporary default informat for reading values of the variables that are listed in the INFORMAT statement. If no variable is specified, then the DEFAULT= informat specification applies a temporary default informat for reading values of all the variables of that type included in the DATA step. Numeric informats are applied to numeric variables, and character informats are applied to character variables. These default informats apply only to the current DATA step.
A DEFAULT= informat specification applies to
- variables that are not named in an INFORMAT or ATTRIB statement
- variables that are not permanently associated with an informat within a SAS data set
- variables that are not read with an explicit informat in the current DATA step.
Default: If you omit DEFAULT=, SAS uses w.d as the default numeric informat and $w. as the default character informat.
Restriction: Use this argument only in a DATA step.
Tip: A DEFAULT= specification can occur anywhere in an INFORMAT statement. It can specify either a numeric default, a character default, or both.
Featured in: Example 1 on page 1566
Details

The Basics

An INFORMAT statement in a DATA step permanently associates an informat with a variable. You can specify standard SAS informats or user-written informats that were previously defined in PROC FORMAT. A single INFORMAT statement can associate the same informat with several variables, or it can associate different informats with different variables. If a variable appears in multiple INFORMAT statements, SAS uses the informat that is assigned last.

CAUTION: Because an INFORMAT statement defines the length of previously undefined character variables, you can truncate the values of character variables in a DATA step if an INFORMAT statement precedes a SET statement.
How SAS Treats Variables when You Assign Informats with the INFORMAT Statement

Informats that are associated with variables by using the INFORMAT statement behave like informats that are used with modified list input. SAS reads the variables by using the scanning feature of list input, but applies the informat. In modified list input, SAS
- does not use the value of w in an informat to specify column positions or input field widths in an external file
- uses the value of w in an informat to specify the length of previously undefined character variables
- ignores the value of w in numeric informats
- uses the value of d in an informat in the same way it usually does for numeric informats
- treats blanks that are embedded as input data as delimiters unless you change their status with a DLM= or DLMSTR= option specification in an INFILE statement.

If you have coded the INPUT statement to use another style of input, such as formatted input or column input, that style of input is not used when you use the INFORMAT statement.
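The modified list input behavior described above can be sketched with a short DATA step. This is an illustrative sketch under assumed names: the data set WORK.SALES and the variables REGION and AMOUNT are hypothetical. The COMMA10. informat is applied to each blank-delimited value, and its width is not used to locate the field.

```sas
data work.sales;
   /* The informat is applied as if the INPUT statement said amount :comma10. */
   informat amount comma10.;
   input region $ amount;      /* plain list input; values are not column aligned */
   datalines;
East 1,234
West 56,789
;
run;

proc print data=work.sales;
run;
```

Because SAS scans for the next blank-delimited value, the embedded commas are removed by the informat even though the two AMOUNT values start in different columns and have different widths.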
Comparisons

- Both the ATTRIB and INFORMAT statements can associate informats with variables, and both statements can change the informat that is associated with a variable. You can also use the INFORMAT statement in PROC DATASETS to change or remove the informat that is associated with a variable. The SAS windowing environment allows you to associate, change, or disassociate informats and variables in existing SAS data sets.
- SAS changes the descriptor information of the SAS data set that contains the variable. You can use an INFORMAT statement in some PROC steps, but the rules are different. See “The FORMAT Procedure” in Base SAS Procedures Guide for more information.
Examples

Example 1: Specifying Default Informats

This example uses an INFORMAT statement to associate a default numeric informat:

   data tstinfmt;
      informat default=3.1;
      input x;
      put x;
      datalines;
111
222
333
;

The PUT statement produces these results:

11.1
22.2
33.3
Example 2: Specifying Numeric and Character Informats
This example associates a character informat and a numeric informat with SAS variables. Although the character values do not fully occupy 15 column positions, the INPUT statement reads the data records correctly by using modified list input:

   data name;
      informat FirstName LastName $15. n1 6.2 n2 7.3;
      input firstname lastname n1 n2;
      datalines;
Alexander Robinson 35 11
;

   proc contents data=name;
   run;

   proc print data=name;
   run;
The following output shows a partial listing from PROC CONTENTS, as well as the report that PROC PRINT generates.

Output 6.11 Associating Numeric and Character Informats with SAS Variables

The SAS System                                 3

CONTENTS PROCEDURE

-----Alphabetic List of Variables and Attributes-----

#    Variable     Type    Len    Pos    Informat
------------------------------------------------
1    FirstName    Char     15     16    $15.
2    LastName     Char     15     31    $15.
3    n1           Num       8      0    6.2
4    n2           Num       8      8    7.3

The SAS System                                 4

OBS    FirstName    LastName      n1       n2
  1    Alexander    Robinson    0.35    0.011
Example 3: Removing an Informat
This example disassociates an existing informat. The order of the INFORMAT and SET statements is important: the INFORMAT statement must follow the SET statement.

   data rtest;
      set rtest;
      informat x;
   run;
See Also Statements: “ATTRIB Statement” on page 1400 “INPUT Statement” on page 1567 “INPUT Statement, List” on page 1589
INPUT Statement

Describes the arrangement of values in the input data record and assigns input values to the corresponding SAS variables.

Valid: in a DATA step
Category: File-handling
Type: Executable

Syntax

INPUT <specification(s)> <@|@@>;
Without Arguments

The INPUT statement with no arguments is called a null INPUT statement. The null INPUT statement
- brings an input data record into the input buffer without creating any SAS variables
- releases an input data record that is held by a trailing @ or a double trailing @.

For an example, see Example 2 on page 1579.
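A null INPUT statement is often paired with the _INFILE_ automatic variable to inspect raw records. The following is a minimal sketch with made-up data lines; it creates no data set and no variables of its own.

```sas
data _null_;
   input;            /* null INPUT: load the next record, create no variables */
   put _infile_;     /* write the contents of the input buffer to the SAS log */
   datalines;
first record
second record
;
run;
```

Each iteration of the DATA step loads one record into the input buffer and echoes it to the log.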
Arguments

specification(s)
can include

variable
names a variable that is assigned input values.

(variable-list)
specifies a list of variables that are assigned input values.
Requirement: The (variable-list) is followed by an (informat-list).
See Also: “How to Group Variables and Informats” on page 1587

$
specifies to store the variable value as a character value rather than as a numeric value.
Tip: If the variable is previously defined as character, $ is not required.
Featured in: Example 1 on page 1578

pointer-control
moves the input pointer to a specified line or column in the input buffer.
See: “Column Pointer Controls” on page 1569 and “Line Pointer Controls” on page 1571

column-specifications
specifies the columns of the input record that contain the value to read.
Tip: Informats are ignored. Only standard character and numeric data can be read correctly with this method.
See: “Column Input” on page 1572
Featured in: Example 1 on page 1578

format-modifier
allows modified list input or controls the amount of information that is reported in the SAS log when an error in an input value occurs.
Tip: Use modified list input to read data that cannot be read with simple list input.
See: “When to Use List Input” on page 1591
See: “Format Modifiers for Error Reporting” on page 1571
Featured in: Example 6 on page 1581

informat.
specifies an informat to use to read the variable value.
Tip: You can use modified list input to read data with informats. Modified list input is useful when the data require informats but cannot be read with formatted input because the values are not aligned in columns.
See: “Formatted Input” on page 1573 and “List Input” on page 1573
Featured in: Example 2 on page 1589

(informat-list)
specifies a list of informats to use to read the values for the preceding list of variables.
Restriction: The (informat-list) must follow the (variable-list).
See: “How to Group Variables and Informats” on page 1587
@
holds an input record for the execution of the next INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.
Restriction: The trailing @ must be the last item in the INPUT statement.
Tip: The trailing @ prevents the next INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
See Also: “Using Line-Hold Specifiers” on page 1574
Featured in: Example 3 on page 1579

@@
holds the input record for the execution of the next INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.
Restriction: The double trailing @ must be the last item in the INPUT statement.
Tip: The double trailing @ is useful when each input line contains values for several observations, or when a record needs to be reread on the next iteration of the DATA step.
See Also: “Using Line-Hold Specifiers” on page 1574
Featured in: Example 4 on page 1580
Column Pointer Controls

@n
moves the pointer to column n.
Range: a positive integer
Tip: If n is not an integer, SAS truncates the decimal value and uses only the integer value. If n is zero or negative, the pointer moves to column 1.
Example: @15 moves the pointer to column 15:
   input @15 name $10.;
Featured in: Example 7 on page 1581

@numeric-variable
moves the pointer to the column given by the value of numeric-variable.
Range: a positive integer
Tip: If numeric-variable is not an integer, SAS truncates the decimal value and uses only the integer value. If numeric-variable is zero or negative, the pointer moves to column 1.
Example: The value of the variable A moves the pointer to column 15:
   a=15;
   input @a name $10.;
Featured in: Example 5 on page 1580

@(expression)
moves the pointer to the column that is given by the value of expression.
Restriction: Expression must result in a positive integer.
Tip: If the value of expression is not an integer, SAS truncates the decimal value and uses only the integer value. If it is zero or negative, the pointer moves to column 1.
Example: The result of the expression moves the pointer to column 15:
   b=5;
   input @(b*3) name $10.;

@'character-string'
locates the specified series of characters in the input record and moves the pointer to the first column after character-string.

@character-variable
locates the series of characters in the input record that is given by the value of character-variable and moves the pointer to the first column after that series of characters.
Example: The following statement reads in the WEEKDAY character variable. The second @1 moves the pointer to the beginning of the input line. The value for SALES is read from the next non-blank column after the value of WEEKDAY:
   input @1 day 1. @5 weekday $10. @1 @weekday sales 8.2;
Featured in: Example 6 on page 1581

@(character-expression)
locates the series of characters in the input record that is given by the value of character-expression and moves the pointer to the first column after the series.
Featured in: Example 6 on page 1581

+n
moves the pointer n columns.
Range: a positive integer or zero
Tip: If n is not an integer, SAS truncates the decimal value and uses only the integer value. If the value is greater than the length of the input buffer, the pointer moves to column 1 of the next record.
Example: This statement moves the pointer to column 23, reads a value for LENGTH from columns 23 through 26, advances the pointer five columns, and reads a value for WIDTH from columns 32 through 35:
   input @23 length 4. +5 width 4.;
Featured in: Example 7 on page 1581

+numeric-variable
moves the pointer the number of columns that is given by the value of numeric-variable.
Range: a positive or negative integer or zero
Tip: If numeric-variable is not an integer, SAS truncates the decimal value and uses only the integer value. If numeric-variable is negative, the pointer moves backward. If the current column position becomes less than 1, the pointer moves to column 1. If the value is zero, the pointer does not move. If the value is greater than the length of the input buffer, the pointer moves to column 1 of the next record.
Featured in: Example 7 on page 1581

+(expression)
moves the pointer the number of columns given by expression.
Range: expression must result in a positive or negative integer or zero.
Tip: If expression is not an integer, SAS truncates the decimal value and uses only the integer value. If expression is negative, the pointer moves backward. If the current column position becomes less than 1, the pointer moves to column 1. If the value is zero, the pointer does not move. If the value is greater than the length of the input buffer, the pointer moves to column 1 of the next record.
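Absolute (@n) and relative (+n) column pointer controls can be combined in one INPUT statement. The sketch below uses a hypothetical data set WORK.PARTS with hypothetical variables ID, NAME, and QTY, and assumes that the data lines begin in column 1 so that the column positions in the comments hold.

```sas
data work.parts;
   /* @1 and @5 are absolute positions; +1 skips one column after NAME */
   input @1 id 3.        /* columns 1-3                                */
         @5 name $6.     /* columns 5-10, pointer then rests at 11     */
         +1 qty 2.;      /* pointer moves to 12, QTY is read from 12-13 */
   datalines;
101 bolt   12
102 washer 40
;
run;
```

The relative control is convenient when a field always follows the previous one by a fixed gap, while the absolute controls make the layout explicit.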
Line Pointer Controls

#n
moves the pointer to record n.
Range: a positive integer
Interaction: The N= option in the INFILE statement can affect the number of records the INPUT statement reads and the placement of the input pointer after each iteration of the DATA step. See the option N= on page 1547.
Example: The #2 moves the pointer to the second record to read the value for ID from columns 3 and 4:
   input name $10. #2 id 3-4;

#numeric-variable
moves the pointer to the record that is given by the value of numeric-variable.
Range: a positive integer
Tip: If the value of numeric-variable is not an integer, SAS truncates the decimal value and uses only the integer value.

#(expression)
moves the pointer to the record that is given by the value of expression.
Range: expression must result in a positive integer.
Tip: If the value of expression is not an integer, SAS truncates the decimal value and uses only the integer value.

/
advances the pointer to column 1 of the next input record.
Example: The values for NAME and AGE are read from the first input record before the pointer moves to the second record to read the value of ID from columns 3 and 4:
   input name age / id 3-4;
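A complete DATA step built around the / line pointer control might look like the following sketch. The data set name WORK.PEOPLE and the data values are made up for illustration, NAME is declared as character with $, and the data lines are assumed to begin in column 1.

```sas
data work.people;
   /* NAME and AGE come from the first record of each pair;   */
   /* / advances to the next record, where ID is in columns 3-4 */
   input name $ age / id 3-4;
   datalines;
Kim 34
  17
Lee 29
  23
;
run;
```

Each observation consumes two input records, so the four data lines are intended to yield two observations.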
Format Modifiers for Error Reporting

?
suppresses printing the invalid data note when SAS encounters invalid data values.
See Also: “How Invalid Data is Handled” on page 1577

??
suppresses printing the messages and the input lines when SAS encounters invalid data values. The automatic variable _ERROR_ is not set to 1 for the invalid observation.
See Also: “How Invalid Data is Handled” on page 1577
Details

When to Use INPUT

Use the INPUT statement to read raw data from an external file or in-stream data. If your data are stored in an external file, you can specify the file in an INFILE statement. The INFILE statement must execute before the INPUT statement that reads the data records. If your data are in-stream, a DATALINES statement must precede the data lines in the job stream. If your data contain semicolons, use a DATALINES4 statement before the data lines. A DATA step that reads raw data can include multiple INPUT statements.

You can also use the INFILE statement to read in-stream data by specifying a filename of DATALINES on the INFILE statement before the INPUT statement. Using DATALINES on the INFILE statement allows you to use most of the options available on the INFILE statement with in-stream data.

To read data that are already stored in a SAS data set, use a SET statement. To read database or PC file-format data that are created by other software, use the SET statement after you access the data with the LIBNAME statement. See the SAS/ACCESS documentation for more information.

Operating Environment Information: LOG files that are generated under z/OS and captured with PROC PRINTTO contain an ASA control character in column 1. If you are using the INPUT statement to read a LOG file that was generated under z/OS, you must account for this character if you use column input or column pointer controls.
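Specifying DATALINES as the file in an INFILE statement can be sketched as follows. The data set name WORK.AGES and the comma-delimited layout are assumptions made for this illustration; DLM= is one of the INFILE options that becomes available to in-stream data this way.

```sas
data work.ages;
   infile datalines dlm=',';   /* apply an INFILE option to in-stream data */
   input name $ age;
   datalines;
Peterson,21
Morgan,17
;
run;
```

Without the INFILE statement, list input would treat only blanks as delimiters and each line would be read as a single value.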
Input Styles

There are four ways to describe a record’s values in the INPUT statement:
- column
- list (simple and modified)
- formatted
- named.

Each variable value is read by using one of these input styles. An INPUT statement can contain any or all of the available input styles, depending on the arrangement of data values in the input records. However, once named input is used in an INPUT statement, you cannot use another input style.
Column Input

With column input, the column numbers follow the variable name in the INPUT statement. These numbers indicate where the variable values are found in the input data records:

   input name $ 1-8 age 11-12;

This INPUT statement can read the following data records:

----+----1----+----2----+
Peterson  21
Morgan    17
Because NAME is a character variable, a $ appears between the variable name and column numbers. For more information, see “INPUT Statement, Column” on page 1583.
List Input
With list input, the variable names are simply listed in the INPUT statement. A $ follows the name of each character variable:

   input name $ age;

This INPUT statement can read data values that are separated by blanks or aligned in columns (with at least one blank between):

----+----1----+----2----+
Peterson 21
Morgan 17

For more information, see “INPUT Statement, List” on page 1589.
Formatted Input
With formatted input, an informat follows the variable name in the INPUT statement. The informat gives the data type and the field width of an input value. Informats also allow you to read data that are stored in nonstandard form, such as packed decimal, or numbers that contain special characters such as commas.

   input name $char8. +2 income comma6.;

This INPUT statement reads these data records correctly:

----+----1----+----2----+
Peterson  21,000
Morgan    17,132

The pointer control of +2 moves the input pointer to the field that contains the value for the variable INCOME. For more information, see “INPUT Statement, Formatted” on page 1585.
Named Input
With named input, you specify the name of the variable followed by an equal sign. SAS looks for a variable name and an equal sign in the input record:

   input name= $ age=;

This INPUT statement reads the following data records correctly:

----+----1----+----2----+
name=Peterson age=21
name=Morgan age=17

For more information, see “INPUT Statement, Named” on page 1595.
Multiple Styles in a Single INPUT Statement

An INPUT statement can contain any or all of the different input styles:

   input idno name $18. team $ 25-30 startwght endwght;

This INPUT statement reads the following data records correctly:

----+----1----+----2----+----3----+---
023 David Shaw          red    189 165
049 Amelia Serrano      yellow 189 165

The values of IDNO, STARTWGHT, and ENDWGHT are read with list input, the value of NAME with formatted input, and the value of TEAM with column input.

Note: Once named input is used in an INPUT statement, you cannot change input styles.
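The mixed-style INPUT statement can be placed in a complete DATA step as sketched here. The data set name WORK.TEAMS is hypothetical, and the column alignment of the data lines is an assumption: NAME is taken to occupy columns 5 through 22 and TEAM columns 25 through 30, with the data lines starting in column 1.

```sas
data work.teams;
   /* IDNO: list input; NAME: formatted input; TEAM: column input;  */
   /* STARTWGHT and ENDWGHT: list input again                       */
   input idno name $18. team $ 25-30 startwght endwght;
   datalines;
023 David Shaw          red    189 165
049 Amelia Serrano      yellow 189 165
;
run;

proc print data=work.teams;
run;
```

After IDNO is read with list input, the pointer rests in column 5, which is where the $18. informat starts reading NAME.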
Pointer Controls

As SAS reads values from the input data records into the input buffer, it keeps track of its position with a pointer. The INPUT statement provides three ways to control the movement of the pointer:

column pointer controls
reset the pointer’s column position when the data values in the data records are read.

line pointer controls
reset the pointer’s line position when the data values in the data records are read.

line-hold specifiers
hold an input record in the input buffer so that another INPUT statement can process it. By default, the INPUT statement releases the previous record and reads another record.

With column and line pointer controls, you can specify an absolute line number or column number to move the pointer or you can specify a column or line location relative to the current pointer position. Table 6.6 on page 1574 lists the pointer controls that are available with the INPUT statement.

Table 6.6 Pointer Controls Available in the INPUT Statement

Pointer Controls           Relative             Absolute
column pointer controls    +n                   @n
                           +numeric-variable    @numeric-variable
                           +(expression)        @(expression)
                                                @'character-string'
                                                @character-variable
                                                @(character-expression)
line pointer controls      /                    #n
                                                #numeric-variable
                                                #(expression)
line-hold specifiers       @                    (not applicable)
                           @@                   (not applicable)

Note: Always specify pointer controls before the variable to which they apply.

You can use the COLUMN= and LINE= options in the INFILE statement to determine the pointer’s current column and line location.
Using Column and Line Pointer Controls

Column pointer controls indicate the column in which an input value starts. Use line pointer controls within the INPUT statement to move to the next input record or to define the number of input records per observation. Line pointer controls specify which input record to read. To read multiple data records into the input buffer, use the N= option in the INFILE statement to specify the number of records. If you omit N=, you need to take special precautions. For more information, see “Reading More Than One Record per Observation” on page 1576.

Using Line-Hold Specifiers

Line-hold specifiers keep the pointer on the current input record when
- a data record is read by more than one INPUT statement (trailing @)
- one input line has values for more than one observation (double trailing @)
- a record needs to be reread on the next iteration of the DATA step (double trailing @).

Use a single trailing @ to allow the next INPUT statement to read from the same record. Use a double trailing @ to hold a record for the next INPUT statement across iterations of the DATA step.

Normally, each INPUT statement in a DATA step reads a new data record into the input buffer. When you use a trailing @, the following occurs:
- The pointer position does not change.
- No new record is read into the input buffer.
- The next INPUT statement for the same iteration of the DATA step continues to read the same record rather than a new one.

SAS releases a record held by a trailing @ when
- a null INPUT statement executes:
     input;
- an INPUT statement without a trailing @ executes
- the next iteration of the DATA step begins.

Normally, when you use a double trailing @ (@@), the INPUT statement for the next iteration of the DATA step continues to read the same record. SAS releases the record that is held by a double trailing @
- immediately if the pointer moves past the end of the input record
- immediately if a null INPUT statement executes:
     input;
- when the next iteration of the DATA step begins if an INPUT statement with a single trailing @ executes later in the DATA step:
     input @;
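The two line-hold specifiers can be contrasted in a pair of short DATA steps. This is a sketch with hypothetical data sets (WORK.DETAILS, WORK.SCORES), variables, and data values; the record-type codes H and D are invented for the illustration.

```sas
/* Trailing @: read the record type first, then decide how to read the rest */
data work.details;
   input kind $ @;               /* hold the record for the next INPUT */
   if kind = 'H' then delete;    /* header record: end this iteration  */
   input item $ qty;             /* continues on the same held record  */
   datalines;
H ignore 0
D widget 4
D gadget 2
;
run;

/* Double trailing @: several observations on one input line */
data work.scores;
   input name $ score @@;        /* hold the record across iterations */
   datalines;
Ann 90 Bob 85 Cy 77
Dee 88
;
run;
```

In the first step the held record is released when the next iteration begins, so only the two D records are intended to become observations. In the second step each NAME and SCORE pair becomes its own observation, and SAS moves to the next line only when the pointer runs past the end of the current one.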
Pointer Location After Reading

Understanding the location of the input pointer after a value is read is important, especially if you combine input styles in a single INPUT statement. With column and formatted input, the pointer reads the columns that are indicated in the INPUT statement and stops in the next column. With list input, however, the pointer scans data records to locate data values and reads a blank to indicate that a value has ended. After reading a value with list input, the pointer stops in the second column after the value.

For example, you can read these data records with list, column, and formatted input:

----+----1----+----2----+----3
REGION1    49670
REGION2    97540
REGION3    86342

This INPUT statement uses list input to read the data records:

   input region $ jansales;

After reading a value for REGION, the pointer stops in column 9.

----+----1----+----2----+----3
REGION1    49670
        ^
These INPUT statements use column and formatted input to read the data records:
- column input
     input region $ 1-7 jansales 12-16;
- formatted input
     input region $7. +4 jansales 5.;
     input region $7. @12 jansales 5.;

To read a value for the variable REGION, the INPUT statements instruct the pointer to read seven columns and stop in column 8.

----+----1----+----2----+----3
REGION1    49670
       ^
Reading More Than One Record per Observation

The highest number that follows the # pointer control in the INPUT statement determines how many input data records are read into the input buffer. Use the N= option in the INFILE statement to change the number of records. For example, in this statement, the highest value after the # is 3:

   input @31 age 3. #3 id 3-4 #2 @6 name $20.;

Unless you use N= in the associated INFILE statement, the INPUT statement reads three input records each time the DATA step executes.

When each observation has multiple input records but values from the last record are not read, you must use a # pointer control in the INPUT statement or N= in the INFILE statement to specify the last input record. For example, if there are four records per observation, but only values from the first two input records are read, use this INPUT statement:

   input name $ 1-10 #2 age 13-14 #4;

When you have advanced to the next record with the / pointer control, use the # pointer control in the INPUT statement or the N= option in the INFILE statement to set the number of records that are read into the input buffer. To move the pointer back to an earlier record, use a # pointer control. For example, this statement requires the #2 pointer control, unless the INFILE statement uses the N= option, to read two records:

   input a / b #1 @52 c #2;

The INPUT statement assigns A a value from the first record. The pointer advances to the next input record to assign B a value. Then the pointer returns from the second record to column 1 of the first record and moves to column 52 to assign C a value. The #2 pointer control identifies two input records for each observation so that the pointer can return to the first record for the value of C.

If the number of input records per observation varies, use the N= option in the INFILE statement to give the maximum number of records per observation. For more information, see the N= option on page 1547.
Reading Past the End of a Line

When you use @ or + pointer controls with a value that moves the pointer to or past the end of the current record and the next value is to be read from the current column, SAS goes to column 1 of the next record to read it. It also writes this message to the SAS log:

NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
You can alter the default behavior (the FLOWOVER option) in the INFILE statement. Use the STOPOVER option in the INFILE statement to treat this condition as an error and to stop building the data set. Use the MISSOVER option in the INFILE statement to set the remaining INPUT statement variables to missing values if the pointer reaches the end of a record. Use the TRUNCOVER option in the INFILE statement to read column input or formatted input when the last variable that is read by the INPUT statement contains varying-length data.
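As a rough sketch of the difference between the default FLOWOVER behavior and MISSOVER, the two DATA steps below read the same in-stream data with list input. The data set names FLOW and MISS and the data values are invented for illustration.

```sas
/* Default FLOWOVER: the short second record takes its value of C */
/* from the start of the third record.                             */
data flow;
   input a b c;
   datalines;
1 2 3
4 5
6 7 8
;
run;

/* MISSOVER: C is set to missing for the short record instead. */
data miss;
   infile datalines missover;
   input a b c;
   datalines;
1 2 3
4 5
6 7 8
;
run;
```

Under these assumptions, FLOW should contain two observations (the second one built from two records) and MISS should contain three, with C missing in the second observation.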
Positioning the Pointer Before the Record When a column pointer control tries to move the pointer to a position before the beginning of the record, the pointer is positioned in column 1. For example, this INPUT statement specifies that the pointer is located in column −2 after the first value is read: data test; input a @(a-3) b; datalines; 2 ;
Therefore, SAS moves the pointer to column 1 after the value of A is read. Both variables A and B contain the same value.
How Invalid Data is Handled
When SAS encounters an invalid character in an input value for the variable indicated, it
- sets the value of the variable that is being read to missing or the value that is specified with the INVALIDDATA= system option. For more information see “INVALIDDATA= System Option” on page 1874.
- prints an invalid data note in the SAS log.
- prints the input line and column number that contains the invalid value in the SAS log. Unprintable characters appear in hexadecimal. To help determine column numbers, SAS prints a rule line above the input line.
- sets the automatic variable _ERROR_ to 1 for the current observation.
The format modifiers for error reporting control the amount of information that is printed in the SAS log. Both the ? and ?? modifiers suppress the invalid data message. However, the ?? modifier also resets the automatic variable _ERROR_ to 0. For example, these two sets of statements are equivalent:
- input x ?? 10-12;
- input x ? 10-12; _error_=0;
In either case, SAS sets invalid values of X to missing values. For information on the causes of invalid data, see SAS Language Reference: Concepts.
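A minimal sketch of the ?? modifier with column input follows. The data set name READINGS, the variable names, and the data values are hypothetical.

```sas
data readings;
   /* ?? suppresses the invalid data messages and leaves _ERROR_ at 0. */
   /* The text n/a in columns 5-8 is not a valid number, so VALUE is   */
   /* set to missing for that record.                                  */
   input id $ 1-3 value ?? 5-8;
   datalines;
A01 12.5
A02 n/a
A03 7
;
run;
```

Because the log stays quiet, this form is best reserved for fields that are known to contain placeholder text; otherwise genuine data errors can go unnoticed.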
End-of-File End-of-file occurs when an INPUT statement reaches the end of the data. If a DATA step tries to read another record after it reaches an end-of-file then execution stops. If you want the DATA step to continue to execute, use the END= or EOF= option in the INFILE statement. Then you can write SAS program statements to detect the end-of-file, and to stop the execution of the INPUT statement but continue with the DATA step. For more information, see “INFILE Statement” on page 1541.
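One way to use the END= option is sketched below. The variable name LAST and the data set name SUMMARY are hypothetical, and file-specification is a placeholder for an external file; see the INFILE statement for the restrictions that apply to END=.

```sas
data summary(keep=count total);
   /* LAST is set to 1 when the final record has been read. */
   infile file-specification end=last;
   input amount;
   count + 1;          /* sum statements retain their values */
   total + amount;     /* across iterations of the DATA step */
   if last then output;
run;
```

Here the DATA step writes a single observation after the last record, rather than stopping when an INPUT statement runs past the end of the file.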
Arrays The INPUT statement can use array references to read input data values. You can use an array reference in a pointer control if it is enclosed in parentheses. See Example 6 on page 1581. Use the array subscript asterisk (*) to input all elements of a previously defined explicit array. SAS allows single or multidimensional arrays. Enclose the subscript in braces, brackets, or parentheses. The form of this statement is INPUT array-name{*};
You can use arrays with list, column, or formatted input. However, you cannot input values to an array that is defined with _TEMPORARY_ and that uses the asterisk subscript. For example, these statements create variables X1 through X100 and assign data values to the variables using the 2. informat: array x{100}; input x{*} 2.;
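The asterisk subscript can also be combined with list input, as in this small sketch. The data set name TEMPS, the variable names, and the data values are invented for illustration.

```sas
data temps;
   array t{4} t1-t4;     /* defines T1 through T4 */
   input city $ t{*};    /* reads all four array elements with list input */
   datalines;
Cary 61 64 70 75
Oslo 30 33 41 50
;
run;
```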
Comparisons
- The INPUT statement reads raw data in external files or data lines that are entered in-stream (following the DATALINES statement) that need to be described to SAS. The SET statement reads a SAS data set, which already contains descriptive information about the data values.
- The INPUT statement reads data while the PUT statement writes data values, text strings, or both to the SAS log or to an external file.
- The INPUT statement can read data from external files; the INFILE statement points to that file and has options that control how that file is read.
Examples
Example 1: Using Multiple Styles of Input in One INPUT Statement
This example uses several input styles in a single INPUT statement:
data club1;
   input Idno Name $18. Team $ 25-30 Startwght Endwght;
   datalines;
023 David Shaw          red    189 165
049 Amelia Serrano      yellow 189 165
... more data lines ...
;

Variable                    Type of Input
Idno, Startwght, Endwght    list input
Name                        formatted input
Team                        column input
Example 2: Using a Null INPUT Statement This example uses an INPUT statement with no arguments. The DATA step copies records from the input file to the output file without creating any SAS variables: data _null_; infile file-specification-1; file file-specification-2; input; put _infile_; run;
Example 3: Holding a Record in the Input Buffer This example reads a file that contains two kinds of input data records and creates a SAS data set from these records. One type of data record contains information about a particular college course. The second type of record contains information about the students enrolled in the course. You need two INPUT statements to read the two records and to assign the values to different variables that use different formats. Records that contain class information have a C in column 1; records that contain student information have an S in column 1, as shown here: ----+----1----+----2----+ C HIST101 Watson S Williams 0459 S Flores 5423 C MATH202 Sen S Lee 7085
To know which INPUT statement to use, check each record as it is read. Use an INPUT statement that reads only the variable that tells whether the record contains class or student information.
data schedule(drop=type);
   infile file-specification;
   retain Course Professor;
   input type $1. @;
   if type='C' then input course $ professor $;
   else if type='S' then do;
      input Name $10. Id;
      output schedule;
   end;
run;
proc print;
run;
The first INPUT statement reads the TYPE value from column 1 of every line. Because this INPUT statement ends with a trailing @, the next INPUT statement in the DATA step reads the same line. The IF-THEN statements that follow check whether the record is a class or student line before another INPUT statement reads the rest of the line. The INPUT statements without a trailing @ release the held line. The RETAIN statement saves the values about the particular college course. The DATA step writes an observation to the SCHEDULE data set after a student record is read. The following output that PROC PRINT generates shows the resulting data set SCHEDULE.
Output 6.12 Data Set Schedule
The SAS System                                   1
OBS    Course     Professor    Name        Id
1      HIST101    Watson       Williams    459
2      HIST101    Watson       Flores      5423
3      MATH202    Sen          Lee         7085
Example 4: Holding a Record Across Iterations of the DATA Step
This example shows how to create multiple observations for each input data record. Each record contains several NAME and AGE values. The DATA step reads a NAME value and an AGE value, outputs an observation, and then reads another set of NAME and AGE values to output, and so on, until all the input values in the record are processed. data test; input name $ age @@; datalines; John 13 Monica 12 Sue 15 Stephen 10 Marc 22 Lily 17 ;
The INPUT statement uses the double trailing @ to control the input pointer across iterations of the DATA step. The SAS data set contains six observations.
Example 5: Positioning the Pointer with a Numeric Variable
This example uses a numeric variable to position the pointer. A raw data file contains records with the employment figures for several offices of a multinational company. The input data records are ----+----1----+----2----+----3----+ 8 New York 1 USA 14 5 Cary 1 USA 2274 3 Chicago 1 USA 37 22 Tokyo 5 ASIA 80 5 Vancouver 2 CANADA 6 9 Milano 4 EUROPE 123
The first column has the column position for the office location. The next numeric column is the region category. The geographic region occurs before the number of employees in that office. You determine the office location by combining the @numeric-variable pointer control with a trailing @. To read the records, use two INPUT statements. The first INPUT statement obtains the value for the @ numeric-variable pointer control. The second INPUT statement uses this value to determine the column that the pointer moves to.
data office (drop=x);
   infile file-specification;
   input x @;
   if 1
(variable-list) (informat-list) ; INPUT (variable-list) ( informat.) ;
Arguments
pointer-control
moves the input pointer to a specified line or column in the input buffer.
See: “Column Pointer Controls” on page 1569 and “Line Pointer Controls” on page 1571
variable
specifies a variable that is assigned input values.
Requirement: The (variable-list) is followed by an (informat-list).
Featured in: Example 1 on page 1588
(variable-list)
specifies a list of variables that are assigned input values.
See: “How to Group Variables and Informats” on page 1587
Featured in: Example 2 on page 1589
informat.
specifies a SAS informat to use to read the variable values.
Tip: Decimal points in the actual input values override decimal specifications in a numeric informat.
See Also: Chapter 5, “Informats,” on page 1215
Featured in: Example 1 on page 1588
(informat-list)
specifies a list of informats to use to read the values for the preceding list of variables. In the INPUT statement, (informat-list) can include
- informat. specifies an informat to use to read the variable values.
- pointer-control specifies one of these pointer controls to use to position a value: @, #, /, or +.
- n* specifies to repeat n times the next informat in an informat list.
  Example: This statement uses the 7.2 informat to read GRADES1, GRADES2, and GRADES3 and the 5.2 informat to read GRADES4 and GRADES5:
  input (grades1-grades5)(3*7.2, 2*5.2);
Restriction: The (informat-list) must follow the (variable-list).
See: “How to Group Variables and Informats” on page 1587
Featured in: Example 2 on page 1589
@
holds an input record for the execution of the next INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.
Restriction: The trailing @ must be the last item in the INPUT statement.
Tip: The trailing @ prevents the next INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
See: “Using Line-Hold Specifiers” on page 1574
@@
holds an input record for the execution of the next INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.
Restriction: The double trailing @ must be the last item in the INPUT statement.
Tip: The double trailing @ is useful when each input line contains values for several observations.
See: “Using Line-Hold Specifiers” on page 1574
Details
When to Use Formatted Input
With formatted input, an informat follows a variable name and defines how SAS reads the values of this variable. An informat gives the data type and the field width of an input value. Informats also read data that are stored in nonstandard form, such as packed decimal, or numbers that contain special characters such as commas.* See “Definition of Informats” on page 1217 for descriptions of SAS informats. Simple formatted input requires that the variables be in the same order as their corresponding values in the input data. You can use pointer controls to read variables in any order. For more information, see “INPUT Statement” on page 1567.
Missing Values
Generally, SAS represents missing values in formatted input with a single period for a numeric value and with blanks for a character value. The informat that you use with formatted input determines how SAS interprets a blank. For example, $CHAR.w reads the blanks as part of the value, whereas BZ.w converts a blank to zero.
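The effect of the informat on blanks can be sketched by reading the same columns more than once. The data set name BLANKS, the variable names, and the data line are invented; the data line has 1 in column 1, blanks in columns 2-6, and AB in columns 7-8.

```sas
data blanks;
   input @1 plain   3.        /* standard numeric: blanks are ignored     */
         @1 zeros   bz3.      /* BZw.: trailing blanks are read as zeros  */
         @5 kept    $char4.   /* $CHARw.: leading blanks are preserved    */
         @5 trimmed $4.;      /* $w.: leading blanks are removed          */
   datalines;
1     AB
;
run;
```

With this data line, PLAIN should be 1 and ZEROS should be 100, while KEPT should hold two leading blanks followed by AB and TRIMMED should hold AB left-aligned.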
Reading Variable-Length Records
By default, SAS uses the FLOWOVER option to read varying-length data records. If the record contains fewer values than expected, the INPUT statement reads the values from the next data record. To read varying-length data, you might need to use the TRUNCOVER option in the INFILE statement. For more information, see “Reading Past the End of a Line” on page 1553.
* See SAS Language Reference: Concepts for information on standard and nonstandard data values.
How to Group Variables and Informats
When the input values are arranged in a pattern, you can group the informat list. A grouped informat list consists of two lists:
- the names of the variables to read enclosed in parentheses
- the corresponding informats separated by either blanks or commas and enclosed in parentheses.
Informat lists can make an INPUT statement shorter because the informat list is recycled until all variables are read and the numbered variable names can be used in abbreviated form. Using informat lists avoids listing the individual variables. For example, if the values for the five variables SCORE1 through SCORE5 are stored as four columns per value without intervening blanks, this INPUT statement reads the values:
input (score1-score5) (4. 4. 4. 4. 4.);
However, if you specify more variables than informats, the INPUT statement reuses the informat list to read the remaining variables. A shorter version of the previous statement is input (score1-score5) (4.);
You can use as many informat lists as necessary in an INPUT statement, but do not nest the informat lists. After all the values in the variable list are read, the INPUT statement ignores any directions that remain in the informat list. For an example, see Example 3 on page 1589. The n* modifier in an informat list specifies to repeat the next informat n times. For example, input (name score1-score5) ($10. 5*4.);
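The recycling of a short informat list can be tried with in-stream data, as in this sketch. The data set name QUIZ and the data values are invented; each score occupies exactly four columns, right-aligned.

```sas
data quiz;
   /* The single 4. informat is reused for all five variables. */
   input (score1-score5) (4.);
   datalines;
 121 114 137 156 142
 111  97 122 143 127
;
run;
```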
How to Store Informats The informats that you specify in the INPUT statement are not stored with the SAS data set. Informats that you specify with the INFORMAT or ATTRIB statement are permanently stored. Therefore, you can read a data value with a permanently stored informat in a later DATA step without having to specify the informat or use PROC FSEDIT to enter data in the correct format.
Comparisons When a variable is read with formatted input, the pointer movement is similar to the pointer movement of column input. The pointer moves the length that the informat specifies and stops at the next column. To read data with informats that are not aligned in columns, use modified list input. Using modified list input allows you to take advantage of the scanning feature in list input. See “When to Use List Input” on page 1591.
Examples Example 1: Formatted Input with Pointer Controls
This INPUT statement uses informats and pointer controls:
data sales;
   infile file-specification;
   input item $10. +5 jan comma5. +5 feb comma5. +5 mar comma5.;
run;
It can read these input data records: ----+----1----+----2----+----3----+----4 trucks 1,382 2,789 3,556 vans 1,265 2,543 3,987 sedans 2,391 3,011 3,658
The value for ITEM is read from the first 10 columns in a record. The pointer stops in column 11. The trailing blanks are discarded and the value of ITEM is written to the program data vector. Next, the pointer moves five columns to the right before the INPUT statement uses the COMMA5. informat to read the value of JAN. This informat uses five as the field width to read numeric values that contain a comma. Once again, the pointer moves five columns to the right before the INPUT statement uses the COMMA5. informat to read the values of FEB and MAR.
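The same INPUT statement can be exercised without an external file by supplying the records in-stream. In this sketch, the column layout is an assumption derived from the description above: ITEM occupies columns 1-10, and the three amounts occupy columns 16-20, 26-30, and 36-40.

```sas
data sales;
   input item $10. +5 jan comma5. +5 feb comma5. +5 mar comma5.;
   datalines;
trucks         1,382     2,789     3,556
vans           1,265     2,543     3,987
sedans         2,391     3,011     3,658
;
run;

proc print data=sales;
run;
```

If a value is shifted by even one column, the COMMA5. informat reads the wrong five columns, which is the main trade-off of formatted input compared with list input.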
Example 2: Using Informat Lists
This INPUT statement uses the character informat $10. to read the values of the variable NAME and uses the numeric informat 4. to read the values of the five variables SCORE1 through SCORE5: data scores; input (name score1-score5) ($10. 5*4.); datalines; Whittaker 121 114 137 156 142 Smythe 111 97 122 143 127 ;
Example 3: Including More Informat Specifications Than Necessary
This informat list includes more specifications than are necessary when the INPUT statement executes: data test; input (x y z) (2.,+1); datalines; 2 24 36 0 20 30 ;
The INPUT statement reads the value of X with the 2. informat. Then, the +1 column pointer control moves the pointer forward one column. Next, the value of Y is read with the 2. informat. Again, the +1 column pointer moves the pointer forward one column. Then, the value of Z is read with the 2. informat. For the third iteration, the INPUT statement ignores the +1 pointer control.
See Also Statements: “INPUT Statement” on page 1567 “INPUT Statement, List” on page 1589
INPUT Statement, List
Scans the input data record for input values and assigns them to the corresponding SAS variables.
Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
INPUT <pointer-control> variable <$> <&> <@ | @@>;
INPUT <pointer-control> variable <:|&|~> <informat.> <@ | @@>;
Arguments
pointer-control
moves the input pointer to a specified line or column in the input buffer.
See: “Column Pointer Controls” on page 1569 and “Line Pointer Controls” on page 1571
Featured in: Example 2 on page 1593
variable
specifies a variable that is assigned input values.
$
indicates to store a variable value as a character value rather than as a numeric value.
Tip: If the variable is previously defined as character, $ is not required.
Featured in: Example 1 on page 1593
&
indicates that a character value can have one or more single embedded blanks. This format modifier reads the value from the next non-blank column until the pointer reaches two consecutive blanks, the defined length of the variable, or the end of the input line, whichever comes first.
Restriction: The & modifier must follow the variable name and $ sign that it affects.
Tip: If you specify an informat after the & modifier, the terminating condition for the format modifier remains two blanks.
See: “Modified List Input” on page 1592
Featured in: Example 2 on page 1593
:
enables you to specify an informat that the INPUT statement uses to read the variable value. For a character variable, this format modifier reads the value from the next non-blank column until the pointer reaches the next blank column, the defined length of the variable, or the end of the data line, whichever comes first. For a numeric variable, this format modifier reads the value from the next non-blank column until the pointer reaches the next blank column or the end of the data line, whichever comes first.
Tip: If the length of the variable has not been previously defined, then its value is read and stored with the informat length.
Tip: The pointer continues to read until the next blank column is reached. However, if the field is longer than the formatted length, then the value is truncated to the length of the variable.
See: “Modified List Input” on page 1592
Featured in: Example 3 on page 1594 and Example 5 on page 1594
~
indicates to treat single quotation marks, double quotation marks, and delimiters in character values in a special way. This format modifier reads delimiters within quoted character values as characters instead of as delimiters and retains the quotation marks when the value is written to a variable.
Restriction: You must use the DSD option in an INFILE statement. Otherwise, the INPUT statement ignores this option.
See: “Modified List Input” on page 1592
Featured in: Example 5 on page 1594
informat.
specifies an informat to use to read the variable values.
Tip: Decimal points in the actual input values always override decimal specifications in a numeric informat.
See Also: “Definition of Informats” on page 1217
Featured in: Example 3 on page 1594 and Example 5 on page 1594
@
holds an input record for the execution of the next INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.
Restriction: The trailing @ must be the last item in the INPUT statement.
Tip: The trailing @ prevents the next INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
See: “Using Line-Hold Specifiers” on page 1574
@@
holds an input record for the execution of the next INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.
Restriction: The double trailing @ must be the last item in the INPUT statement.
Tip: The double trailing @ is useful when each input line contains values for several observations.
See: “Using Line-Hold Specifiers” on page 1574
Details When to Use List Input
List input requires that you specify the variable names in the INPUT statement in the same order that the fields appear in the input data records. SAS scans the data line to locate the next value but ignores additional intervening blanks. List input does not require that the data are located in specific columns. However, you must separate each value from the next by at least one blank unless the delimiter between values is changed. By default, the delimiter for data values is one blank space or the end of the input record. List input will not skip over any data values to read subsequent values, but it can ignore all values after a given point in the data record. However, pointer controls enable you to change the order that the data values are read. There are two types of list input:
- simple list input
- modified list input.
Modified list input makes the INPUT statement more versatile because you can use a format modifier to overcome several of the restrictions of simple list input. See “Modified List Input” on page 1592.
Simple List Input
Simple list input places several restrictions on the type of data that the INPUT statement can read:
- By default, at least one blank must separate the input values. Use the DLM= or DLMSTR= option or the DSD option in the INFILE statement to specify a delimiter other than a blank.
- Represent each missing value with a period, not a blank, or two adjacent delimiters.
- Character input values cannot be longer than 8 bytes unless the variable is given a longer length in an earlier LENGTH, ATTRIB, or INFORMAT statement.
- Character values cannot contain embedded blanks unless you change the delimiter.
- Data must be in standard numeric or character format.*
Modified List Input
List input is more versatile when you use format modifiers. The format modifiers are as follows:
Format Modifier    Purpose
&                  reads character values that contain embedded blanks.
:                  reads data values that need the additional instructions that informats can provide but that are not aligned in columns. **
~                  reads delimiters within quoted character values as characters and retains the quotation marks.
** Use formatted input and pointer controls to quickly read data values that are aligned in columns.
For example, use the : modifier with an informat to read character values that are longer than 8 bytes or numeric values that contain nonstandard values. Because list input interprets a blank as a delimiter, use modified list input to read values that contain blanks. The & modifier reads character values that contain single embedded blanks. However, the data values must be separated by two or more blanks. To read values that contain leading, trailing, or embedded blanks with list input, use the DLM= or DLMSTR= option in the INFILE statement to specify another character as the delimiter. See Example 5 on page 1594. If your input data use blanks as delimiters and they contain leading, trailing, or embedded blanks, you might need to use either column input or formatted input. If quotation marks surround the delimited values, you can use list input with the DSD option in the INFILE statement.
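The & and : modifiers described above can be combined in one INPUT statement, as in this sketch. The data set name STAFF, the variable names, and the data values are invented; note the two blanks that end each NAME value.

```sas
data staff;
   /* & lets NAME contain single embedded blanks; two blanks end the value. */
   /* : applies the COMMA10. informat to a value that ends at the next      */
   /* blank, so the amounts do not have to be aligned in columns.           */
   input name & $20. salary : comma10.;
   datalines;
Ana Maria Lopez  52,300
Li Wei  1,204,000
;
run;
```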
Comparisons How Modified List Input and Formatted Input Differ
Modified list input has a scanning feature that can use informats to read data which are not aligned in columns. Formatted input causes the pointer to move like that of column input to read a variable value. The pointer moves the length that is specified in the informat and stops at the next column. This DATA step uses modified list input to read the first data value and formatted input to read the second: data jansales; input item : $10. amount comma5.; datalines; trucks 1,382 vans 1,235 sedans 2,391 ;
The value of ITEM is read with modified list input. The INPUT statement stops reading when the pointer finds a blank space. The pointer then moves to the second column after the end of the field, which is the correct position to read the AMOUNT value with formatted input. Formatted input, on the other hand, continues to read the entire width of the field. This INPUT statement uses formatted input to read both data values:
input item $10. +1 amount comma5.;
* See SAS Language Reference: Concepts for information about standard and nonstandard data values.
To read this data correctly with formatted input, the second data value must occur after the 10th column of the first value, as shown here:
----+----1----+----2
trucks     1,382
vans       1,235
sedans     2,391
Also, after the value of ITEM is read with formatted input, you must use the pointer control +1 to move the pointer to the column where the value AMOUNT begins.
When Data Contains Quotation Marks When you use the DSD option in an INFILE statement, which sets the delimiter to a comma, the INPUT statement removes quotation marks before a value is written to a variable. When you also use the tilde (~) modifier in an INPUT statement, the INPUT statement maintains quotation marks as part of the value.
Examples
Example 1: Reading Unaligned Data with Simple List Input The INPUT statement in this DATA step uses simple list input to read the input data records: data scores; input name $ score1 score2 score3 team $; datalines; Joe 11 32 76 red Mitchel 13 29 82 blue Susan 14 27 74 green ;
The next INPUT statement reads only the first four fields in the previous data lines, which demonstrates that you are not required to read all the fields in the record: input name $ score1 score2 score3;
Example 2: Reading Character Data That Contains Embedded Blanks
The INPUT statement in this DATA step uses the & format modifier with list input to read character values that contain embedded blanks. data list; infile file-specification; input name $ & score; run;
It can read these input data records: ----+----1----+----2----+----3----+ Joseph 11 Joergensen red Mitchel 13 Mc Allister blue Su Ellen 14 Fischer-Simon green
The & modifier follows the variable it affects in the INPUT statement. Because this format modifier follows NAME, at least two blanks must separate the NAME field from the SCORE field in the input data records. You can also specify an informat with a format modifier, as shown here:
input name $ & +3 lastname & $15. team $;
In addition, this INPUT statement reads the same data to demonstrate that you are not required to read all the values in an input record. The +3 column pointer control moves the pointer past the score value in order to read the value for LASTNAME and TEAM.
Example 3: Reading Unaligned Data with Informats
This DATA step uses modified list input to read data values with an informat:
data jansales;
   input item : $10. amount;
   datalines;
trucks 1382
vans 1235
sedans 2391
;
The $10. informat allows a character variable of up to ten characters to be read.
Example 4: Reading Comma-Delimited Data with List Input and an Informat
This DATA step uses the DELIMITER= option in the INFILE statement to read list input values that are separated by commas instead of blanks. The example uses an informat to read the date, and a format to write the date.
options pageno=1 nodate ls=80 ps=64;
data scores2;
   length Team $ 14;
   infile datalines delimiter=',';
   input Name $ Score1-Score3 Team $ Final_Date:MMDDYY10.;
   format final_date weekdate17.;
   datalines;
Joe,11,32,76,Red Racers,2/3/2007
Mitchell,13,29,82,Blue Bunnies,4/5/2007
Susan,14,27,74,Green Gazelles,11/13/2007
;
proc print data=scores2;
   var Name Team Score1-Score3 Final_Date;
   title 'Soccer Player Scores';
run;
Output 6.14 Output from Comma-Delimited Data
Soccer Player Scores                                                        1
Obs    Name        Team              Score1    Score2    Score3    Final_Date
1      Joe         Red Racers          11        32        76      Sat, Feb 3, 2007
2      Mitchell    Blue Bunnies        13        29        82      Thu, Apr 5, 2007
3      Susan       Green Gazelles      14        27        74      Tue, Nov 13, 2007
Example 5: Reading Delimited Data with Modified List Input
This DATA step uses the DSD option in an INFILE statement and the tilde (~) format modifier in an INPUT statement to retain the quotation marks in character data and to read a character in a string that is enclosed in quotation marks as a character instead of as a delimiter.
data scores;
   infile datalines dsd;
   input Name : $9. Score1-Score3 Team ~ $25. Div $;
   datalines;
Joseph,11,32,76,"Red Racers, Washington",AAA
Mitchel,13,29,82,"Blue Bunnies, Richmond",AAA
Sue Ellen,14,27,74,"Green Gazelles, Atlanta",AA
;
The output that PROC PRINT generates shows the resulting SCORES data set. The values for TEAM contain the quotation marks.
Output 6.15 SCORES Data Set
The SAS System                                                              1
OBS    Name         Score1    Score2    Score3    Team                         Div
1      Joseph         11        32        76      "Red Racers, Washington"     AAA
2      Mitchel        13        29        82      "Blue Bunnies, Richmond"     AAA
3      Sue Ellen      14        27        74      "Green Gazelles, Atlanta"    AA
See Also Statements: “INFILE Statement” on page 1541 “INPUT Statement” on page 1567 “INPUT Statement, Formatted” on page 1585
INPUT Statement, Named
Reads data values that appear after a variable name that is followed by an equal sign and assigns them to corresponding SAS variables.
Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
INPUT <pointer-control> variable= <$> <@ | @@>;
INPUT <pointer-control> variable= informat. <@ | @@>;
INPUT variable= <$> start-column <-end-column> <@ | @@>;
Arguments
pointer-control
moves the input pointer to a specified line or column in the input buffer.
See: “Column Pointer Controls” on page 1569 and “Line Pointer Controls” on page 1571
variable=
specifies a variable whose value is read by the INPUT statement. In the input data record, the field has the form variable=value
Featured in: Example 3 on page 1598
$
indicates to store a variable value as a character value rather than as a numeric value.
Tip: If the variable is previously defined as character, $ is not required.
Featured in: Example 3 on page 1598
informat.
specifies an informat that indicates the data type of the input values, but not how the values are read.
Tip: Use the INFORMAT statement to associate an informat with a variable.
See: Chapter 5, “Informats,” on page 1215
Featured in: Example 3 on page 1598
start-column
specifies the column that the INPUT statement uses to begin scanning in the input data records for the variable. The variable name does not have to begin here.
-end-column
determines the default length of the variable.
@
holds an input record for the execution of the next INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.
Restriction: The trailing @ must be the last item in the INPUT statement.
Tip: The trailing @ prevents the next INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
See: “Using Line-Hold Specifiers” on page 1574
@@
holds an input record for the execution of the next INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.
Restriction: The double trailing @ must be the last item in the INPUT statement.
Tip: The double trailing @ is useful when each input line contains values for several observations.
See: “Using Line-Hold Specifiers” on page 1574
Details

When to Use Named Input
Named input reads the input data records that contain a variable name followed by an equal sign and a value for the variable. The INPUT statement reads the input data record at the current location of the input pointer. If the input data records contain data values at the start of the record that the INPUT statement cannot read with named input, use another input style to read them. However, once the INPUT statement starts to read named input, SAS expects that all the remaining values are in this form. See Example 3 on page 1598.

You do not have to specify the variables in the INPUT statement in the same order that they occur in the data records. Also, you do not have to specify a variable for each field in the record. However, if you do not specify a variable in the INPUT statement that another statement uses (for example, ATTRIB, FORMAT, INFORMAT, LENGTH statement) and it occurs in the input data record, the INPUT statement automatically reads the value. SAS writes a note to the log that the variable is uninitialized.

When you do not specify a variable for all the named input data values, SAS sets _ERROR_ to 1 and writes a note to the log. For example,

   data list;
      input name=$ age=;
      datalines;
   name=John age=34 gender=M
   ;

The note that SAS writes to the log states that GENDER is not defined and _ERROR_ is set to 1.
Restrictions
- After you start to read with named input, you cannot switch to another input style or use pointer controls. All the remaining values in the input data record must be in the form variable=value. SAS treats the values that are not in named input form as invalid data.
- If named input values continue after the end of the current input line, use a slash (/) at the end of the input line. The slash tells SAS to move the pointer to the next line and to continue to read with named input. For example,
     input name=$ age=;
  can read this input data record:
     name=John /
     age=34
- If you use named input to read character values that contain embedded blanks, put two blanks before and after the data value, as you would with list input. See Example 4 on page 1598.
- You cannot reference an array with an asterisk or an expression subscript.
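The slash restriction above can be sketched as a complete DATA step. This is a minimal illustration, not taken from the manual: the data set name CONTD is hypothetical, and the record is assumed to be split across two data lines with the slash closing the first one.

```sas
/* Hypothetical data set name; the record continues on a second line. */
data contd;
   input name=$ age=;
   datalines;
name=John /
age=34
;
run;

/* Inspect the single observation that was read. */
proc print data=contd;
run;
```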
Examples

Example 1: Using List and Named Input
This DATA step uses list input with named input to read input data records.

   data list;
      length name $ 20 gender $ 1;
      informat dob ddmmyy8.;
      input id name= gender= age= dob=;
      datalines;
   4798 name=COLIN gender=m age=23 dob=16/02/75
   2653 name=MICHELE gender=f age=46 dob=17/02/73
   ;
   proc print data=list;
   run;
The INPUT statement uses list input to read the ID variable. The remaining variables NAME, GENDER, AGE, and DOB are read with named input. The LENGTH statement prevents the INPUT statement from truncating the character values for the variable name to a length of eight.
Example 2: Using Named Input with Variables in Random Order
Using the same data as in the previous example, this DATA step also uses list input and named input to read input data records. However, in this example, the order of the values in the data is different for the two rows, except for the ID value, which must come first.

   data list;
      length name $ 20 gender $ 1;
      informat dob ddmmyy8.;
      input id dob= name= age= gender=;
      datalines;
   4798 gender=m name=COLIN age=23 dob=16/02/75
   2653 name=MICHELE dob=17/02/73 age=46 gender=f
   ;
   proc print data=list;
   run;
Example 3: Using Named Input with Another Input Style
This DATA step uses list input and named input to read input data records:

   data list;
      input id name=$20. gender=$;
      informat dob ddmmyy8.;
      datalines;
   4798 gender=m name=COLIN age=23 dob=16/02/75
   2653 name=MICHELE age=46 gender=f
   ;
   proc print data=list;
   run;
The INPUT statement uses list input to read the first variable, ID. The remaining variables NAME, GENDER, and DOB are read with named input. These variables are not read in order. The $20. informat with NAME= prevents the INPUT statement from truncating the character value to a length of eight. The INPUT statement reads the DOB= field because the INFORMAT statement refers to this variable. It skips the AGE= field altogether. SAS writes notes to the log that DOB is uninitialized, AGE is not defined, and _ERROR_ is set to 1.
Example 4: Reading Character Variables with Embedded Blanks
This DATA step reads character variables that contain embedded blanks with named input:

   data list2;
      informat header $30. name $15.;
      input header= name=;
      datalines;
   header=  age=60 AND UP  name=PHILIP
   ;
Two spaces precede and follow the value of the variable HEADER, which is AGE=60 AND UP. The field also contains an equal sign.
See Also Statement:
“INPUT Statement” on page 1567
KEEP Statement
Specifies the variables to include in output SAS data sets.
Valid: in a DATA step
Category: Information
Type: Declarative

Syntax
KEEP variable-list;

Arguments
variable-list
   specifies the names of the variables to write to the output data set.
   Tip: List the variables in any form that SAS allows.
Details
The KEEP statement causes a DATA step to write only the variables that you specify to one or more SAS data sets. The KEEP statement applies to all SAS data sets that are created within the same DATA step and can appear anywhere in the step. If no KEEP or DROP statement appears, all data sets that are created in the DATA step contain all variables.
Note: Do not use both the KEEP and DROP statements within the same DATA step.
Comparisons
- The KEEP statement cannot be used in SAS PROC steps. The KEEP= data set option can.
- The KEEP statement applies to all output data sets that are named in the DATA statement. To write different variables to different data sets, you must use the KEEP= data set option.
- The DROP statement is a parallel statement that specifies variables to omit from the output data set.
- The KEEP and DROP statements select variables to include in or exclude from output data sets. The subsetting IF statement selects observations.
- Do not confuse the KEEP statement with the RETAIN statement. The RETAIN statement causes SAS to hold the value of a variable from one iteration of the DATA step to the next iteration. The KEEP statement does not affect the value of variables but only specifies which variables to include in any output data sets.
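As a hedged illustration of the second comparison, the following sketch writes different variables to two data sets with the KEEP= data set option. The data set names NAMES and AVERAGES and the input values are made up for the example.

```sas
/* Each output data set gets its own variable list through KEEP=. */
data names(keep=name) averages(keep=name avg);
   input name $ score1-score3;
   avg=mean(of score1-score3);
   datalines;
Ann 90 80 70
Bob 60 75 85
;
run;

proc print data=names;      /* contains NAME only */
run;

proc print data=averages;   /* contains NAME and AVG */
run;
```

A KEEP statement in the same step would instead apply one variable list to both NAMES and AVERAGES.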
Examples
- These examples show the correct syntax for listing variables in the KEEP statement:
     keep name address city state zip phone;
     keep rep1-rep5;
- This example uses the KEEP statement to include only the variables NAME and AVG in the output data set. The variables SCORE1 through SCORE20, from which AVG is calculated, are not written to the data set AVERAGE.
     data average;
        keep name avg;
        infile file-specification;
        input name $ score1-score20;
        avg=mean(of score1-score20);
     run;
See Also
Data Set Option:
   “KEEP= Data Set Option” on page 35
Statements:
   “DROP Statement” on page 1449
   “IF Statement, Subsetting” on page 1531
   “RETAIN Statement” on page 1694
LABEL Statement
Assigns descriptive labels to variables.
Valid: in a DATA step
Category: Information
Type: Declarative

Syntax
LABEL variable-1=label-1 . . . <variable-n=label-n>;
LABEL variable-1=' ' … ;
Arguments
variable
   specifies the variable that you want to label.
   Tip: You can specify additional pairs of labels and variables.
label
   specifies a label of up to 256 characters, including blanks.
   Tip: You can specify additional pairs of labels and variables.
   Tip: For more information about including quotation marks as part of the label, see “Character Constants” in SAS Language Reference: Concepts.
   Restriction: If the label includes a semicolon (;) or an equal sign (=), you must enclose the label in either single or double quotation marks.
   Restriction: If the label includes single quotation marks ('), then you must enclose the label in double quotation marks.
' '
   removes a label from a variable. Enclose a single blank space in quotation marks to remove an existing label.
Details
Using a LABEL statement in a DATA step permanently associates labels with variables by affecting the descriptor information of the SAS data set that contains the variables. You can associate any number of variables with labels in a single LABEL statement. You can use a LABEL statement in a PROC step, but the rules are different. See the Base SAS Procedures Guide for more information.

Comparisons
Both the ATTRIB and LABEL statements can associate labels with variables and change a label that is associated with a variable.
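A small sketch of this comparison, using a hypothetical data set DRUGS and made-up values: one variable is labeled with the LABEL statement and the other with the ATTRIB statement, and PROC CONTENTS shows both labels in the descriptor information.

```sas
data drugs;
   input compound $ dose;
   label compound='Type of Drug';    /* LABEL statement */
   attrib dose label='Dose in mg';   /* ATTRIB statement assigns a label the same way */
   datalines;
aspirin 100
;
run;

/* The Label column lists both labels. */
proc contents data=drugs;
run;
```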
Examples

Example 1: Specifying Labels
Here are several LABEL statements:
- label compound=Type of Drug;
- label date="Today's Date";
- label n='Mark''s Experiment Number';
- label score1="Grade on April 1 Test"
        score2="Grade on May 1 Test";

Example 2: Removing a Label
This example removes an existing label:

   data rtest;
      set rtest;
      label x=' ';
   run;
See Also Statement: “ATTRIB Statement” on page 1400
Labels, Statement
Identifies a statement that is referred to by another statement.
Valid: in a DATA step
Category: Control
Type: Declarative

Syntax
label: statement;

Arguments
label
   specifies any SAS name, which is followed by a colon (:). You must specify the label argument.
statement
   specifies any executable statement, including a null statement (;). You must specify the statement argument.
   Restriction: No two statements in a DATA step can have the same label.
   Restriction: If a statement in a DATA step is labeled, it should be referenced by a statement or option in the same step.
   Tip: A null statement can have a label:
      ABC:;
Details The statement label identifies the destination of either a GO TO statement, a LINK statement, the HEADER= option in a FILE statement, or the EOF= option in an INFILE statement.
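The following is a minimal sketch of a statement label used as the destination of a LINK statement. The data set name SCORES, the label FIXUP, and the data values are hypothetical.

```sas
data scores;
   input name $ raw;
   if raw < 0 then link fixup;   /* branch to the statement labeled FIXUP */
   return;                       /* end the iteration; the observation is written */
fixup:
   raw = 0;                      /* the labeled statement */
   return;                       /* go back to the statement that follows LINK */
   datalines;
Ann 12
Bob -3
;
run;

proc print data=scores;
run;
```

Unlike GO TO, the RETURN that ends the linked section resumes execution at the statement after LINK, so the negative value is reset before the observation is written.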
Comparisons The LABEL statement assigns a descriptive label to a variable. A statement label identifies a statement or group of statements that are referred to in the same DATA step by another statement, such as a GO TO statement.
Examples
In this example, if Stock=0, the GO TO statement causes SAS to jump to the statement that is labeled reorder. When Stock is not 0, execution continues to the RETURN statement and then returns to the beginning of the DATA step for the next observation.

   data Inventory Order;
      input Item $ Stock @;
         /* go to label reorder: */
      if Stock=0 then go to reorder;
      output Inventory;
      return;
         /* destination of GO TO statement */
      reorder: input Supplier $;
      put 'ORDER ITEM ' Item 'FROM ' Supplier;
      output Order;
      datalines;
   milk 0 A
   bread 3 B
   ;

See Also
Statements:
   “GO TO Statement” on page 1529
   “LINK Statement” on page 1619
Statement Options:
   HEADER= option in the FILE statement on page 1460
   EOF= option in the INFILE statement on page 1545
LEAVE Statement
Stops processing the current loop and resumes with the next statement in the sequence.
Valid: in a DATA step
Category: Control
Type: Executable

Syntax
LEAVE;

Without Arguments
The LEAVE statement stops the processing of the current DO loop or SELECT group and continues DATA step processing with the next statement following the DO loop or SELECT group.

Details
You can use the LEAVE statement to exit a DO loop or SELECT group prematurely based on a condition.

Comparisons
- The LEAVE statement causes processing of the current loop to end. The CONTINUE statement stops the processing of the current iteration of a loop and resumes with the next iteration.
- You can use the LEAVE statement in a DO loop or in a SELECT group. You can use the CONTINUE statement only in a DO loop.
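The difference between the two statements can be seen in a short sketch. The loop bounds and conditions are arbitrary, chosen only for illustration.

```sas
data _null_;
   do i = 1 to 5;
      if i = 2 then continue;   /* skip the rest of this iteration only */
      if i = 4 then leave;      /* end the loop altogether */
      put i=;                   /* reached for i=1 and i=3 */
   end;
run;
```

CONTINUE skips the PUT statement for one value of I and the loop goes on. LEAVE ends the loop, so the value 5 is never processed.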
Examples
This DATA step demonstrates using the LEAVE statement to stop the processing of a DO loop under a given condition. In this example, the IF/THEN statement checks the value of BONUS. When the value of BONUS reaches 500, the maximum amount allowed, the LEAVE statement stops the processing of the DO loop.

   data week;
      input name $ idno start_yr status $ dept $;
      bonus=0;
      do year= start_yr to 1991;
         if bonus ge 500 then leave;
         bonus+50;
      end;
      datalines;
   Jones 9011 1990 PT PUB
   Thomas 876 1976 PT HR
   Barnes 7899 1991 FT TECH
   Harrell 1250 1975 FT HR
   Richards 1002 1990 FT DEV
   Kelly 85 1981 PT PUB
   Stone 091 1990 PT MAIT
   ;
LENGTH Statement
Specifies the number of bytes for storing variables.
Valid: in a DATA step
Category: Information
Type: Declarative
See: LENGTH Statement in the documentation for your operating environment.

Syntax
LENGTH variable-specification(s) <DEFAULT=n>;

Arguments
variable-specification
   is a required argument and has the form
      variable(s) <$> length
   where
   variable
      specifies one or more variables that are to be assigned a length. This includes any variables in the DATA step, including those dropped from the output data set.
      Restriction: Array references are not allowed.
      Tip: If the variable is character, the length applies to the program data vector and the output data set. If the variable is numeric, the length applies only to the output data set.
   $
      specifies that the preceding variables are character variables.
      Default: SAS assumes that the variables are numeric.
   length
      specifies a numeric constant that is the number of bytes used for storing variable values.
      Range: For numeric variables, 2 to 8 or 3 to 8, depending on your operating environment. For character variables, 1 to 32767 under all operating environments.
DEFAULT=n
   changes the default number of bytes that SAS uses to store the values of any newly created numeric variables.
   Default: 8
   Range: 2 to 8 or 3 to 8, depending on your operating environment.
CAUTION:
Avoid shortening numeric variables that contain fractions. The precision of a numeric variable is closely tied to its length, especially when the variable contains fractional values. You can safely shorten variables that contain integers according to the rules that are given in the SAS documentation for your operating environment, but shortening variables that contain fractions might eliminate important precision.
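As a sketch of the variable-specification form described above, the following LENGTH statement combines character and numeric specifications. The data set and variable names are hypothetical, and a numeric length of 4 is used on the assumption that only small integers are stored.

```sas
data sizes;
   /* CODE is $4, FNAME and LNAME are both $12, COUNT is a 4-byte numeric */
   length code $ 4 fname lname $ 12 count 4;
   input code fname lname count;
   datalines;
A12 Ann Lee 3
;
run;

/* The Len column reports the lengths stored in the output data set. */
proc contents data=sizes;
run;
```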
Details
In general, the length of a variable depends on
- whether the variable is numeric or character
- how the variable was created
- whether a LENGTH or ATTRIB statement is present.
Subject to the rules for assigning lengths, lengths that are assigned with the LENGTH statement can be changed in the ATTRIB statement and vice versa. See “SAS Variables” in SAS Language Reference: Concepts for information on assigning lengths to variables.
Operating Environment Information: Valid variable lengths depend on your operating environment. For details, see the SAS documentation for your operating environment.
Comparisons The ATTRIB statement can assign the length as well as other attributes of variables.
Examples
This example uses a LENGTH statement to set the length of the character variable NAME to 25. It also changes the default number of bytes that SAS uses to store the values of newly created numeric variables from 8 to 4. The TRIM function removes trailing blanks from LASTNAME before it is concatenated with a comma (,), a blank space, and the value of FIRSTNAME. If you omit the LENGTH statement, SAS sets the length of NAME to 32.

   data testlength;
      informat FirstName LastName $15. n1 6.2;
      input firstname lastname n1 n2;
      length name $25 default=4;
      name=trim(lastname)||', '||firstname;
      datalines;
   Alexander Robinson 35 11
   ;
   proc contents data=testlength;
   run;
   proc print data=testlength;
   run;

The following output shows a partial listing from PROC CONTENTS, as well as the report that PROC PRINT generates.

Output 6.16 Setting the Length of a Variable

                      The SAS System                      3

                    CONTENTS PROCEDURE

   -----Alphabetic List of Variables and Attributes-----

   #    Variable     Type    Len    Pos    Informat
   ------------------------------------------------
   1    FirstName    Char     15      8    $15.
   2    LastName     Char     15     23    $15.
   3    n1           Num       4      0    6.2
   4    n2           Num       4      4
   5    name         Char     25     38

                      The SAS System                      4

   OBS    FirstName    LastName         n1    n2    name
     1    Alexander    Robinson    0.35000    11    Robinson, Alexander
See Also Statement: “ATTRIB Statement” on page 1400 For information on the use of the LENGTH statement in PROC steps, see Base SAS Procedures Guide
LIBNAME Statement
Associates or disassociates a SAS library with a libref (a shortcut name), clears one or all librefs, lists the characteristics of a SAS library, concatenates SAS libraries, or concatenates SAS catalogs.
Valid: Anywhere
Category: Data Access
See: LIBNAME Statement in the documentation for your operating environment

Syntax
(1) LIBNAME libref <engine> 'SAS-library' <options> <engine-host-options>;
(2) LIBNAME libref CLEAR | _ALL_ CLEAR;
(3) LIBNAME libref LIST | _ALL_ LIST;
(4)(5) LIBNAME libref <engine> (library-specification-1 <. . . library-specification-n>) <options>;
Arguments
libref
   is a shortcut name or a “nickname” for the aggregate storage location where your SAS files are stored. It is any SAS name when you are assigning a new libref. When you are disassociating a libref from a SAS library or when you are listing attributes, specify a libref that was previously assigned.
   Range: 1 to 8 characters
   Tip: The association between a libref and a SAS library lasts only for the duration of the SAS session or until you change it or discontinue it with another LIBNAME statement.
'SAS-library'
   must be the physical name for the SAS library. The physical name is the name that is recognized by the operating environment. Enclose the physical name in single or double quotation marks.
   Operating Environment Information: For details about specifying the physical names of files, see the SAS documentation for your operating environment.
library-specification
   is two or more SAS libraries that are specified by physical names, previously assigned librefs, or a combination of the two. Separate each specification with either a blank or a comma and enclose the entire list in parentheses.
   'SAS-library'
      is the physical name of a SAS library, enclosed in quotation marks.
   libref
      is the name of a previously assigned libref.
   Restriction: When concatenating libraries, you cannot specify options that are specific to an engine or an operating environment.
   Featured in: Example 2 on page 1613
   See Also: “Rules for Library Concatenation” on page 1612
engine
   is an engine name.
   Tip: Usually, SAS automatically determines the appropriate engine to use for accessing the files in the library. If you want to create a new library with an engine other than the default engine, then you can override the automatic selection.
   See: For a list of valid engines, see the SAS documentation for your operating environment. For background information about engines, see SAS Language Reference: Concepts.
CLEAR
   disassociates one or more currently assigned librefs.
   Tip: Specify libref to disassociate a single libref. Specify _ALL_ to disassociate all currently assigned librefs.
_ALL_
   specifies that the CLEAR or LIST argument applies to all currently assigned librefs.
LIST
   writes the attributes of one or more SAS libraries to the SAS log.
   Tip: Specify libref to list the attributes of a single SAS library. Specify _ALL_ to list the attributes of all SAS libraries that have librefs in your current session.
Options
ACCESS=READONLY|TEMP
   READONLY
      assigns a read-only attribute to an entire SAS library. SAS will not allow you to open a data set in the library in order to update information or write new information.
   TEMP
      specifies that the SAS library be treated as a scratch library. That is, the system will not consume CPU cycles to ensure that the files in a TEMP library do not become corrupted.
      Tip: Use ACCESS=TEMP to save resources only when the data is recoverable.
   Operating Environment Information: Some operating environments support LIBNAME statement options that have similar functions to the ACCESS= option. See the SAS documentation for your operating environment.
COMPRESS=NO | YES | CHAR | BINARY
   controls the compression of observations in output SAS data sets for a SAS library.
   NO
      specifies that the observations in a newly created SAS data set be uncompressed (fixed-length records).
   YES | CHAR
      specifies that the observations in a newly created SAS data set be compressed (variable-length records) by SAS using RLE (Run Length Encoding). RLE compresses observations by reducing repeated consecutive characters (including blanks) to two-byte or three-byte representations.
      Tip: Use this compression algorithm for character data.
   BINARY
      specifies that the observations in a newly created SAS data set be compressed (variable-length records) by SAS using RDC (Ross Data Compression). RDC combines run-length encoding and sliding-window compression to compress the file.
      Tip: This method is highly effective for compressing medium to large (several hundred bytes or larger) blocks of binary data (numeric variables). Because the compression function operates on a single record at a time, the record length needs to be several hundred bytes or larger for effective compression.
   Interaction: For the COPY procedure, the default value CLONE uses the compression attribute from the input data set for the output data set instead of the value specified in the COMPRESS= option. For more information about CLONE and NOCLONE, see the COPY statement in the DATASETS procedure in the Base SAS Procedures Guide. This interaction does not apply when using SAS/SHARE or SAS/CONNECT.
CVPBYTES=bytes
   specifies the number of bytes to expand character variable lengths when processing a SAS data file that requires transcoding.
   See: “CVPBYTES=, CVPENGINE=, and CVPMULTIPLIER= Options” in the SAS National Language Support (NLS): Reference Guide
CVPENGINE|CVPENG=engine
   specifies the engine to use in order to process a SAS data file that requires transcoding.
   See: “CVPBYTES=, CVPENGINE=, and CVPMULTIPLIER= Options” in the SAS National Language Support (NLS): Reference Guide
CVPMULTIPLIER|CVPMULT=multiplier
   specifies a multiplier value in order to expand character variable lengths when processing a SAS data file that requires transcoding.
   See: “CVPBYTES=, CVPENGINE=, and CVPMULTIPLIER= Options” in the SAS National Language Support (NLS): Reference Guide
INENCODING=ANY | ASCIIANY | EBCDICANY | encoding-value
   overrides the encoding when you are reading (input processing) SAS data sets in the SAS library.
   See: “INENCODING= and OUTENCODING= Options” in the SAS National Language Support (NLS): Reference Guide
OUTENCODING=ANY | ASCIIANY | EBCDICANY | encoding-value
   overrides the encoding when you are creating (output processing) SAS data sets in the SAS library.
   See: “INENCODING= and OUTENCODING= Options” in the SAS National Language Support (NLS): Reference Guide
OUTREP=format
   specifies the data representation for the SAS library, which is the form in which data is stored in a particular operating environment. Different operating environments use different standards or conventions for storing floating-point numbers (for example, IEEE or IBM Mainframe); for character encoding (ASCII or EBCDIC); for the ordering of bytes in memory (big Endian or little Endian); for word alignment (4-byte boundaries or 8-byte boundaries); for integer data-type length (16-bit, 32-bit, or 64-bit); and for doubles (byte-swapped or not).
   Native data representation refers to an environment in which the data representation is comparable to the CPU that is accessing the file. For example, a file that is in Windows data representation is native to the Windows operating environment.
   By default, SAS creates a new SAS data set by using the native data representation of the CPU that is running SAS. Specifying the OUTREP= option enables you to create files within the native environment that use a foreign data representation. For example, in a UNIX environment, you can create a SAS data set that uses a Windows data representation. Existing data sets that are written to the library are given the new data representation.
   Interaction: For the COPY procedure, the default value CLONE uses the data representation from the input data set instead of the value specified in the OUTREP= option. For more information about CLONE and NOCLONE, see the COPY statement in the DATASETS procedure in the Base SAS Procedures Guide. This interaction does not apply when using SAS/SHARE or SAS/CONNECT.
   Interaction: The COPY procedure (with NOCLONE) and the MIGRATE procedure can use the LIBNAME option OUTREP= for DATA, VIEW, ACCESS, MDDB and DMDB member types. Otherwise, only DATA member types are affected by the OUTREP= LIBNAME option.
   Interaction: Transcoding could result in character data loss when encodings are incompatible. For information about encoding and transcoding, see SAS National Language Support (NLS): Reference Guide.
   Values for OUTREP= are listed in the following table:

Table 6.7 Data Representation Values for OUTREP= Option

   OUTREP= Value     Alias*         Environment
   ----------------  -------------  ----------------------------------------------------
   ALPHA_TRU64       ALPHA_OSF      Compaq Tru64 UNIX
   ALPHA_VMS_32      ALPHA_VMS      OpenVMS on Alpha
   ALPHA_VMS_64                     OpenVMS on Alpha
   HP_IA64           HP_ITANIUM     HP UX on Itanium 64-bit platform
   HP_UX_32          HP_UX          HP UX on 32-bit platform
   HP_UX_64                         HP UX on 64-bit platform
   INTEL_ABI                        ABI UNIX on Intel 32-bit platform
   LINUX_32          LINUX          Linux for Intel Architecture on 32-bit platform
   LINUX_IA64                       Linux for Itanium-based system on 64-bit platform
   LINUX_X86_64                     LINUX on x64 64-bit platform
   MIPS_ABI                         ABI UNIX on 32-bit platform
   MVS_32            MVS            z/OS on 32-bit platform
   OS2                              OS/2 on Intel 32-bit platform
   RS_6000_AIX_32    RS_6000_AIX    AIX UNIX on 32-bit RS/6000
   RS_6000_AIX_64                   AIX UNIX on 64-bit RS/6000
   SOLARIS_32        SOLARIS        Solaris on SPARC 32-bit platform
   SOLARIS_64                       Solaris on SPARC 64-bit platform
   SOLARIS_X86_64                   Solaris on x64 64-bit platform
   VAX_VMS                          OpenVMS VAX
   VMS_IA64                         OpenVMS for HP Integrity servers 64-bit platform
   WINDOWS_32        WINDOWS        Microsoft Windows on 32-bit platform
   WINDOWS_64                       Microsoft Windows 64-bit Edition (for both Itanium-based systems and x64)

   * It is recommended that you use the current values. The aliases are available for compatibility only.
REPEMPTY=YES|NO
   controls replacement of like-named temporary or permanent SAS data sets when the new one is empty.
   YES
      specifies that a new empty data set with a given name replace an existing data set with the same name. This is the default.
      Interaction: When REPEMPTY=YES and REPLACE=NO, then the data set is not replaced.
   NO
      specifies that a new empty data set with a given name not replace an existing data set with the same name.
      Tip: Use REPEMPTY=NO to prevent the following syntax error from replacing the existing data set MYLIB.B with the new empty data set MYLIB.B that is created by mistake:
         libname libref SAS-library REPEMPTY=NO;
         data mylib.a set mylib.b;
      Tip: For both the convenience of replacing existing data sets with new ones that contain data and the protection of not overwriting existing data sets with new empty ones that are created by mistake, set REPLACE=YES and REPEMPTY=NO.
   Comparison: For an individual data set, the REPEMPTY= data set option overrides the setting of the REPEMPTY= option in the LIBNAME statement.
   See Also: “REPEMPTY= Data Set Option” on page 53

Engine Host Options
engine-host-options
   are one or more options that are listed in the general form keyword=value.
   Operating Environment Information: For a list of valid specifications, see the SAS documentation for your operating environment.
   Restriction: When concatenating libraries, you cannot specify options that are specific to an engine or an operating environment.
Details

(1) Associating a Libref with a SAS Library
The association between a libref and a SAS library lasts only for the duration of the SAS session or until you change the libref or discontinue it with another LIBNAME statement. The simplest form of the LIBNAME statement specifies only a libref and the physical name of a SAS library:
   LIBNAME libref 'SAS-library';
See Example 1 on page 1613.
An engine specification is usually not necessary. If the situation is ambiguous, SAS uses the setting of the ENGINE= system option to determine the default engine. If all data sets in the library are associated with a single engine, then SAS uses that engine as the default. In either situation, you can override the default by specifying another engine with the ENGINE= system option:
   LIBNAME libref engine 'SAS-library';
Operating Environment Information: Using the LIBNAME statement requires host-specific information. See the SAS documentation for your operating environment before using this statement.

(2) Disassociating a Libref from a SAS Library
To disassociate a libref from a SAS library, use a LIBNAME statement by specifying the libref and the CLEAR option. You can clear a single, specified libref or all current librefs.
   LIBNAME libref CLEAR | _ALL_ CLEAR;

(3) Writing SAS Library Attributes to the SAS Log
Use a LIBNAME statement to write the attributes of one or more SAS libraries to the SAS log. Specify libref to list the attributes of one SAS library; use _ALL_ to list the attributes of all SAS libraries that have been assigned librefs in your current SAS session.
   LIBNAME libref LIST | _ALL_ LIST;
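The assign, list, and clear forms can be tried together in a short sketch. The libref SCRATCH is hypothetical. To avoid inventing a physical path, the example assumes that the location of the WORK library can be reused, and obtains it with the PATHNAME function through %SYSFUNC.

```sas
/* Assumption: point a second libref at the physical location of WORK. */
libname scratch "%sysfunc(pathname(work))";

libname scratch list;    /* write the attributes of SCRATCH to the SAS log */
libname scratch clear;   /* disassociate the libref */
```

Substitute the physical name of your own SAS library for the quoted path in real use.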
(4) Concatenating SAS Libraries
When you logically concatenate two or more SAS libraries, you can reference them all with one libref. You can specify a library with its physical filename or its previously assigned libref.
   LIBNAME libref <engine> (library-specification-1 <. . . library-specification-n>) <options>;
In the same LIBNAME statement you can use any combination of specifications: librefs, physical filenames, or a combination of librefs and physical filenames. See Example 2 on page 1613.

(5) Concatenating SAS Catalogs
When you logically concatenate two or more SAS libraries, you also concatenate the SAS catalogs that have the same name. For example, if three SAS libraries each contain a catalog named CATALOG1, then when you concatenate them, you create a catalog concatenation for the catalogs that have the same name. See Example 3 on page 1614.
   LIBNAME libref <engine> (library-specification-1 <. . . library-specification-n>) <options>;
Rules for Library Concatenation
After you create a library concatenation, you can specify the libref in any context that accepts a simple (non-concatenated) libref. These rules determine how SAS files (that is, members of SAS libraries) are located among the concatenated libraries:
1 When a SAS file is opened for input or update, the concatenated libraries are searched and the first occurrence of the specified file is used.
2 When a SAS file is opened for output, it is created in the first library that is listed in the concatenation.
  Note: A new SAS file is created in the first library even if there is a file with the same name in another part of the concatenation.
3 When you delete or rename a SAS file, only the first occurrence of the file is affected.
4 Anytime a list of SAS files is displayed, only one occurrence of a filename is shown.
  Note: Even if the name occurs multiple times in the concatenation, only the first occurrence is shown.
5 A SAS file that is logically connected to another file (such as an index to a data set) is listed only if the parent file resides in that same library. For example, if library ONE contains A.DATA, and library TWO contains A.DATA and A.INDEX, only A.DATA from library ONE is listed. (See rule 4.)
6 If any library in the concatenation is sequential, then all of the libraries are treated as sequential.
7 The attributes of the first library that is specified determine the attributes of the concatenation. For example, if the first SAS library that is listed is “read only,” then the entire concatenated library is “read only.”
8 If you specify any options or engines, they apply only to the libraries that you specified with the complete physical name, not to any library that you specified with a libref.
9 If you alter a libref after it has been assigned in a concatenation, it will not affect the concatenation.
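Rules 1 and 2 can be sketched with two librefs that normally already exist in a session. This is an illustration under assumptions: the libref BOTH is hypothetical, the sample data set SASHELP.CLASS is assumed to be installed, and WORK is assumed to be acceptable as the first member of a concatenation at your site.

```sas
/* WORK is listed first, so it receives new files (rule 2). */
libname both (work sashelp);

data both.class;              /* created as WORK.CLASS, although SASHELP.CLASS exists */
   set sashelp.class;
   where age > 14;
run;

proc print data=both.class;   /* rule 1: the first occurrence, WORK.CLASS, is read */
run;

libname both clear;
```

Because the first library determines the attributes of the concatenation (rule 7), reversing the order so that a read-only library comes first would be expected to prevent the DATA step from creating BOTH.CLASS.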
Comparisons
• Use the LIBNAME statement to reference a SAS library. Use the FILENAME statement to reference an external file. Use the LIBNAME, SAS/ACCESS statement to access DBMS tables.
• Both the CATNAME statement and the LIBNAME statement can concatenate SAS catalogs. The CATNAME statement enables you to specify the names of the catalogs that you want to concatenate. The LIBNAME statement concatenates all like-named catalogs in the specified SAS libraries.
Examples
Example 1: Assigning and Using a Libref
This example assigns the libref SALES to an aggregate storage location that is specified in quotation marks as a physical filename. The DATA step creates SALES.QUARTER1 and stores it in that location. The PROC PRINT step references it by its two-level name, SALES.QUARTER1.

   libname sales 'SAS-library';

   data sales.quarter1;
      infile 'your-input-file';
      input salesrep $20. +6 jansales febsales marsales;
   run;

   proc print data=sales.quarter1;
   run;
Example 2: Logically Concatenating SAS Libraries
• This example concatenates three SAS libraries by specifying the physical filename of each:

   libname allmine ('file-1' 'file-2' 'file-3');

• This example assigns librefs to two SAS libraries, one that contains SAS 6 files and one that contains SAS 9 files. This technique is useful for updating your files and applications from SAS 6 to SAS 9, while allowing you to have convenient access to both sets of files:

   libname v6 'v6--SAS-library';
   libname v9 'v9--SAS-library';
   libname allmine (v9 v6);

• This example shows that you can specify both librefs and physical filenames in the same concatenation specification:

   libname allmine (v9 v6 'some-filename');
Example 3: Concatenating SAS Catalogs
This example concatenates three SAS libraries by specifying the physical filename of each and assigns the libref ALLMINE to the concatenated libraries:

   libname allmine ('file-1' 'file-2' 'file-3');
If each library contains a SAS catalog named MYCAT, then using ALLMINE.MYCAT as a libref.catref provides access to the catalog entries that are stored in all three catalogs named MYCAT. To logically concatenate SAS catalogs with different names, see “CATNAME Statement” on page 1410.
Example 4: Permanently Storing Data Sets with One-Level Names
If you want the convenience of specifying only a one-level name for permanent, not temporary, SAS files, then use the USER= system option. This example stores the data set QUARTER1 permanently without using a LIBNAME statement first to assign a libref to a storage location:

   options user='SAS-library';

   data quarter1;
      infile 'your-input-file';
      input salesrep $20. +6 jansales febsales marsales;
   run;

   proc print data=quarter1;
   run;
See Also
Data Set Options:
   ENCODING in the SAS National Language Support (NLS): Reference Guide
Statements:
   “CATNAME Statement” on page 1410 for a discussion of concatenating SAS catalogs
   “FILENAME Statement” on page 1470
   “LIBNAME Statement” for character variable processing in order to transcode a SAS file in SAS National Language Support (NLS): Reference Guide
   “LIBNAME Statement” for the Output Delivery System (ODS) in SAS Output Delivery System: User’s Guide
   “LIBNAME Statement” for SAS metadata in SAS Language Interfaces to Metadata
   “LIBNAME Statement” for Scalable Performance Data (SPD) in SAS Scalable Performance Data Engine: Reference
   “LIBNAME Statement” for XML documents in SAS XML LIBNAME Engine: User’s Guide
   “LIBNAME Statement” for SAS/ACCESS in SAS/ACCESS for Relational Databases: Reference
   “LIBNAME Statement” for SAS/CONNECT in SAS/CONNECT User’s Guide
   “LIBNAME Statement” for SAS/CONNECT, TCP/IP pipes in SAS/CONNECT User’s Guide
   “LIBNAME Statement” for SAS/SHARE in SAS/SHARE User’s Guide
System Option:
   “USER= System Option” on page 1983
LIBNAME Statement for WebDAV Server Access
Associates a libref with a SAS library and enables access to a WebDAV (Web-based Distributed Authoring And Versioning) server.
Valid: Anywhere
Category: Data Access
Restriction: Access to WebDAV servers is not supported on OpenVMS or z/OS.
See also: Base SAS LIBNAME Statement
Syntax
LIBNAME libref <engine> 'SAS-library' <options> WEBDAV USER="user-ID" PASSWORD="user-password" WEBDAV options;
LIBNAME libref CLEAR | _ALL_ CLEAR;
LIBNAME libref LIST | _ALL_ LIST;
Arguments
libref
   specifies a shortcut name for the aggregate storage location where your SAS files are stored.
   Tip: The association between a libref and a SAS library lasts only for the duration of the SAS session or until you change it or discontinue it with another LIBNAME statement.
'SAS-library'
   specifies the URL location (path) on a WebDAV server. The URL specifies either HTTP or HTTPS communication protocols.
   Restriction: Only one data library is supported when using the WebDAV extension to the LIBNAME statement.
   Requirement: When using the HTTPS communication protocol, you must use the SSL (Secure Sockets Layer) protocol that provides secure network communications. For more information, see Encryption in SAS.
engine
   specifies the name of a valid SAS engine.
   Restriction: REMOTE engines are not supported with the WebDAV options.
   See: For a list of valid engines, see the SAS documentation for your operating environment.
CLEAR
   disassociates one or more currently assigned librefs. When a libref using a WebDAV server is cleared, the cached files stored locally are deleted also.
   Tip: Specify libref to disassociate a single libref. Specify _ALL_ to disassociate all currently assigned librefs.
LIST
   writes the attributes of one or more SAS libraries to the SAS log.
   Tip: Specify libref to list the attributes of a single SAS library. Specify _ALL_ to list the attributes of all SAS libraries that have librefs in your current session.
_ALL_
   specifies that the CLEAR or LIST argument applies to all currently assigned librefs.
Options For valid LIBNAME statement options, see “LIBNAME Statement” on page 1606.
WebDAV Specific Options
WEBDAV
   specifies that the libref access a WebDAV server.
USER="user-ID"
   specifies the user name for access to the WebDAV server. The user ID is case sensitive and it must be enclosed in single or double quotation marks.
   Alias: UID
   Tip: If PROMPT is specified, but USER= is not, then the user is prompted for an ID as well as a password.
PASSWORD="user-password"
   specifies a password for the user to access the WebDAV server. The password is case sensitive and it must be enclosed in single or double quotation marks.
   Alias: PWD=, PW=, PASS=
   Tip: You can specify the PROMPT option instead of the PASSWORD= option.
PROMPT
   specifies to prompt for the user login password, if necessary.
   Interaction: If PROMPT is specified without USER=, then the user is prompted for an ID, as well as a password.
   Tip: If you specify the PROMPT option, you do not need to specify the PASSWORD= option.
AUTHDOMAIN="auth-domain"
   specifies the name of an authentication domain metadata object in order to connect to the WebDAV server. The authentication domain references credentials (user ID and password) without your having to explicitly specify the credentials. The auth-domain name is case sensitive, and it must be enclosed in double quotation marks. An administrator creates authentication domain definitions while creating a user definition with the User Manager in SAS Management Console. The authentication domain is associated with one or more login metadata objects that provide access to the WebDAV server and is resolved by the BASE engine calling the SAS Metadata Server and returning the authentication credentials.
   Requirement: The authentication domain and the associated login definition must be stored in a metadata repository, and the metadata server must be running in order to resolve the metadata object specification.
   Interaction: If you specify AUTHDOMAIN=, you do not need to specify USER= and PASSWORD=.
   See also: For complete information about creating and using authentication domains, see the discussion on credential management in the SAS Intelligence Platform: Security Administration Guide.
PROXY=url
   specifies the Uniform Resource Locator (URL) for the proxy server in one of these forms:
      "http://hostname"
      "http://hostname:port"
LOCALCACHE="directory name"
   specifies a directory where a temporary subdirectory is created to hold local copies of the server files. Each libref has its own unique subdirectory. If a directory is not specified, then the subdirectories are created in the SAS WORK directory. SAS deletes the temporary files when the SAS program completes.
   Default: SAS WORK directory
LOCKDURATION=n
   specifies the number of minutes that the files written through the WebDAV libref are locked. SAS unlocks the files when the SAS program successfully completes. If the SAS program fails, then the locks expire after the time allotted.
   Default: 30
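As a rough sketch of how several of these options might be combined, the following statement prompts for the password instead of storing it in the program, names a cache directory, and shortens the lock duration. The server URL, the user ID, the cache directory name, and the value 10 are placeholders for this illustration only.

```sas
/* Sketch only: URL, user ID, and cache directory are placeholders. */
/* USER= with PROMPT means that only the password is prompted for.  */
libname davdata v9 "https://www.webserver.com/users/mydir/datadir"
        webdav user="mydir" prompt
        localcache="your-cache-directory"
        lockduration=10;

/* ...DATA and PROC steps that read or write DAVDATA members... */

/* Clearing the libref also deletes the locally cached files. */
libname davdata clear;
```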
Data Set Options That Function Differently with a WebDAV Server
The following table lists the data set options that have different functionality when using a WebDAV server. All other data set options will function as described in the SAS Language Reference: Dictionary.

Table 6.8 Data Set Option Functionality with a WebDAV Server

Data Set Option
   WebDAV Storage Functionality
CNTLLEV=
   LIB locks all data sets in the library before writing the data into the local cache. All members are unlocked after the DATA step has completed and the data set has been written back to the WebDAV server.
   MEM locks the member before writing the data into the local cache. Member is unlocked after the DATA step has completed and the data has been written back to the WebDAV server.
   REC is not supported. WebDAV allows updates to the entire data set only.
FILECLOSE
   The VxTAPE engine is not supported; therefore this option is ignored.
GENMAX=
   This functionality is not supported because the maximum number of revisions to keep cannot be specified in the WebDAV server.
GENNUM=
   This functionality is not supported in WebDAV.
IDXNAME=
   Users can specify an index to use if one exists.
INDEX=
   Indexes can be created in the local cache and saved on the WebDAV server.
TOBSNO=
   Remote engines are not supported; therefore this option is ignored.
Details
WebDAV File Processing
When accessing a WebDAV server, the file is pulled from the WebDAV server to your local disk storage for processing. When you complete the updating, the file is pushed back to the WebDAV server for storage. The file is removed from the local disk storage when it is pushed back.
Multiple Librefs to a WebDAV Library
When you assign a libref to a file on a WebDAV server, the path (URL location), user ID, and password are associated with that libref. After the first libref has been assigned, the user ID and password will be validated on subsequent attempts to assign another libref to the same library.
Note: Lock errors that you typically would not see might occur if a different user ID, a different password, or both are used in the subsequent attempt to assign a libref to the same library.
Locked Files on a WebDAV Server
In local libraries, SAS locks a file when you open it to prevent other users from altering the file while it is being read. WebDAV locks require write access to a library, and there is no concept of a read lock. In addition, WebDAV servers can go down, come back up, or go offline at any time. Consequently, SAS honors a lock request on a file on a WebDAV server only if the file is already locked by another user.
Example
The following example associates the libref davdata with the WebDAV directory /users/mydir/datadir on the WebDAV server www.webserver.com:

   libname davdata v9 "https://www.webserver.com/users/mydir/datadir"
           webdav user="mydir" pw="12345";
See Also Statements: “FILENAME Statement, WebDAV Access Method” on page 1517 “LIBNAME Statement” on page 1606
LINK Statement
Directs program execution immediately to the statement label that is specified and, if followed by a RETURN statement, returns execution to the statement that follows the LINK statement.
Valid: in a DATA step
Category: Control
Type: Executable
Syntax LINK label;
Arguments label
specifies a statement label that identifies the LINK destination. You must specify the label argument.
Details
The LINK statement tells SAS to jump immediately to the statement label that is indicated in the LINK statement and to continue executing statements from that point until a RETURN statement is executed. The RETURN statement sends program control to the statement immediately following the LINK statement. The LINK statement and the destination must be in the same DATA step. The destination is identified by a statement label in the LINK statement.
The LINK statement can branch to a group of statements that contain another LINK statement. This arrangement is known as nesting. To avoid infinite looping, SAS limits the number of nested LINK statements: by default, you can have up to 10 LINK statements with no intervening RETURN statements. However, you can use the /STACK option in the DATA statement to increase the number of nested LINK statements. When more than one LINK statement has been executed, a RETURN statement tells SAS to return to the statement that follows the last LINK statement that was executed.
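Nesting can be sketched with a small DATA step in which one linked group links to another. The labels OUTER and INNER and the variable X are names invented for this illustration. Each RETURN sends control back to the statement that follows the most recently executed LINK.

```sas
data _null_;
   x=1;
   link outer;                /* first LINK: nesting depth 1      */
   put 'x after nested links: ' x=;
   return;                    /* ends this DATA step iteration    */
outer:
   x=x+1;                     /* x is now 2                       */
   link inner;                /* second LINK: nesting depth 2     */
   x=x*10;                    /* runs after INNER returns: 1020   */
   return;                    /* back to the PUT statement        */
inner:
   x=x+100;                   /* x is now 102                     */
   return;                    /* back to x=x*10 in OUTER          */
run;
```

The PUT statement writes x=1020 to the SAS log, which shows that the statements ran in the order OUTER, INNER, and then the remainder of OUTER.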
Comparisons The difference between the LINK statement and the GO TO statement is in the action of a subsequent RETURN statement. A RETURN statement after a LINK statement returns execution to the statement that follows LINK. A RETURN statement after a GO TO statement returns execution to the beginning of the DATA step, unless a LINK statement precedes GO TO. In that case, execution continues with the first statement after LINK. In addition, a LINK statement is usually used with an explicit RETURN statement, whereas a GO TO statement is often used without a RETURN statement. When your program executes a group of statements at several points in the program, using the LINK statement simplifies coding and makes program logic easier to follow. If your program executes a group of statements at only one point in the program, using DO-group logic rather than LINK-RETURN logic is simpler.
Examples
In this example, when the value of variable TYPE is aluv, the LINK statement diverts program execution to the statements that are associated with the label CALCU. The program executes until it encounters the RETURN statement, which sends program execution back to the first statement that follows LINK. SAS executes the assignment statement, writes the observation, and then returns to the top of the DATA step to read the next record. When the value of TYPE is not aluv, SAS executes the assignment statement, writes the observation, and returns to the top of the DATA step.

   data hydro;
      input type $ depth station $;
         /* link to label calcu: */
      if type='aluv' then link calcu;
      date=today();
         /* return to top of step */
      return;
   calcu:
      if station='site_1' then elevatn=6650-depth;
      else if station='site_2' then elevatn=5500-depth;
         /* return to date=today(); */
      return;
      datalines;
   aluv 523 site_1
   uppa 234 site_2
   aluv 666 site_2
   ...more data lines...
   ;
See Also Statements: “DATA Statement” on page 1416 “DO Statement” on page 1441 “GO TO Statement” on page 1529 “Labels, Statement” on page 1601 “RETURN Statement” on page 1699
LIST Statement
Writes to the SAS log the input data record for the observation that is being processed.
Valid: in a DATA step
Category: Action
Type: Executable
Syntax LIST;
Without Arguments The LIST statement causes the input data record for the observation being processed to be written to the SAS log.
Details The LIST statement operates only on data that is read with an INPUT statement; it has no effect on data that is read with a SET, MERGE, MODIFY, or UPDATE statement. In the SAS log, a ruler that indicates column positions appears before the first record listed. For variable-length records (RECFM=V), SAS writes the record length at the end of the input line. SAS does not write the length for fixed-length records (RECFM=F), unless the amount of data read does not equal the record length (LRECL).
Comparisons

Action: Writes when
   LIST Statement: at the end of each iteration of the DATA step
   PUT Statement: immediately
Action: Writes what
   LIST Statement: the input data records exactly as they appear
   PUT Statement: the variables or literals specified
Action: Writes where
   LIST Statement: only to the SAS log
   PUT Statement: to the SAS log, the SAS output destination, or to any external file
Action: Works with
   LIST Statement: INPUT statement only
   PUT Statement: any data-reading statement
Action: Handles hexadecimal values
   LIST Statement: automatically prints a hexadecimal value if it encounters an unprintable character
   PUT Statement: represents characters in hexadecimal only when a hexadecimal format is given
Examples
Example 1: Listing Records That Contain Missing Data
This example uses the LIST statement to write to the SAS log any input records that contain missing data. Because of the #3 line pointer control in the INPUT statement, SAS reads three input records to create a single observation. Therefore, the LIST statement writes the three current input records to the SAS log each time a value for W2AMT is missing.

   data employee;
      input ssn 1-9 #3 w2amt 1-6;
      if w2amt=. then list;
      datalines;
   23456789
   JAMES SMITH
   356.79
   345671234
   Jeffrey Thomas
   .
   ;
Output 6.17 Log Listing of Missing Data

   RULE:----+----1----+----2----+----3----+----4----+----5----+---
   9    345671234
   10   Jeffrey Thomas
   11   .
The numbers 9, 10, and 11 are line numbers in the SAS log.
Example 2: Listing the Record Length of Variable-Length Records
This example uses as input an external file that contains variable-length ID numbers. The RECFM=V option is specified in the INFILE statement, and the LIST statement writes the records to the SAS log. When the file has variable-length records, as indicated by the RECFM=V option in this example, SAS writes the record length at the end of each record that is listed in the SAS log.

   data employee;
      infile 'your-external-file' recfm=v;
      input id $;
      list;
   run;
Output 6.18 Log Listing of Variable-Length Records and Record Lengths

   RULE: ----+----1----+----2----+----3----+----4----+----5--
   1     23456789 8
   2     123456789 9
   3     5555555555 10
   4     345671234 9
   5     2345678910 10
   6     2345678 7
See Also Statement: “PUT Statement” on page 1656
%LIST Statement
Displays lines that are entered in the current session.
Valid: anywhere
Category: Program Control
Syntax
%LIST <n <-m | :m>>;
Without Arguments In interactive line mode processing, if you use the %LIST statement without arguments, it displays all previously entered program lines.
Arguments
n
   displays line n.
n-m
   displays lines n through m.
   Alias: n:m
Details
Where and When to Use
The %LIST statement can be used anywhere in a SAS job except between a DATALINES or DATALINES4 statement and the matching semicolon (;) or semicolons (;;;;). This statement is useful mainly in interactive line mode sessions to display SAS program code on the monitor. It is also useful to determine lines to include when you use the %INCLUDE statement.
Interactions
CAUTION: In all modes of execution, the SPOOL system option controls whether SAS statements are saved. When the SPOOL system option is in effect in interactive line mode, all SAS statements and data lines are saved automatically when they are submitted. You can display them by using the %LIST statement. When NOSPOOL is in effect, %LIST cannot display previous lines.
Examples This %LIST statement displays lines 10 through 20: %list 10-20;
See Also
Statement:
   “%INCLUDE Statement” on page 1534
System Option:
   “SPOOL System Option” on page 1945
LOCK Statement
Acquires and releases an exclusive lock on an existing SAS file.
Valid: Anywhere
Category: Program Control
Restriction: You cannot lock a SAS file that another SAS session is currently accessing (either from an exclusive lock or because the file is open).
Restriction: The LOCK statement syntax is the same whether you issue the statement in a single-user environment or in a client/server environment. However, some LOCK statement functionality applies only to a client/server environment.
Syntax
LOCK libref<.member-name<.member-type | .entry-name.entry-type>> <LIST | QUERY | SHOW | CLEAR>;
Arguments
libref
   is a name that is associated with a SAS library. The libref (library reference) must be a valid SAS name. If the libref is SASUSER or WORK, you must specify it.
   Tip: In a single-user environment, you typically would not issue the LOCK statement to exclusively lock a library. To lock a library that is accessed via a multiuser SAS/SHARE server, see the LOCK statement in the SAS/SHARE User’s Guide.
member-name
   is a valid SAS name that specifies a member of the SAS library that is associated with the libref.
   Restriction: The SAS file must be created before you can request a lock. For information about locking a member of a SAS library when the member does not exist, see the SAS/SHARE User’s Guide.
member-type
   is the type of SAS file to be locked. For example, valid values are DATA, VIEW, CATALOG, MDDB, and so on. The default is DATA.
entry-name
   is the name of the catalog entry to be locked.
   Tip: In a single-user environment, if you issue the LOCK statement to lock an individual catalog entry, the entire catalog is locked; you typically would not issue the LOCK statement to exclusively lock a catalog entry. To lock a catalog entry in a library that is accessed via a multiuser SAS/SHARE server, see the LOCK statement in the SAS/SHARE User’s Guide.
entry-type
   is the type of the catalog entry to be locked.
   Tip: In a single-user environment, if you issue the LOCK statement to lock an individual catalog entry, the entire catalog is locked; you typically would not issue the LOCK statement to exclusively lock a catalog entry. To lock a catalog entry in a library that is accessed via a multiuser SAS/SHARE server, see the LOCK statement in the SAS/SHARE User’s Guide.
LIST | QUERY | SHOW
   writes to the SAS log whether you have an exclusive lock on the specified SAS file.
   Tip: This option provides more information in a client/server environment. To use this option in a client/server environment, see the LOCK statement in the SAS/SHARE User’s Guide.
CLEAR
   releases a lock on the specified SAS file that was acquired by using the LOCK statement in your SAS session.
Details General Information
The LOCK statement enables you to acquire and release an exclusive lock on an existing SAS file. Once an exclusive lock is obtained, no other SAS session can read or write to the file until the lock is released. You release an exclusive lock by using the CLEAR option.
Acquiring Exclusive Access to a SAS File in a Single-User Environment
Each time you issue a SAS statement or a procedure to process a SAS file, the file is opened for input, update, or output processing. At the end of the step, the file is closed. In a program with multiple tasks, a file could be opened and closed multiple times. Because multiple SAS sessions in a single-user environment can access the same SAS file, issuing the LOCK statement to acquire an exclusive lock on the file protects data while it is being updated in a multistep program. For example, consider a nightly update process that consists of a DATA step to remove observations that are no longer useful, a SORT procedure to sort the file, and a DATASETS procedure to rebuild the file’s indexes. If another SAS session accesses the file between any of the steps, the SORT and DATASETS procedures would fail, because they require member-level locking (exclusive) access to the file. Including the LOCK statement before the DATA step provides the needed protection by acquiring exclusive access to the file. If the LOCK statement is successful, a SAS session that attempts to access the file between steps will be denied access, and the nightly update process runs uninterrupted. See Example 1 on page 1626.
Return Codes for the LOCK Statement
The SAS macro variable SYSLCKRC contains the return code from the LOCK statement. The following actions result in a nonzero value in SYSLCKRC:
• You try to lock a file but cannot obtain the lock (for example, the file was in use or is locked by another SAS session).
• You use a LOCK statement with the LIST option to list a lock.
• You use a LOCK statement with the CLEAR option to release a lock that you do not have.
For more information about the SYSLCKRC SAS macro variable, see SAS Macro Language: Reference.
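One way to act on the return code is to test SYSLCKRC in a macro before running the steps that need exclusive access. The following is a minimal sketch, not a recommended production pattern. The macro name SORT_IF_LOCKED is hypothetical, and the library, data set, and BY variable are borrowed from Example 1.

```sas
libname mydata 'SAS-library';

/* Hypothetical macro: runs the sort only if the preceding */
/* LOCK statement set SYSLCKRC to 0.                        */
%macro sort_if_locked;
   %if &syslckrc = 0 %then %do;
      proc sort force data=mydata.census;
         by CrimeRate;
      run;

      lock mydata.census clear;
   %end;
   %else %do;
      %put NOTE: MYDATA.CENSUS was not locked (SYSLCKRC=&syslckrc).;
   %end;
%mend sort_if_locked;

lock mydata.census;      /* sets SYSLCKRC */
%sort_if_locked
```

The %IF test is placed inside a macro because this sketch assumes that %IF is not available in open code.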
Comparisons
• With SAS/SHARE software, you can also use the LOCK statement. Some LOCK statement functionality applies only to a client/server environment.
• The CNTLLEV= data set option specifies the level at which shared update access to a SAS data set is denied.
Examples Example 1: Locking a SAS File
The following SAS program illustrates the process of locking a SAS data set. Including the LOCK statement provides protection for the multistep program by acquiring exclusive access to the file. Any SAS session that attempts to access the file between steps will be denied access, which ensures that the program runs uninterrupted.

   libname mydata 'SAS-library';

   lock mydata.census;                     /* 1 */

   data mydata.census;                     /* 2 */
      modify mydata.census;
      (statements to remove obsolete observations)
   run;

   proc sort force data=mydata.census;     /* 3 */
      by CrimeRate;
   run;

   proc datasets library=mydata;           /* 4 */
      modify census;
      index create CrimeRate;
   quit;

   lock mydata.census clear;               /* 5 */

1 Acquires exclusive access to the SAS data set MYDATA.CENSUS.
2 Opens MYDATA.CENSUS to remove observations that are no longer useful. At the end of the DATA step, the file is closed. However, because of the exclusive lock, any other SAS session that attempts to access the file is denied access.
3 Opens MYDATA.CENSUS to sort the file. At the end of the procedure, the file is closed but not available to another SAS session.
4 Opens MYDATA.CENSUS to rebuild the file’s index. At the end of the procedure, the file is closed but still not available to another SAS session.
5 Releases the exclusive lock on MYDATA.CENSUS. The data set is now available to other SAS sessions.
See Also Data Set Option: “CNTLLEV= Data Set Option” on page 18 For information on locking a data object in a library that is accessed via a multiuser SAS/SHARE server, see the LOCK statement in the SAS/SHARE User’s Guide.
LOSTCARD Statement
Resynchronizes the input data when SAS encounters a missing or invalid record in data that has multiple records per observation.
Valid: in a DATA step
Category: Action
Type: Executable
Syntax LOSTCARD;
Without Arguments The LOSTCARD statement prevents SAS from reading a record from the next group when the current group has a missing record.
Details
When to Use LOSTCARD
When SAS reads multiple records to create a single observation, it does not discover that a record is missing until it reaches the end of the data. If there is a missing record in your data, the values for subsequent observations in the SAS data set might be incorrect. Using LOSTCARD prevents SAS from reading a record from the next group when the current group has fewer records than SAS expected.
LOSTCARD is most useful when the input data have a fixed number of records per observation and when each record for an observation contains an identification variable that has the same value. LOSTCARD usually appears in conditional processing, for example, in the THEN clause of an IF-THEN statement, or in a statement in a SELECT group.

When LOSTCARD Executes
When LOSTCARD executes, SAS takes several steps:
1 Writes three items to the SAS log: a lost card message, a ruler, and all the records that it read in its attempt to build the current observation.
2 Discards the first record in the group of records being read, does not write an observation, and returns processing to the beginning of the DATA step.
3 Does not increment the automatic variable _N_ by 1. (Normally, SAS increments _N_ by 1 at the beginning of each DATA step iteration.)
4 Attempts to build an observation by beginning with the second record in the group, and reads the number of records that the INPUT statement specifies.
5 Repeats steps 1 through 4 when the IF condition for a lost card is still true. To make the log more readable, SAS prints the message and ruler only once for a given group of records. In addition, SAS prints each record only once, even if a record is used in successive attempts to build an observation.
6 Builds an observation and writes it to the SAS data set when the IF condition for a lost card is no longer true.
Examples
This example uses the LOSTCARD statement in a conditional construct to identify missing data records and to resynchronize the input data:

   data inspect;
      input id 1-3 age 8-9 #2 id2 1-3 loc
            #3 id3 1-3 wt;
      if id ne id2 or id ne id3 then
         do;
            put 'DATA RECORD ERROR: ' id= id2= id3=;
            lostcard;
         end;
      datalines;
   301    32
   301    61432
   301    127
   302    61
   302    83171
   400    46
   409    23145
   400    197
   411    53
   411    99551
   411    139
   ;
The DATA step reads three input records before writing an observation. If the identification number in record 1 (variable ID) does not match the identification number in the second record (ID2) or third record (ID3), a record is incorrectly entered or omitted. The IF-THEN DO statement specifies that if an identification number is invalid, SAS prints the message that is specified in the PUT statement and executes the LOSTCARD statement.
In this example, the third record for the second observation (ID3=400) is missing. The second record for the third observation is incorrectly entered (ID=400 while ID2=409). Therefore, the data set contains two observations with ID values 301 and 411. There are no observations for ID=302 or ID=400. The PUT and LOSTCARD statements write these statements to the SAS log when the DATA step executes:

Output 6.19

   DATA RECORD ERROR: id=302 id2=302 id3=400
   NOTE: LOST CARD.
   RULE:----+----1----+----2----+----3----+----4----+----5----+---
   14   302    61
   15   302    83171
   16   400    46
   DATA RECORD ERROR: id=302 id2=400 id3=409
   NOTE: LOST CARD.
   17   409    23145
   DATA RECORD ERROR: id=400 id2=409 id3=400
   NOTE: LOST CARD.
   18   400    197
   DATA RECORD ERROR: id=409 id2=400 id3=411
   NOTE: LOST CARD.
   19   411    53
   DATA RECORD ERROR: id=400 id2=411 id3=411
   NOTE: LOST CARD.
   20   411    99551
The numbers 14, 15, 16, 17, 18, 19, and 20 are line numbers in the SAS log.
See Also Statement: “IF-THEN/ELSE Statement” on page 1532
MERGE Statement
Joins observations from two or more SAS data sets into a single observation.
Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
MERGE SAS-data-set-1 <(data-set-options)> SAS-data-set-2 <(data-set-options)> <...SAS-data-set-n <(data-set-options)>> <END=variable>;
Arguments
SAS-data-set
   specifies at least two existing SAS data sets from which observations are read. You can specify individual data sets, data set lists, or a combination of both.
   Tip: You can specify additional SAS data sets.
   See: “Using Data Set Lists with MERGE” on page 1630
(data-set-options)
   specifies one or more SAS data set options in parentheses after a SAS data set name.
   Explanation: The data set options specify actions that SAS is to take when it reads observations into the DATA step for processing. For a list of data set options, see “Data Set Options by Category” on page 12.
   Tip: Data set options that apply to a data set list apply to all of the data sets in the list.
END=variable
   names and creates a temporary variable that contains an end-of-file indicator.
   Explanation: The variable, which is initialized to 0, is set to 1 when the MERGE statement processes the last observation. If the input data sets have different numbers of observations, the END= variable is set to 1 when MERGE processes the last observation from all data sets.
   Tip: The END= variable is not added to any SAS data set that is being created.
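The END= argument can be sketched with a match-merge that reports a count on the final iteration. The data sets ONE and TWO, the BY variable ID, and the variables LAST and COUNT are hypothetical names for this illustration. The sketch assumes that both input data sets exist and are sorted by ID.

```sas
/* ONE and TWO are hypothetical data sets, each sorted by ID. */
data combined;
   merge one two end=last;
   by id;
   count+1;                   /* running count of iterations  */
   if last then put 'Observations written: ' count;
   drop count;                /* keep the counter out of COMBINED */
run;
```

LAST is not listed in a DROP statement because, as noted above, the END= variable is not added to the data set that is being created.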
Details

Overview
The MERGE statement is flexible and has a variety of uses in SAS programming. This section describes basic uses of MERGE. Other applications include using more than one BY variable, merging more than two data sets, and merging a few observations with all observations in another data set. For more information, see “How to Prepare Your Data Sets” in SAS Language Reference: Concepts.
Using Data Set Lists with MERGE
You can use data set lists with the MERGE statement. Data set lists provide a quick way to reference existing groups of data sets. These data set lists must be either name prefix lists or numbered range lists.

Name prefix lists refer to all data sets that begin with a specified character string. For example, merge SALES1:; tells SAS to merge all data sets starting with "SALES1" such as SALES1, SALES10, SALES11, and SALES12.

Numbered range lists require you to have a series of data sets with the same name, except for the last character or characters, which are consecutive numbers. In a numbered range list, you can begin with any number and end with any number. For example, these lists refer to the same data sets:

sales1 sales2 sales3 sales4
sales1-sales4

Note: If the numeric suffix of the first data set name contains leading zeros, the number of digits in the numeric suffix of the last data set name must be greater than or equal to the number of digits in the first data set name; otherwise, an error will occur. For example, the data set lists sales001-sales99 and sales01-sales9 will cause an error. The data set list sales001-sales999 is valid. If the numeric suffix of the first data set name does not contain leading zeros, the number of digits in the numeric suffix of the first and last data set names do not have to be equal. For example, the data set list sales1-sales999 is valid.

Some other rules to consider when using numbered data set lists are as follows:

- You can specify groups of ranges.

   merge cost1-cost4 cost11-cost14 cost21-cost24;

- You can mix numbered range lists with name prefix lists.

   merge cost1-cost4 cost2: cost33-cost37;

- You can mix single data sets with data set lists.

   merge cost1 cost10-cost20 cost30;

- Quotation marks around data set lists are ignored.

   /* these two lines are the same */
   merge sales1-sales4;
   merge 'sales1'n-'sales4'n;

- Spaces in data set names are invalid. If quotation marks are used, trailing blanks are ignored.

   /* blanks in these statements will cause errors */
   merge sales 1-sales 4;
   merge 'sales 1'n - 'sales 4'n;
   /* trailing blanks in this statement will be ignored */
   merge 'sales1 'n - 'sales4 'n;

- The maximum numeric suffix is 2147483647.

   /* this suffix will cause an error */
   merge prod2000000000-prod2934850239;

- Physical pathnames are not allowed.

   /* physical pathnames will cause an error */
   %let work_path = %sysfunc(pathname(WORK));
   merge "&work_path\dept.sas7bdat"-"&work_path\emp.sas7bdat";
One-to-One Merging
One-to-one merging combines observations from two or more SAS data sets into a single observation in a new data set. To perform a one-to-one merge, use the MERGE statement without a BY statement. SAS combines the first observation from all data sets that are named in the MERGE statement into the first observation in the new data set, the second observation from all data sets into the second observation in the new data set, and so on. In a one-to-one merge, the number of observations in the new data set is equal to the number of observations in the largest data set named in the MERGE statement. See Example 1 for an example of a one-to-one merge. For more information, see “Reading, Combining, and Modifying SAS Data Sets” in SAS Language Reference: Concepts.

CAUTION: Use care when you combine data sets with a one-to-one merge. One-to-one merges can sometimes produce undesirable results. Test your program on representative samples of the data sets before you use this method.
Match-Merging
Match-merging combines observations from two or more SAS data sets into a single observation in a new data set according to the values of a common variable. The number of observations in the new data set is the sum of the largest number of observations in each BY group in all data sets. To perform a match-merge, use a BY statement immediately after the MERGE statement. The variables in the BY statement must be common to all data sets. Only one BY statement can accompany each MERGE statement in a DATA step. The data sets that are listed in the MERGE statement must be sorted in order of the values of the variables that are listed in the BY statement, or they must have an appropriate index. See Example 2 for an example of a match-merge. For more information, see “Reading, Combining, and Modifying SAS Data Sets” in SAS Language Reference: Concepts.

Note: The MERGE statement does not produce a Cartesian product on a many-to-many match-merge. Instead it performs a one-to-one merge while there are observations in the BY group in at least one data set. When all observations in the BY group have been read from one data set and there are still more observations in another data set, SAS performs a one-to-many merge until all observations have been read for the BY group.
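The behavior that the note describes can be sketched with two small data sets that repeat the same BY value. The data set names ONE, TWO, and BOTH and all of the values are hypothetical and are used only for illustration.

```sas
/* Hypothetical data: three observations for ID=1 in ONE, two in TWO */
data one;
   input id a $;
datalines;
1 a1
1 a2
1 a3
;

data two;
   input id b $;
datalines;
1 b1
1 b2
;

/* Both data sets are already in order by ID, so no sorting is needed */
data both;
   merge one two;
   by id;
run;

proc print data=both;
run;
```

In this sketch, BOTH contains three observations for the BY group rather than the six that a Cartesian product would produce. The first two observations are paired one-to-one; for the third, TWO has no more observations in the BY group, so the value of B that was read last (b2) is carried forward.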
Comparisons

- MERGE combines observations from two or more SAS data sets. UPDATE combines observations from exactly two SAS data sets. UPDATE changes or updates the values of selected observations in a master data set as well. UPDATE also might add observations.

- Like UPDATE, MODIFY combines observations from two SAS data sets by changing or updating values of selected observations in a master data set.

- The results that are obtained by reading observations using two or more SET statements are similar to the results that are obtained by using the MERGE statement with no BY statement. However, with the SET statements, SAS stops processing before all observations are read from all data sets if the numbers of observations are not equal. In contrast, SAS continues processing all observations in all data sets named in the MERGE statement.
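The difference between multiple SET statements and MERGE without a BY statement can be sketched as follows. The data sets A, B, TWO_SETS, and MERGED are hypothetical names chosen for this illustration.

```sas
/* Hypothetical input: A has three observations, B has two */
data a;
   do x = 1 to 3;
      output;
   end;
run;

data b;
   do y = 10 to 20 by 10;
      output;
   end;
run;

/* The step stops when the second SET statement reaches the end of B */
data two_sets;
   set a;
   set b;
run;

/* MERGE continues until every data set has been read completely */
data merged;
   merge a b;
run;
```

In this sketch, TWO_SETS contains two observations because the DATA step ends as soon as one SET statement finds no more observations. MERGED contains three observations, and Y is missing in the third.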
Examples

Example 1: One-to-One Merging
This example shows how to combine observations from two data sets into a single observation in a new data set:

data benefits.qtr1;
   merge benefits.jan benefits.feb;
run;

Example 2: Match-Merging
This example shows how to combine observations from two data sets into a single observation in a new data set according to the values of a variable that is specified in the BY statement:

data inventry;
   merge stock orders;
   by partnum;
run;

Example 3: Merging with a Data Set List
This example uses a data set list to define the data sets that are merged.

data d008; job=3; emp=19; run;
data d009; job=3; sal=50; run;
data d010; job=4; emp=97; run;
data d011; job=4; sal=15; run;
data comb;
   merge d008-d011;
   by job;
run;
proc print data=comb;
run;
See Also Statements: “BY Statement” on page 1403 “MODIFY Statement” on page 1634 “SET Statement” on page 1711 “UPDATE Statement” on page 1733 “Reading, Combining, and Modifying SAS Data Sets” in SAS Language Reference: Concepts
MISSING Statement
Assigns characters in your input data to represent special missing values for numeric data.
Valid: anywhere
Category: Information
Syntax MISSING character(s);
Arguments

character
   is the value in your input data that represents a special missing value.
   Range: Special missing values can be any of the 26 letters of the alphabet (uppercase or lowercase) or the underscore (_).
   Tip: You can specify more than one character.
Details The MISSING statement usually appears within a DATA step, but it is global in scope.
Comparisons The MISSING= system option allows you to specify a character to be printed when numeric variables contain ordinary missing values (.). If your data contain characters that represent special missing values, such as a or z, do not use the MISSING= option to define them; simply define these values in a MISSING statement.
Examples
With survey data, you might want to identify certain kinds of missing data. For example, in the data, an A can mean that the respondent is not at home at the time of the survey; an R can mean that the respondent refused to answer. Use the MISSING statement to identify to SAS that the values A and R in the input data lines are to be considered special missing values rather than invalid numeric data values:

data survey;
   missing a r;
   input id answer;
datalines;
001 2
002 R
003 1
004 A
005 2
;
The resulting data set SURVEY contains exactly the values that are coded in the input data.
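As a follow-up sketch, special missing values can be tested in later steps by writing them with a leading period (for example, .A). The data set SURVEY_FLAGS and the variables NOT_HOME, REFUSED, and ANY_MISS are hypothetical names introduced only for this illustration.

```sas
/* A minimal sketch that assumes the SURVEY data set created above */
data survey_flags;
   set survey;
   not_home = (answer = .A);      /* 1 when the respondent was not at home */
   refused  = (answer = .R);      /* 1 when the respondent refused to answer */
   any_miss = missing(answer);    /* 1 for ordinary and special missing values */
run;
```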
See Also Statement: “UPDATE Statement” on page 1733 System Option: “MISSING= System Option” on page 1886
MODIFY Statement
Replaces, deletes, and appends observations in an existing SAS data set in place but does not create an additional copy.
Valid: in a DATA step
Category: File-handling
Type: Executable
Restriction: Cannot modify the descriptor portion of a SAS data set, such as adding a variable
Syntax
(1) MODIFY master-data-set <(data-set-options)> transaction-data-set <(data-set-options)> <NOBS=variable> <END=variable> <UPDATEMODE=MISSINGCHECK | NOMISSINGCHECK>;
    BY by-variable;
(2) MODIFY master-data-set <(data-set-options)> KEY=index </ UNIQUE> <NOBS=variable> <END=variable>;
(3) MODIFY master-data-set <(data-set-options)> <NOBS=variable> POINT=variable;
(4) MODIFY master-data-set <(data-set-options)> <NOBS=variable> <END=variable>;

CAUTION: Damage to the SAS data set can occur if the system terminates abnormally during a DATA step that contains the MODIFY statement. Observations in native SAS data files might have incorrect data values, or the data file might become unreadable. DBMS tables that are referenced by views are not affected.

Note: If you modify a password-protected data set, specify the password with the appropriate data set option (ALTER= or PW=) within the MODIFY statement, and not in the DATA statement.
Arguments

master-data-set
   specifies the SAS data set that you want to modify.
   Restriction: This data set must also appear in the DATA statement.
   Restriction: The following restrictions apply:

   - For sequential and matching access, the master data set can be a SAS data file, a SAS/ACCESS view, an SQL view, or a DBMS engine for the LIBNAME statement. It cannot be a DATA step view or a pass-through view.

   - For random access using POINT=, the master data set must be a SAS data file or an SQL view that references a SAS data file.

   - For direct access using KEY=, the master data set can be a SAS data file or the DBMS engine for the LIBNAME statement. If it is a SAS file, it must be indexed and the index name must be specified on the KEY= option.
   - For a DBMS, the KEY= is set to the keyword DBKEY and the column names to use as an index must be specified on the DBKEY= data set option. These column names are used in constructing a WHERE expression that is passed to the DBMS.

transaction-data-set
   specifies the SAS data set that provides the values for matching access. These values are the values that you want to use to update the master data set.
   Restriction: Specify this data set only when the DATA step contains a BY statement.

by-variable
   specifies one or more variables by which you identify corresponding observations.

END=variable
   creates and names a temporary variable that contains an end-of-file indicator.
   Explanation: The variable, which is initialized to zero, is set to 1 when the MODIFY statement reads the last observation of the data set being modified (for sequential access (4)) or the last observation of the transaction data set (for matching access (1)). It is also set to 1 when MODIFY cannot find a match for a KEY= value (random access (2) and (3)). This variable is not added to any data set.
   Restriction: Do not use this argument in the same MODIFY statement with the POINT= argument. POINT= indicates that MODIFY uses random access. The value of the END= variable is never set to 1 for random access.

KEY=index
   specifies a simple or composite index of the SAS data file that is being modified. The KEY= argument retrieves observations from that SAS data file based on index values that are supplied by like-named variables in another source of information.
   Default: If the KEY= value is not found, the automatic variable _ERROR_ is set to 1, and the automatic variable _IORC_ receives the value corresponding to the SYSRC autocall macro’s mnemonic _DSENOM. See “Automatic Variable _IORC_ and the SYSRC Autocall Macro” on page 1638.
   Restriction: KEY= processing is different for SAS/ACCESS engines. See the SAS/ACCESS documentation for more information.
   Tip: Examples of sources for index values include a separate SAS data set named in a SET statement and an external file that is read by an INPUT statement.
   Tip: If duplicates exist in the master file, only the first occurrence is updated unless you use a DO loop to execute a SET statement for the data set that is listed on the KEY= option for all duplicates in the master data set. If duplicates exist in the transaction data set, and they are consecutive, use the UNIQUE option to force the search for a match in the master data set to begin at the top of the index. Write an accumulation statement to add each duplicate transaction to the observation in master. Without the UNIQUE option, only the first duplicate transaction observation updates the master. If the duplicates in the transaction data set are not consecutive, the search begins at the beginning of the index each time, so that each duplicate is applied to the master. Write an accumulation statement to add each duplicate to the master.
   See Also: UNIQUE on page 1636
   Featured in: Example 4 on page 1646, Example 5 on page 1646, and Example 6 on page 1648

NOBS=variable
   creates and names a temporary variable whose value is usually the total number of observations in the input data set. For certain SAS views, SAS cannot determine the number of observations. In these cases, SAS sets the value of the NOBS= variable to the largest positive integer value available in the operating environment.
   Explanation: At compilation time, SAS reads the descriptor portion of the data set and assigns the value of the NOBS= variable automatically. Thus, you can refer to the NOBS= variable before the MODIFY statement. The variable is available in the DATA step but is not added to the new data set.
   Tip: The NOBS= and POINT= options are independent of each other.
   Featured in: Example 3 on page 1644

POINT=variable
   reads SAS data sets using random (direct) access by observation number. variable names a variable whose value is the number of the observation to read. The POINT= variable is available anywhere in the DATA step, but it is not added to any SAS data set.
   Requirement: When using the POINT= argument, include one or both of the following programming constructs:
   - a STOP statement
   - programming logic that checks for an invalid value of the POINT= variable
   Because POINT= reads only the specified observations, SAS cannot detect an end-of-file condition as it would if the file were being read sequentially. Because detecting an end-of-file condition terminates a DATA step automatically, failure to substitute another means of terminating the DATA step when you use POINT= can cause the DATA step to go into a continuous loop.
   Restriction: You cannot use the POINT= option with any of the following:
   - BY statement
   - WHERE statement
   - WHERE= data set option
   - transport format data sets
   - sequential data sets (on tape or disk)
   - a table from another vendor’s relational database management system.
   Restriction: You can use POINT= with compressed data sets only if the data set was created with the POINTOBS= data set option set to YES, the default value.
   Restriction: You can use the random access method on compressed files only with SAS version 7 and beyond.
   Tip: If the POINT= value does not match an observation number, SAS sets the automatic variable _ERROR_ to 1.
   Featured in: Example 3 on page 1644

UNIQUE
   causes a KEY= search always to begin at the top of the index for the data file being modified.
   Restriction: UNIQUE can appear only with the KEY= option.
   Tip: Use UNIQUE when there are consecutive duplicate KEY= values in the transaction data set, so that the search for a match in the master data set begins at the top of the index file for each duplicate transaction. You must include an accumulation statement or the duplicate values overwrite each other, causing only the last transaction value to be the result in the master observation.
   Featured in: Example 5 on page 1646

UPDATEMODE=MISSINGCHECK | NOMISSINGCHECK
   specifies whether missing variable values in a transaction data set are to be allowed to replace existing variable values in a master data set.
   MISSINGCHECK
      prevents missing variable values in a transaction data set from replacing values in a master data set.
   NOMISSINGCHECK
      allows missing variable values in a transaction data set to replace values in a master data set by preventing the check from being performed.
   Default: MISSINGCHECK
   Requirement: The UPDATEMODE argument must be accompanied by a BY statement that specifies the variables by which observations are matched.
   Tip: Special missing values are the exception: they replace values in the master data set even when MISSINGCHECK is in effect.
Details

(1) Matching Access
The matching access method uses the BY statement to match observations from the transaction data set with observations in the master data set. The BY statement specifies a variable that is in the transaction data set and the master data set. When the MODIFY statement reads an observation from the transaction data set, it uses dynamic WHERE processing to locate the matching observation in the master data set. The observation in the master data set can be

- replaced in the master data set with the value from the transaction data set
- deleted from the master data set
- appended to the master data set.

Example 2 on page 1643 shows the matching access method.

(1) Duplicate BY Values
Duplicates in the master and transaction data sets affect processing.

- If duplicates exist in the master data set, only the first occurrence is updated because the generated WHERE statement always finds the first occurrence in the master.

- If duplicates exist in the transaction data set, the duplicates are applied one on top of another unless you write an accumulation statement to add all of them to the master observation. Without the accumulation statement, the values in the duplicates overwrite each other so that only the value in the last transaction is the result in the master observation.
(2) Direct Access by Indexed Values
This method requires that you use the KEY= option in the MODIFY statement to name an indexed variable from the data set that is being modified. Use another data source (typically a SAS data set named in a SET statement or an external file read by an INPUT statement) to provide a like-named variable whose values are supplied to the index. MODIFY uses the index to locate observations in the data set that is being modified. Example 4 on page 1646 shows the direct-access-by-indexed-values method.

(2) Duplicate Index Values

- If there are duplicate values of the indexed variable in the master data set, only the first occurrence is retrieved, modified, or replaced. Use a DO loop to execute a SET statement with the KEY= option multiple times to update all duplicates with the transaction value.

- If there are duplicate, nonconsecutive values in the like-named variable in the data source, MODIFY applies each transaction cumulatively to the first observation in the master data set whose index value matches the values from the data source. Therefore, only the value in the last duplicate transaction is the result in the master observation unless you write an accumulation statement to accumulate each duplicate transaction value in the master observation.

- If there are duplicate, consecutive values in the variable in the data source, the values from the first observation in the data source are applied to the master data set, but the DATA step terminates with an error when it tries to locate an observation in the master data set for the second duplicate from the data source. To avoid this error, use the UNIQUE option in the MODIFY statement. The UNIQUE option causes SAS to return to the top of the master data set before retrieving a match for the index value. You must write an accumulation statement to accumulate the values from all the duplicates. If you do not, only the last one applied is the result in the master observation. Example 5 on page 1646 shows how to handle duplicate index values.

- If there are duplicate index values in both data sets, you can use SQL to apply the duplicates in the transaction data set to the duplicates in the master data set in a one-to-one correspondence.
(3) Direct (Random) Access by Observation Number
You can use the POINT= option in the MODIFY statement to name a variable from another data source (not the master data set), whose value is the number of an observation that you want to modify in the master data set. MODIFY uses the values of the POINT= variable to retrieve observations in the data set that you are modifying. (You can use POINT= on a compressed data set only if the data set was created with the POINTOBS= data set option.) It is good programming practice to validate the value of the POINT= variable and to check the status of the automatic variable _ERROR_. Example 3 on page 1644 shows the direct (random) access by observation number method.

CAUTION: POINT= can result in infinite looping. Be careful when you use POINT=, as failure to terminate the DATA step can cause the DATA step to go into a continuous loop. Use a STOP statement, programming logic that checks for an invalid value of the POINT= variable, or both.
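The following sketch shows one way to combine both safeguards: a range check on the POINT= variable and a STOP statement. It assumes the INVTY.STOCK data set that is described in “Input Data Set for Examples”; the variable names OBSNUM and TOTAL and the observation numbers 2, 5, and 99 are illustrative assumptions, not part of this reference.

```sas
/* A minimal sketch, assuming INVTY.STOCK exists and has 10 observations */
data invty.stock;
   do obsnum = 2, 5, 99;                       /* hypothetical observation numbers */
      /* TOTAL is assigned at compilation time, so it can be tested here */
      if 1 <= obsnum <= total then do;
         modify invty.stock point=obsnum nobs=total;
         recdate = today();
         replace;                              /* rewrite the observation in place */
      end;
      else put 'NOTE: Skipping invalid observation number. ' obsnum=;
   end;
   stop;                                       /* POINT= never detects end of file */
run;
```

Because the range check runs before MODIFY executes, an out-of-range value such as 99 is reported in the log instead of setting _ERROR_ to 1. Neither OBSNUM nor TOTAL is added to INVTY.STOCK.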
(4) Sequential Access
The sequential access method is the simplest form of the MODIFY statement, but it provides less control than the direct access methods. With the sequential access method, you can use the NOBS= and END= options to modify a data set; you do not use the POINT= or KEY= options.

Preparing Your Data Sets before Using MODIFY
There are a number of things you can do to improve performance and get the results you want when using the MODIFY statement. For more information, see “Combining SAS Data Sets: Basic Concepts” in SAS Language Reference: Concepts.
Automatic Variable _IORC_ and the SYSRC Autocall Macro
The automatic variable _IORC_ contains the return code for each I/O operation that the MODIFY statement attempts to perform. The best way to test for values of _IORC_ is with the mnemonic codes that are provided by the SYSRC autocall macro. Each mnemonic code describes one condition. The mnemonics provide an easy method for testing problems in a DATA step program. These codes are useful:

_DSENMR
   specifies that the transaction data set observation does not exist on the master data set (used only with MODIFY and BY statements). If consecutive observations with different BY values do not find a match in the master data set, both of them return _DSENMR.

_DSEMTR
   specifies that multiple transaction data set observations with a given BY value do not exist on the master data set (used only with MODIFY and BY statements). If consecutive observations with the same BY values do not find a match in the master data set, the first observation returns _DSENMR and the subsequent observations return _DSEMTR.

_DSENOM
   specifies that the data set being modified does not contain the observation that is requested by the KEY= option or the POINT= option.

_SENOCHN
   specifies that SAS is attempting to execute an OUTPUT or REPLACE statement on an observation that contains a key value which duplicates one already existing on an indexed data set that requires unique key values.

_SOK
   specifies that the observation was located.

Note: The IORCMSG function returns a formatted error message associated with the current value of _IORC_.

Example 6 on page 1648 shows how to use the automatic variable _IORC_ and the SYSRC autocall macro.
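As a brief sketch of the pattern, the mnemonics can be tested in a SELECT group. The sketch assumes the master data set INVTY.STOCK and the transaction data set ADDINV (with the variables PARTNO and NWSTOCK) that the examples in this section use; the log messages are illustrative only.

```sas
/* A minimal sketch of testing _IORC_ with the SYSRC autocall macro */
data invty.stock;
   modify invty.stock addinv;
   by partno;
   select (_iorc_);
      when (%sysrc(_sok)) do;                       /* match found in the master */
         instock = instock + nwstock;
         replace;
      end;
      when (%sysrc(_dsenmr), %sysrc(_dsemtr)) do;   /* no match in the master */
         put 'NOTE: No master observation for ' partno=;
         _error_ = 0;                               /* reset the error flag */
      end;
      otherwise do;                                 /* any other return code */
         put 'ERROR: Unexpected I/O condition. ' _iorc_=;
         stop;
      end;
   end;
run;
```

Testing the mnemonics rather than numeric values keeps the program independent of the actual return code values.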
Writing Observations When MODIFY Is Used in a DATA Step
The way SAS writes observations to a SAS data set when the DATA step contains a MODIFY statement depends on whether certain other statements are present. The possibilities are

no explicit statement
   writes the current observation to its original place in the SAS data set. The action occurs as the last action in the step (as if a REPLACE statement were the last statement in the step).

OUTPUT statement
   if no data set is specified in the OUTPUT statement, writes the current observation to the end of all data sets that are specified in the DATA step. If a data set is specified, the statement writes the current observation to the end of the data set that is indicated. The action occurs at the point in the DATA step where the OUTPUT statement appears.

REPLACE statement
   rewrites the current observation in the specified data set or data sets, or, if no argument is specified, rewrites the current observation in each data set specified on the DATA statement. The action occurs at the point of the REPLACE statement.

REMOVE statement
   deletes the current observation in the specified data set or data sets, or, if no argument is specified, deletes the current observation in each data set specified on the DATA statement. The deletion can be a physical one or a logical one, depending on the characteristics of the engine that maintains the data set.

Remember the following as you work with these statements:
- When no OUTPUT, REPLACE, or REMOVE statement is specified, the default action is REPLACE.

- The OUTPUT, REPLACE, and REMOVE statements are independent of each other. You can code multiple OUTPUT, REPLACE, and REMOVE statements to apply to one observation. However, once an OUTPUT, REPLACE, or REMOVE statement executes, the MODIFY statement must execute again before the next REPLACE or REMOVE statement executes. You can use OUTPUT and REPLACE in the following example of conditional logic because only one of the REPLACE or OUTPUT statements executes per observation:

   data master;
      modify master trans;
      by key;
      if _iorc_=0 then replace;
      else output;
   run;

  But you should not use multiple REPLACE operations on the same observation as in this example:

   data master;
      modify master;
      x=1;
      replace;
      replace;
   run;

  You can code multiple OUTPUT statements per observation. However, be careful when you use multiple OUTPUT statements. It is possible to go into an infinite loop with just one OUTPUT statement.

   data master;
      modify master;
      output;
   run;

- Using OUTPUT, REPLACE, or REMOVE in a DATA step overrides the default replacement of observations. If you use any one of these statements in a DATA step, you must explicitly program each action that you want to take.

- If both an OUTPUT statement and a REPLACE or REMOVE statement execute on a given observation, perform the OUTPUT action last to keep the position of the observation pointer correct.

Example 7 on page 1649 shows how to use the OUTPUT, REMOVE, and REPLACE statements to write observations.
Missing Values and the MODIFY Statement By default, the UPDATEMODE=MISSINGCHECK option is in effect, so missing values in the transaction data set do not replace existing values in the master data set. Therefore, if you want to update some but not all variables and if the variables that you want to update differ from one observation to the next, set to missing those variables that are not changing. If you want missing values in the transaction data set to replace existing values in the master data set, use UPDATEMODE=NOMISSINGCHECK.
Even when UPDATEMODE=MISSINGCHECK is in effect, you can replace existing values with missing values by using special missing value characters in the transaction data set. To create the transaction data set, use the MISSING statement in the DATA step. If you define one of the special missing values A through Z for the transaction data set, SAS updates numeric variables in the master data set to that value. If you want the resulting value in the master data set to be a regular missing value, use a single underscore (_) to represent missing values in the transaction data set. The resulting value in the master data set will be a period (.) for missing numeric values and a blank for missing character values. For more information about defining and using special missing value characters, see “MISSING Statement” on page 1632.
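The rules above can be sketched with a small transaction data set. The sketch assumes the INVTY.STOCK data set that is described in “Input Data Set for Examples”; the data set name TRANS and its values are hypothetical and are used only for illustration.

```sas
/* Hypothetical transaction data: a period is an ordinary missing value, */
/* and an underscore is read as the special missing value ._             */
data trans;
   missing _;
   input partno $ instock price;
datalines;
K89R . 250.00
M4J7 _ .
;

/* UPDATEMODE=MISSINGCHECK is the default, so it is not specified here */
data invty.stock;
   modify invty.stock trans;
   by partno;
run;
```

Under the rules described above, the ordinary missing values leave INSTOCK for K89R and PRICE for M4J7 unchanged, PRICE for K89R is replaced with 250.00, and the underscore should cause INSTOCK for M4J7 to become an ordinary missing value.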
Using MODIFY with Data Set Options
If you use data set options (such as KEEP=) in your program, then use the options in the MODIFY statement for the master data set. Using data set options in the DATA statement might produce unexpected results.
Using MODIFY in a SAS/SHARE Environment
In a SAS/SHARE environment, the MODIFY statement accesses an observation in update mode. That is, the observation is locked from the time MODIFY reads it until a REPLACE or REMOVE statement executes. At that point the observation is unlocked. It cannot be accessed until it is re-read with the MODIFY statement. The MODIFY statement opens the data set in update mode, but the control level is based on the statement used. For example, KEY= and POINT= are member-level locking. Refer to SAS/SHARE User’s Guide for more information.
Comparisons

- When you use a MERGE, SET, or UPDATE statement in a DATA step, SAS creates a new SAS data set. The data set descriptor of the new copy can be different from the old one (variables added or deleted, labels changed, and so on). When you use a MODIFY statement in a DATA step, however, SAS does not create a new copy of the data set. As a result, the data set descriptor cannot change. For information on DBMS replacement rules, see the SAS/ACCESS documentation.

- If you use a BY statement with a MODIFY statement, MODIFY works much like the UPDATE statement, except that

   - neither the master data set nor the transaction data set needs to be sorted or indexed. (The BY statement that is used with MODIFY triggers dynamic WHERE processing.)

     Note: Dynamic WHERE processing can be costly if the MODIFY statement modifies a SAS data set that is not in sorted order or has not been indexed. Having the master data set in sorted order or indexed and having the transaction data set in sorted order reduces processing overhead, especially for large files.

   - both the master data set and the transaction data set can have observations with duplicate values of the BY variables. MODIFY treats the duplicates as described in “Duplicate BY Values” on page 1637.

   - MODIFY cannot make any changes to the descriptor information of the data set as UPDATE can. Thus, it cannot add or delete variables, change variable labels, and so on.
Input Data Set for Examples
The examples modify the INVTY.STOCK data set. INVTY.STOCK contains these variables:

PARTNO
   is a character variable with a unique value identifying each tool number.
DESC
   is a character variable with the text description of each tool.
INSTOCK
   is a numeric variable with a value describing how many units of each tool the company has in stock.
RECDATE
   is a numeric variable containing the SAS date value that is the day for which INSTOCK values are current.
PRICE
   is a numeric variable with a value that describes the unit price for each tool.

In addition, INVTY.STOCK contains a simple index on PARTNO. This DATA step creates INVTY.STOCK:

libname invty 'SAS-library';
options yearcutoff= 1920;
data invty.stock(index=(partno));
   input PARTNO $ DESC $ INSTOCK @17
         RECDATE date7. @25 PRICE;
   format recdate date7.;
datalines;
K89R seal   34  27jul95 245.00
M4J7 sander 98  20jun95 45.88
LK43 filter 121 19may96 10.99
MN21 brace  43  10aug96 27.87
BC85 clamp  80  16aug96 9.55
NCF3 valve  198 20mar96 24.50
KJ66 cutter 6   18jun96 19.77
UYN7 rod    211 09sep96 11.55
JD03 switch 383 09jan97 13.99
BV1E timer  26  03jan97 34.50
;
Examples

Example 1: Modifying All Observations

This example replaces the value of the variable RECDATE with the current date for all observations in INVTY.STOCK:

data invty.stock;
   modify invty.stock;
   recdate=today();
run;

proc print data=invty.stock noobs;
   title 'INVTY.STOCK';
run;
Output 6.20 Results of Updating the RECDATE Field

                  INVTY.STOCK                  1

PARTNO    DESC      INSTOCK    RECDATE     PRICE
K89R      seal           34    14MAR97    245.00
M4J7      sander         98    14MAR97     45.88
LK43      filter        121    14MAR97     10.99
MN21      brace          43    14MAR97     27.87
BC85      clamp          80    14MAR97      9.55
NCF3      valve         198    14MAR97     24.50
KJ66      cutter          6    14MAR97     19.77
UYN7      rod           211    14MAR97     11.55
JD03      switch        383    14MAR97     13.99
BV1E      timer          26    14MAR97     34.50
The MODIFY statement opens INVTY.STOCK for update processing. SAS reads one observation of INVTY.STOCK for each iteration of the DATA step and performs any operations that the code specifies. In this case, the code replaces the value of RECDATE with the result of the TODAY function for every iteration of the DATA step. An implicit REPLACE statement at the end of the step writes each observation to its previous location in INVTY.STOCK.
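As a contrast to the implicit REPLACE, the following sketch rewrites only selected observations. It assumes that INVTY.STOCK has been created as shown earlier; the scratch data set WORK.STOCK_COPY and the threshold of 50 units are hypothetical choices made for this illustration so that the master data set itself is not changed.

```sas
/* Work on a scratch copy so that INVTY.STOCK stays as the examples expect */
data work.stock_copy;
   set invty.stock;
run;

/* Because an explicit REPLACE is present, no implicit REPLACE is generated.
   Only the observations that satisfy the condition are rewritten in place. */
data work.stock_copy;
   modify work.stock_copy;
   if instock < 50 then
      do;
         recdate=today();
         replace;
      end;
run;

proc print data=work.stock_copy noobs;
   title 'WORK.STOCK_COPY';
run;
```

Observations that do not meet the condition are read but never written back, so their RECDATE values are left as they were.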
Example 2: Modifying Observations Using a Transaction Data Set
This example adds the quantity of newly received stock to the data set INVTY.STOCK and updates the date on which stock was received. The transaction data set ADDINV in the WORK library contains the new data. The ADDINV data set is the data set that contains the updated information. ADDINV contains these variables:

PARTNO
   is a character variable that corresponds to the indexed variable PARTNO in INVTY.STOCK.
NWSTOCK
   is a numeric variable that represents quantities of newly received stock for each tool.

ADDINV is the second data set in the MODIFY statement. SAS uses it as the transaction data set and reads each observation from ADDINV sequentially. Because the BY statement specifies the common variable PARTNO, MODIFY finds the first occurrence of the value of PARTNO in INVTY.STOCK that matches the value of PARTNO in ADDINV. For each observation with a matching value, the DATA step changes the value of RECDATE to today’s date and replaces the value of INSTOCK with the sum of INSTOCK and NWSTOCK (from ADDINV). MODIFY does not add NWSTOCK to the INVTY.STOCK data set because that would modify the data set descriptor. Thus, it is not necessary to put NWSTOCK in a DROP statement.

This example specifies ADDINV as the transaction data set that contains information to modify INVTY.STOCK. A BY statement specifies the shared variable whose values locate the observations in INVTY.STOCK. This DATA step creates ADDINV:

data addinv;
   input PARTNO $ NWSTOCK;
   datalines;
K89R 55
M4J7 21
LK43 43
MN21 73
BC85 57
NCF3 90
KJ66 2
UYN7 108
JD03 55
BV1E 27
;
This DATA step uses values from ADDINV to update INVTY.STOCK.

libname invty 'SAS-library';

data invty.stock;
   modify invty.stock addinv;
   by partno;
   RECDATE=today();
   INSTOCK=instock+nwstock;
   if _iorc_=0 then replace;
run;

proc print data=invty.stock noobs;
   title 'INVTY.STOCK';
run;
Output 6.21 Results of Updating the INSTOCK and RECDATE Fields

                  INVTY.STOCK                  1

PARTNO    DESC      INSTOCK    RECDATE     PRICE
K89R      seal           89    14MAR97    245.00
M4J7      sander        119    14MAR97     45.88
LK43      filter        164    14MAR97     10.99
MN21      brace         116    14MAR97     27.87
BC85      clamp         137    14MAR97      9.55
NCF3      valve         288    14MAR97     24.50
KJ66      cutter          8    14MAR97     19.77
UYN7      rod           319    14MAR97     11.55
JD03      switch        438    14MAR97     13.99
BV1E      timer          53    14MAR97     34.50
Example 3: Modifying Observations Located by Observation Number

This example reads the data set NEWP, determines which observation number in INVTY.STOCK to update based on the value of TOOL_OBS, and performs the update. This example explicitly specifies the update activity by using an assignment statement to replace the value of PRICE with the value of NEWP. The data set NEWP contains two variables:

TOOL_OBS
   contains the observation number of each tool in the tool company’s master data set, INVTY.STOCK.
NEWP
   contains the new price for each tool.

This DATA step creates NEWP:

data newp;
   input TOOL_OBS NEWP;
   datalines;
1 251.00
2 49.33
3 12.32
4 30.00
5 15.00
6 25.75
7 22.00
8 14.00
9 14.32
10 35.00
;

This DATA step updates INVTY.STOCK:

libname invty 'SAS-library';

data invty.stock;
   set newp;
   modify invty.stock point=tool_obs nobs=max_obs;
   if _error_=1 then
      do;
         put 'ERROR occurred for TOOL_OBS=' tool_obs /
             'during DATA step iteration' _n_ /
             'TOOL_OBS value may be out of range.';
         _error_=0;
         stop;
      end;
   PRICE=newp;
   RECDATE=today();
run;

proc print data=invty.stock noobs;
   title 'INVTY.STOCK';
run;
Output 6.22 Results of Updating the RECDATE and PRICE Fields

                  INVTY.STOCK                  1

PARTNO    DESC      INSTOCK    RECDATE     PRICE
K89R      seal           34    14MAR97    251.00
M4J7      sander         98    14MAR97     49.33
LK43      filter        121    14MAR97     12.32
MN21      brace          43    14MAR97     30.00
BC85      clamp          80    14MAR97     15.00
NCF3      valve         198    14MAR97     25.75
KJ66      cutter          6    14MAR97     22.00
UYN7      rod           211    14MAR97     14.00
JD03      switch        383    14MAR97     14.32
BV1E      timer          26    14MAR97     35.00
Example 4: Modifying Observations Located by an Index
This example uses the KEY= option to identify observations to retrieve by matching the values of PARTNO from ADDINV with the indexed values of PARTNO in INVTY.STOCK. ADDINV is created in Example 2 on page 1643. KEY= supplies index values that allow MODIFY to access directly the observations to update. No dynamic WHERE processing occurs. In this example, you specify that the value of INSTOCK in the master data set INVTY.STOCK increases by the value of the variable NWSTOCK from the transaction data set ADDINV.

libname invty 'SAS-library';

data invty.stock;
   set addinv;
   modify invty.stock key=partno;
   INSTOCK=instock+nwstock;
   RECDATE=today();
   if _iorc_=0 then replace;
run;

proc print data=invty.stock noobs;
   title 'INVTY.STOCK';
run;
Output 6.23 Results of Updating the INSTOCK and RECDATE Fields by Using an Index

                  INVTY.STOCK                  1

PARTNO    DESC      INSTOCK    RECDATE     PRICE
K89R      seal           89    14MAR97    245.00
M4J7      sander        119    14MAR97     45.88
LK43      filter        164    14MAR97     10.99
MN21      brace         116    14MAR97     27.87
BC85      clamp         137    14MAR97      9.55
NCF3      valve         288    14MAR97     24.50
KJ66      cutter          8    14MAR97     19.77
UYN7      rod           319    14MAR97     11.55
JD03      switch        438    14MAR97     13.99
BV1E      timer          53    14MAR97     34.50
Example 5: Handling Duplicate Index Values
This example shows how MODIFY handles duplicate values of the variable in the SET data set that is supplying values to the index on the master data set. The NEWINV data set is the data set that contains the updated information. NEWINV contains these variables:

PARTNO
   is a character variable that corresponds to the indexed variable PARTNO in INVTY.STOCK. The NEWINV data set contains duplicate values for PARTNO; M4J7 appears twice.
NWSTOCK
   is a numeric variable that represents quantities of newly received stock for each tool.

This DATA step creates NEWINV:

data newinv;
   input PARTNO $ NWSTOCK;
   datalines;
K89R 55
M4J7 21
M4J7 26
LK43 43
MN21 73
BC85 57
NCF3 90
KJ66 2
UYN7 108
JD03 55
BV1E 27
;

This DATA step terminates with an error when it tries to locate an observation in INVTY.STOCK to match with the second occurrence of M4J7 in NEWINV:

libname invty 'SAS-library';

/* This DATA step terminates with an error! */
data invty.stock;
   set newinv;
   modify invty.stock key=partno;
   INSTOCK=instock+nwstock;
   RECDATE=today();
run;

This message appears in the SAS log:

ERROR: No matching observation was found in MASTER data set.
PARTNO=K89R NWSTOCK=55 DESC= INSTOCK=. RECDATE=14MAR97 PRICE=. _ERROR_=1
_IORC_=1230015 _N_=1
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 689:19
NOTE: The SAS System stopped processing this step because of errors.
NOTE: The data set INVTY.STOCK has been updated. There were 0 observations rewritten, 0 observations added and 0 observations deleted.
Adding the UNIQUE option to the MODIFY statement avoids the error in the previous DATA step. The UNIQUE option causes the DATA step to return to the top of the index each time it looks for a match for the value from the SET data set. Thus, it finds the M4J7 in the MASTER data set for each occurrence of M4J7 in the SET data set. The updated result for M4J7 in the output shows that both values of NWSTOCK from NEWINV for M4J7 are added to the value of INSTOCK for M4J7 in INVTY.STOCK. An accumulation statement sums the values; without it, only the value of the last instance of M4J7 would be the result in INVTY.STOCK.

data invty.stock;
   set newinv;
   modify invty.stock key=partno / unique;
   INSTOCK=instock+nwstock;
   RECDATE=today();
   if _iorc_=0 then replace;
run;

proc print data=invty.stock noobs;
   title 'Results of Using the UNIQUE Option';
run;
Output 6.24 Results of Updating the INSTOCK and RECDATE Fields by Using the UNIQUE Option

       Results of Using the UNIQUE Option      1

PARTNO    DESC      INSTOCK    RECDATE     PRICE
K89R      seal           89    14MAR97    245.00
M4J7      sander        145    14MAR97     45.88
LK43      filter        164    14MAR97     10.99
MN21      brace         116    14MAR97     27.87
BC85      clamp         137    14MAR97      9.55
NCF3      valve         288    14MAR97     24.50
KJ66      cutter          8    14MAR97     19.77
UYN7      rod           319    14MAR97     11.55
JD03      switch        438    14MAR97     13.99
BV1E      timer          53    14MAR97     34.50
Example 6: Controlling I/O
This example uses the SYSRC autocall macro and the _IORC_ automatic variable to control I/O conditions. This technique helps to prevent unexpected results that could go undetected. This example uses the direct access method with an index to update INVTY.STOCK. The data in the NEWSHIP data set updates INVTY.STOCK. This DATA step creates NEWSHIP:

options yearcutoff= 1920;

data newship;
   input PARTNO $ DESC $ NWSTOCK @17 SHPDATE date7. @25 NWPRICE;
   datalines;
K89R seal     14 14nov96 245.00
M4J7 sander   24 23aug96 47.98
LK43 filter   11 29jan97 14.99
MN21 brace     9 09jan97 27.87
BC85 clamp    12 09dec96 10.00
ME34 cutter    8 14nov96 14.50
;

Each WHEN clause in the SELECT statement specifies actions for each input/output return code that is returned by the SYSRC autocall macro:

- _SOK indicates that the MODIFY statement executed successfully.
- _DSENOM indicates that no matching observation was found in INVTY.STOCK. The OUTPUT statement specifies that the observation be appended to INVTY.STOCK. See the last observation in the output.
- If any other code is returned by SYSRC, the DATA step terminates and the PUT statement writes the message to the log.

libname invty 'SAS-library';

data invty.stock;
   set newship;
   modify invty.stock key=partno;
   select (_iorc_);
      when (%sysrc(_sok)) do;
         INSTOCK=instock+nwstock;
         RECDATE=shpdate;
         PRICE=nwprice;
         replace;
      end;
      when (%sysrc(_dsenom)) do;
         INSTOCK=nwstock;
         RECDATE=shpdate;
         PRICE=nwprice;
         output;
         _error_=0;
      end;
      otherwise do;
         put 'An unexpected I/O error has occurred.'/
             'Check your data and your program';
         _error_=0;
         stop;
      end;
   end;
run;

proc print data=invty.stock noobs;
   title 'INVTY.STOCK Data Set';
run;
Output 6.25 The Updated INVTY.STOCK Data Set

              INVTY.STOCK Data Set             1

PARTNO    DESC      INSTOCK    RECDATE     PRICE
K89R      seal           48    14NOV96    245.00
M4J7      sander        122    23AUG96     47.98
LK43      filter        132    29JAN97     14.99
MN21      brace          52    09JAN97     27.87
BC85      clamp          92    09DEC96     10.00
NCF3      valve         198    20MAR96     24.50
KJ66      cutter          6    18JUN96     19.77
UYN7      rod           211    09SEP96     11.55
JD03      switch        383    09JAN97     13.99
BV1E      timer          26    03JAN97     34.50
ME34      cutter          8    14NOV96     14.50
Example 7: Replacing and Removing Observations and Writing Observations to Different SAS Data Sets

This example shows that you can replace and remove (delete) observations and write observations to different data sets. Further, this example shows that if an OUTPUT, REPLACE, or REMOVE statement is present, you must specify explicitly what action to take because no default statement is generated. The parts that were received in 1997 are output to INVTY.STOCK97 and are removed from INVTY.STOCK. Likewise, the parts that were received in 1995 are output to INVTY.STOCK95 and are removed from INVTY.STOCK. Only the parts that were received in 1996 remain in INVTY.STOCK, and the PRICE is updated only in INVTY.STOCK.

libname invty 'SAS-library';

data invty.stock invty.stock95 invty.stock97;
   modify invty.stock;
   if recdate>'01jan97'd then do;
      output invty.stock97;
      remove invty.stock;
   end;
   else if recdate;
Without Arguments

Using OUTPUT without arguments causes the current observation to be written to all data sets that are named in the DATA statement.

Note: If a MODIFY statement is present, OUTPUT with no arguments writes the current observation to the end of the data set that is specified in the MODIFY statement.
Arguments

data-set-name
   specifies the name of a data set to which SAS writes the observation.
   Restriction: All names specified in the OUTPUT statement must also appear in the DATA statement.
   Tip: You can specify up to as many data sets in the OUTPUT statement as you specified in the DATA statement for that DATA step.
Details

When and Where the OUTPUT Statement Writes Observations

The OUTPUT statement tells SAS to write the current observation to a SAS data set immediately, not at the end of the DATA step. If no data set name is specified in the OUTPUT statement, the observation is written to the data set or data sets that are listed in the DATA statement.
Implicit versus Explicit Output
By default, every DATA step contains an implicit OUTPUT statement at the end of each iteration that tells SAS to write observations to the data set or data sets that are being created. Placing an explicit OUTPUT statement in a DATA step overrides the automatic output, and SAS adds an observation to a data set only when an explicit OUTPUT statement is executed. Once you use an OUTPUT statement to write an observation to any one data set, however, there is no implicit OUTPUT statement at the end of the DATA step. In this situation, a DATA step writes an observation to a data set only when an explicit OUTPUT executes. You can use the OUTPUT statement alone or as part of an IF-THEN or SELECT statement or in DO-loop processing.
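The contrast between implicit and explicit output can be sketched as follows. The data set WORK.SCORES, its variables, and the cutoff of 50 are hypothetical and serve only to illustrate the rule described above.

```sas
/* Hypothetical sample data */
data work.scores;
   input name $ score;
   datalines;
Ann 82
Bob 47
Cy 91
;

/* No OUTPUT statement: the implicit OUTPUT at the end of each
   iteration writes every observation */
data work.all_scores;
   set work.scores;
run;

/* An explicit OUTPUT statement is present, so there is no implicit output.
   An observation is written only when the OUTPUT statement executes. */
data work.passing;
   set work.scores;
   if score >= 50 then output;
run;
```

In the second step, an observation whose SCORE value is below the cutoff is read but never written, because the OUTPUT statement does not execute for it.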
When Using the MODIFY Statement
When you use the MODIFY statement with the OUTPUT statement, the REMOVE and REPLACE statements override the implicit write action at the end of each DATA step iteration. See “Comparisons” on page 1654 for more information. If both the OUTPUT statement and a REPLACE or REMOVE statement execute on a given observation, perform the output action last to keep the position of the observation pointer correct.
Comparisons

- OUTPUT writes observations to a SAS data set; PUT writes variable values or text strings to an external file or the SAS log.
- To control when an observation is written to a specified output data set, use the OUTPUT statement. To control which variables are written to a specified output data set, use the KEEP= or DROP= data set option in the DATA statement, or use the KEEP or DROP statement.
- When you use the OUTPUT statement with the MODIFY statement, the following items apply.
   - Using an OUTPUT, REPLACE, or REMOVE statement overrides the default write action at the end of a DATA step. (OUTPUT is the default action; REPLACE becomes the default action when a MODIFY statement is used.) If you use any of these statements in a DATA step, you must explicitly program output for the new observations that are added to the data set.
   - The OUTPUT, REPLACE, and REMOVE statements are independent of each other. More than one statement can apply to the same observation, as long as the sequence is logical.
   - If both an OUTPUT and a REPLACE or REMOVE statement execute on a given observation, perform the OUTPUT action last to keep the position of the observation pointer correct.
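The ordering advice for MODIFY can be sketched in a short DATA step. The data sets WORK.PARTS and WORK.LOWSTOCK, the variable names, and the restocking rule are hypothetical and are used only for this illustration.

```sas
/* Hypothetical sample data */
data work.parts;
   input partno $ instock;
   datalines;
A100 5
B200 40
C300 2
;

/* WORK.PARTS is updated in place; WORK.LOWSTOCK is created as a new data set */
data work.parts work.lowstock;
   modify work.parts;
   if instock < 10 then
      do;
         instock=instock+50;
         replace work.parts;     /* rewrite the observation in place */
         output work.lowstock;   /* OUTPUT executes last, as recommended */
      end;
run;
```

Because REPLACE and OUTPUT appear explicitly, no default write action is generated, so observations that do not satisfy the condition are neither rewritten nor copied.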
Examples

Example 1: Sample Uses of OUTPUT

These examples show how you can use an OUTPUT statement:

- This line of code writes the current observation to a SAS data set.

  output;

- This line of code writes the current observation to a SAS data set when a specified condition is true.

  if deptcode gt 2000 then output;

- This line of code writes an observation to the data set MARKUP when the PHONE value is missing.

  if phone=. then output markup;
Example 2: Creating Multiple Observations from Each Line of Input
You can create two or more observations from each line of input data. This SAS program creates three observations in the data set RESPONSE for each observation in the data set SULFA:

data response(drop=time1-time3);
   set sulfa;
   time=time1;
   output;
   time=time2;
   output;
   time=time3;
   output;
run;

Example 3: Creating Multiple Data Sets from a Single Input File

You can create more than one SAS data set from one input file. In this example, OUTPUT writes observations to two data sets, OZONE and OXIDES:

options yearcutoff= 1920;

data ozone oxides;
   infile file-specification;
   input city $ 1-15 date date9. chemical $ 26-27 ppm 29-30;
   if chemical='O3' then output ozone;
   else output oxides;
run;

Example 4: Creating One Observation from Several Lines of Input

You can combine several input observations into one observation. In this example, OUTPUT creates one observation that totals the values of DEFECTS in the first ten observations of the input data set:

data discards;
   set gadgets;
   drop defects;
   reps+1;
   if reps=1 then total=0;
   total+defects;
   if reps=10 then
      do;
         output;
         stop;
      end;
run;
See Also

Statements:
   “DATA Statement” on page 1416
   “MODIFY Statement” on page 1634
   “PUT Statement” on page 1656
   “REMOVE Statement” on page 1689
   “REPLACE Statement” on page 1692
PAGE Statement

Skips to a new page in the SAS log.

Valid: Anywhere
Category: Log Control

Syntax

PAGE;
Without Arguments The PAGE statement skips to a new page in the SAS log.
Details You can use the PAGE statement when you run SAS in a windowing environment, batch, or noninteractive mode. The PAGE statement itself does not appear in the log. When you run SAS in interactive line mode, PAGE might print blank lines to the display monitor (or to the alternate log file).
See Also Statement: “LIST Statement” on page 1621 System Options: “LINESIZE= System Option” on page 1878 “PAGESIZE= System Option” on page 1900
PUT Statement

Writes lines to the SAS log, to the SAS output window, or to an external location that is specified in the most recent FILE statement.

Valid: in a DATA step
Category: File-handling
Type: Executable

Syntax

PUT <specification(s)><_ODS_>;

Without Arguments

The PUT statement without arguments is called a null PUT statement. The null PUT statement

- writes the current output line to the current location, even if the current output line is blank
- releases an output line that is being held with a trailing @ by a previous PUT statement.

For an example, see Example 5 on page 1670. For more information, see “Using Line-Hold Specifiers” on page 1664.
Arguments

specification(s)
   specifies what is written, how it is written, and where it is written. The specification can include

   variable
      specifies the variable whose value is written.
      Note: Beginning with Version 7, you can specify column-mapped Output Delivery System variables in the PUT statement. This functionality is described briefly here in _ODS_ on page 1658, but documented more completely in PUT Statement for ODS in SAS Output Delivery System: User’s Guide.

   (variable-list)
      specifies a list of variables whose values are written.
      Requirement: The (format-list) must follow the (variable-list).
      See: “PUT Statement, Formatted” on page 1675

   'character-string'
      specifies a string of text, enclosed in quotation marks, to write.
      Tip: To write a hexadecimal string in EBCDIC or ASCII, follow the ending quotation mark with an x.
      Tip: If you use single quotation marks ('') or double quotation marks ("") together (with no space in between them) as the string of text, SAS will output a single quotation mark (') or double quotation mark ("), respectively.
      See Also: “List Output” on page 1662
      Example: This statement writes hello when the hexadecimal string is converted to ASCII characters:

      put '68656C6C6F'x;

   n*
      specifies to repeat n times the subsequent character string.
      Example: This statement writes a line of 132 underscores.

      put 132*'_';

      Featured in: Example 4 on page 1670

   pointer-control
      moves the output pointer to a specified line or column in the output buffer.
      See: “Column Pointer Controls” on page 1659 and “Line Pointer Controls” on page 1660

   column-specifications
      specifies the columns of the output line in which the values are written.
      See: “Column Output” on page 1662
      Featured in: Example 2 on page 1667

   format.
      specifies a format to use when the variable values are written.
      See: “Formatted Output” on page 1662
      Featured in: Example 1 on page 1666

   (format-list)
      specifies a list of formats to use when the values of the preceding list of variables are written.
      Restriction: The (format-list) must follow the (variable-list).
      See: “PUT Statement, Formatted” on page 1675
_INFILE_
   writes the last input data record that is read either from the current input file or from the data lines that follow a DATALINES statement.
   Tip: _INFILE_ is an automatic variable that references the current INPUT buffer. You can use this automatic variable in other SAS statements.
   Tip: If the most recent INPUT statement uses line-pointer controls to read multiple input data records, PUT _INFILE_ writes only the record that the input pointer is positioned on.
   Example: This PUT statement writes all the values of the first input data record:

   input #3 score #1 name $ 6-23;
   put _infile_;

   Featured in: Example 6 on page 1671

_ALL_
   writes the values of all variables, which includes automatic variables, that are defined in the current DATA step by using named output.
   See: “Named Output” on page 1662

_ODS_
   moves data values for all columns (as defined by the ODS option in the FILE statement) into a special buffer, from which it is eventually written to the data component. The ODS option in the FILE statement defines the structure of the data component that holds the results of the DATA step.
   Restriction: Use _ODS_ only if you have previously specified the ODS option in the FILE statement.
   Tip: You can use the _ODS_ specification in conjunction with variable specifications and column pointers, and it can appear anywhere in a PUT statement.
   Interaction: _ODS_ writes data to a specific column only if a PUT statement has not already specified a variable for that column with a column pointer. That is, a variable specification for a column overrides the _ODS_ option.
   See: “PUT Statement for ODS” in SAS Output Delivery System: User’s Guide

@|@@
   holds an output line for the execution of the next PUT statement even across iterations of the DATA step. These line-hold specifiers are called trailing @ and double trailing @.
   Restriction: The trailing @ or double trailing @ must be the last item in the PUT statement.
   Tip: Use an @ or @@ to hold the pointer at its current location. The next PUT statement that executes writes to the same output line rather than to a new output line.
   See: “Using Line-Hold Specifiers” on page 1664
   Featured in: Example 5 on page 1670
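A trailing @ can be sketched with a short DATA step. The input values, the variable names, and the age threshold are hypothetical and are used only to show a second PUT statement continuing a held line.

```sas
data _null_;
   input name $ age;
   put name= @;                  /* trailing @ holds the output line */
   if age >= 18 then
      put 'status=adult';        /* written on the held line */
   else
      put 'status=minor';        /* written on the held line */
   datalines;
Peterson 21
Morgan 17
;
```

Because the second PUT statement has no trailing @, it releases the line, and the next iteration of the DATA step starts a new output line.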
Column Pointer Controls

@n
   moves the pointer to column n.
   Range: a positive integer
   Example: @15 moves the pointer to column 15 before the value of NAME is written:

   put @15 name $10.;

   Featured in: Example 2 on page 1667 and Example 4 on page 1670

@numeric-variable
   moves the pointer to the column given by the value of numeric-variable.
   Range: a positive integer
   Tip: If n is not an integer, SAS truncates the decimal portion and uses only the integer value. If n is zero or negative, the pointer moves to column 1.
   Example: The value of the variable A moves the pointer to column 15 before the value of NAME is written:

   a=15;
   put @a name $10.;

   Featured in: Example 2 on page 1667

@(expression)
   moves the pointer to the column that is given by the value of expression.
   Range: a positive integer
   Tip: If the value of expression is not an integer, SAS truncates the decimal value and uses only the integer value. If it is zero, the pointer moves to column 1.
   Example: The result of the expression moves the pointer to column 15 before the value of NAME is written:

   b=5;
   put @(b*3) name $10.;

+n
   moves the pointer n columns.
   Range: a positive integer or zero
   Tip: If n is not an integer, SAS truncates the decimal portion and uses only the integer value.
   Example: This statement moves the pointer to column 23, writes a value of LENGTH in columns 23 through 26, advances the pointer five columns, and writes the value of WIDTH in columns 32 through 35:

   put @23 length 4. +5 width 4.;

+numeric-variable
   moves the pointer the number of columns given by the value of numeric-variable.
   Range: a positive or negative integer or zero
   Tip: If numeric-variable is not an integer, SAS truncates the decimal value and uses only the integer value. If numeric-variable is negative, the pointer moves backward. If the current column position becomes less than 1, the pointer moves to column 1. If the value is zero, the pointer does not move. If the value is greater than the length of the output buffer, the current line is written out and the pointer moves to column 1 on the next line.

+(expression)
   moves the pointer the number of columns given by expression.
   Range: expression must result in an integer
   Tip: If expression is not an integer, SAS truncates the decimal value and uses only the integer value. If expression is negative, the pointer moves backward. If the current column position becomes less than 1, the pointer moves to column 1. If the value is zero, the pointer does not move. If the value is greater than the length of the output buffer, the current line is written out and the pointer moves to column 1 on the next line.
   Featured in: Example 2 on page 1667
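The column pointer controls above can be tried together in one DATA step. This is a minimal sketch; the variable values are hypothetical, and the first three PUT statements repeat the forms shown in the definitions.

```sas
data _null_;
   name='Peterson';
   a=15;
   b=5;
   put @15 name $10.;         /* absolute column given as a number */
   put @a name $10.;          /* absolute column taken from a variable */
   put @(b*3) name $10.;      /* absolute column computed by an expression */
   put 'ID' +3 name $10.;     /* relative move: skip three columns */
   put name +(-1) '.';        /* relative move backward by one column */
run;
```

The first three statements each start NAME in column 15. The last statement uses a negative expression to back up over the blank that list output leaves after the value, which is the same technique that appears later in the PUT statement that combines output styles.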
Line Pointer Controls

#n
   moves the pointer to line n.
   Range: a positive integer
   Example: The #2 moves the pointer to the second line before the value of ID is written in columns 3 and 4:

   put @12 name $10. #2 id 3-4;

#numeric-variable
   moves the pointer to the line given by the value of numeric-variable.
   Range: a positive integer
   Tip: If the value of numeric-variable is not an integer, SAS truncates the decimal value and uses only the integer value.

#(expression)
   moves the pointer to the line that is given by the value of expression.
   Range: Expression must result in a positive integer.
   Tip: If the value of expression is not an integer, SAS truncates the decimal value and uses only the integer value.

/
   advances the pointer to column 1 of the next line.
   Example: The values for NAME and AGE are written on one line, and then the pointer moves to the second line to write the value of ID in columns 3 and 4:

   put name age / id 3-4;

   Featured in: Example 3 on page 1668

OVERPRINT
   causes the values that follow the keyword OVERPRINT to print on the most recently written output line.
   Requirement: You must direct the output to a file. Set the N= option in the FILE statement to 1 and direct the PUT statements to a file.
   Tip: OVERPRINT has no effect on lines that are written to a display.
   Tip: Use OVERPRINT in combination with column pointer and line pointer controls to overprint text.
   Example: This statement overprints underscores, starting in column 15, which underlines the title:

   put @15 'Report Title' overprint @15 '____________';

   Featured in: Example 4 on page 1670

_BLANKPAGE_
   advances the pointer to the first line of a new page, even when the pointer is positioned on the first line and the first column of a new page.
   Tip: If the current output file contains carriage-control characters, _BLANKPAGE_ produces output lines that contain the appropriate carriage-control character.
   Featured in: Example 3 on page 1668

_PAGE_
   advances the pointer to the first line of a new page. SAS automatically begins a new page when a line exceeds the current PAGESIZE= value.
   Tip: If the current output file is printed, _PAGE_ produces an output line that contains the appropriate carriage-control character. _PAGE_ has no effect on a file that is not printed.
   Tip: If you specify FILE PRINT in an interactive SAS session, then the Output window interprets the form-feed control characters as page breaks, and they are removed from the output. The resulting file is a flat file without page break characters. If a file needs to contain the form-feed characters, then the FILE statement should include a physical file location and the PRINT option.
   Featured in: Example 3 on page 1668
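The two most common line pointer controls can be sketched in separate DATA steps. The variable values are hypothetical; the PUT statements are the ones shown in the definitions above.

```sas
/* The slash advances to column 1 of the next line */
data _null_;
   name='Peterson';
   age=21;
   id=7;
   put name age / id 3-4;
run;

/* #2 moves to the second line of the group before ID is written */
data _null_;
   name='Peterson';
   id=7;
   put @12 name $10. #2 id 3-4;
run;
```

Each step writes two lines to the SAS log: the first line carries NAME (and AGE in the first step), and the second line carries ID in columns 3 and 4.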
Details When to Use PUT Use the PUT statement to write lines to the SAS log, to the SAS output window, or to an external location. If you do not execute a FILE statement before the PUT statement in the current iteration of a DATA step, SAS writes the lines to the SAS log. If you specify the PRINT option in the FILE statement, SAS writes the lines to the SAS output window. The PUT statement can write lines that contain variable values, character strings, and hexadecimal character constants. With specifications in the PUT statement, you specify what to write, where to write it, and how to format it.
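The effect of the FILE statement on the destination of PUT output can be sketched as follows. The message text is hypothetical; only the FILE PRINT behavior described above is being illustrated.

```sas
data _null_;
   /* No FILE statement has executed yet in this iteration,
      so this line is written to the SAS log */
   put 'This line goes to the SAS log.';

   /* FILE PRINT redirects subsequent PUT statements in this
      iteration to the SAS output window */
   file print;
   put 'This line goes to the SAS output window.';
run;
```

A FILE statement that names an external file would redirect the second line to that file instead.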
Output Styles

There are four ways to write variable values with the PUT statement:

- column
- list (simple and modified)
- formatted
- named

A single PUT statement might contain any or all of the available output styles, depending on how you want to write lines.
Column Output

With column output, the column numbers follow the variable in the PUT statement. These numbers indicate where in the line to write the following value:

put name 6-15 age 17-19;

These lines are written to the SAS log.*

----+----1----+----2----+
     Peterson    21
     Morgan      17

* The ruled line is for illustrative purposes only; the PUT statement does not generate it.

The PUT statement writes values for NAME and AGE in the specified columns. See “PUT Statement, Column” on page 1673 for more information.

List Output

With list output, list the variables and character strings in the PUT statement in the order that you want to write them. For example, this PUT statement

put name age;

writes the values for NAME and AGE to the SAS log:

----+----1----+----2----+
Peterson 21
Morgan 17

See “PUT Statement, List” on page 1678 for more information.

Formatted Output

With formatted output, specify a SAS format or a user-written format after the variable name. The format gives instructions on how to write the variable value. Formats enable you to write in a non-standard form, such as packed decimal, or numbers that contain special characters such as commas. For example, this PUT statement

put name $char10. age 2. +1 date mmddyy10.;

writes the values for NAME, AGE, and DATE to the SAS log:

----+----1----+----2----+
Peterson  21 07/18/1999
Morgan    17 11/12/1999

Using a pointer control of +1 inserts a blank space between the values of AGE and DATE. See “PUT Statement, Formatted” on page 1675 for more information.

Named Output

With named output, list the variable name followed by an equal sign. For example, this PUT statement

put name= age=;

writes the values for NAME and AGE to the SAS log:

----+----1----+----2----+
name=Peterson age=21
name=Morgan age=17

See “PUT Statement, Named” on page 1683 for more information.
Using Multiple Output Styles in a Single PUT Statement

A PUT statement can combine any or all of the different output styles. For example,

put name 'on ' date mmddyy8. ' weighs ' startwght +(-1) '.' idno= 40-45;

See Example 1 on page 1666 for an explanation of the lines written to the SAS log. When you combine different output styles, it is important to understand the location of the output pointer after each value is written. For more information on the pointer location, see “Pointer Location After a Value Is Written” on page 1665.
Avoiding a Common Error When Writing Both a Character Constant and a Variable
When using a PUT statement to write a character constant that is followed by a variable name, always put a blank space between the closing quotation mark and the variable name:
   put 'Player:' name1 'Player:' name2 'Player:' name3;
Otherwise, SAS might interpret a character constant that is followed by a variable name as a special SAS constant as illustrated in this table.
Table 6.9 Characters That Cause Misinterpretation When They Follow a Character Constant
   Starting Letter of Variable   Represents             Examples
   b                             bit testing constant   '00100000'b
   d                             date constant          '01jan04'd
   dt                            datetime constant      '18jan2003:9:27:05am'dt
   n                             name literal           'My Table'n
   t                             time constant          '9:25:19pm't
   x                             hexadecimal notation   '534153'x
Example 7 on page 1671 shows how to use character constants followed by variables. For more information about SAS name literals and SAS constants in expressions, see SAS Language Reference: Concepts.
Pointer Controls
As SAS writes values with the PUT statement, it keeps track of its position with a pointer. The PUT statement provides three ways to control the movement of the pointer:
column pointer controls
   reset the pointer’s column position when the PUT statement starts to write the value to the output line.
line pointer controls
   reset the pointer’s line position when the PUT statement writes the value to the output line.
line-hold specifiers
   hold a line in the output buffer so that another PUT statement can write to it. By default, the PUT statement releases the previous line and writes to a new line.
With column and line pointer controls, you can specify an absolute line number or column number to move the pointer or you can specify a column or line location that is relative to the current pointer position. The following table lists all pointer controls that are available in the PUT statement.
Table 6.10 Pointer Controls Available in the PUT Statement
   Pointer Controls          Relative             Absolute
   column pointer controls   +n                   @n
                             +numeric-variable    @numeric-variable
                             +(expression)        @(expression)
   line pointer controls     /, _PAGE_,           #n
                             _BLANKPAGE_          #numeric-variable
                                                  #(expression)
                             OVERPRINT            none
   line-hold specifiers      @                    (not applicable)
                             @@                   (not applicable)
Note: Always specify pointer controls before the variable for which they apply.
See “Pointer Location After a Value Is Written” on page 1665 for more information about how SAS determines the pointer position.
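The controls in Table 6.10 can be sketched in one short DATA step. This is a minimal illustration, not an example from this dictionary; the variable names ITEM, UNITS, and COL and their values are assumptions chosen for the sketch.

```sas
/* Sketch only: ITEM, UNITS, and COL are hypothetical variables. */
data _null_;
   item  = 'vans';
   units = 12;
   col   = 20;                            /* column number held in a variable  */
   put @5 item                            /* absolute: move to column 5        */
       +3 units                           /* relative: skip 3 more columns     */
       / @col 'second line'               /* / advances to the next line       */
       / @(col - 10) 'third line';        /* expression gives the column       */
run;
```

Each pointer control appears before the item that it positions, as the note above requires.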
Using Line-Hold Specifiers
Line-hold specifiers keep the pointer on the current output line when
- more than one PUT statement writes to the same output line
- a PUT statement writes values from more than one observation to the same output line.
Without line-hold specifiers, each PUT statement in a DATA step writes a new output line. In the PUT statement, trailing @ and double trailing @@ produce the same effect. Unlike the INPUT statement, the PUT statement does not automatically release a line that is held by a trailing @ when the DATA step begins a new iteration. SAS releases the current output line that is held by a trailing @ or double trailing @ when it encounters
- a PUT statement without a trailing @
- a PUT statement that uses _BLANKPAGE_ or _PAGE_
- the end of the current line (determined by the current value of the LRECL= or LINESIZE= option in the FILE statement, if specified, or the LINESIZE= system option)
- the end of the last iteration of the DATA step.
Using a trailing @ or double trailing @ can cause SAS to attempt to write past the current line length because the pointer value is unchanged when the next PUT statement executes. See “When the Pointer Goes Past the End of a Line” on page 1665.
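As a hedged sketch of the rules above, the following step holds one output line across DATA step iterations. The variable name SCORE and the data values are assumptions made for illustration.

```sas
/* Sketch only: SCORE and its values are hypothetical. */
data _null_;
   input score @@;      /* INPUT: double trailing @ reads several values per data line */
   put score @;         /* PUT: trailing @ holds the output line across iterations     */
   datalines;
11 32 76
;
```

Because no PUT statement without a trailing @ executes, the held line should be released only at the end of the last iteration, so the values are expected to share one line in the SAS log.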
Pointer Location After a Value Is Written
Understanding the location of the output pointer after a value is written is important, especially if you combine output styles in a single PUT statement. The pointer location after a value is written depends on which output style you use and whether a character string or a variable is written.
With column or formatted output, the pointer is located in the first column after the end of the field that is specified in the PUT statement. These two styles write only variable values.
With list output or named output, the pointer is located in the second column after a variable value because PUT skips a column automatically after each value is written. However, when a PUT statement uses list output to write a character string, the pointer is located in the first column after the string. If you do not use a line pointer control or column output after a character string is written, add a blank space to the end of the character string to separate it from the next value.
After an _INFILE_ specification, the pointer is located in the first column after the record is written from the current input file.
When the output pointer is in the upper left corner of a page,
- PUT _BLANKPAGE_ writes a blank page and moves the pointer to the top of the next page.
- PUT _PAGE_ leaves the pointer in the same location.
You can determine the current location of the pointer by examining the variables that are specified with the COLUMN= option and the LINE= option in the FILE statement.
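One way to observe these pointer positions is to capture the COLUMN= variable after each PUT statement. The sketch below assumes that the COLUMN= variable reflects the pointer position as soon as each PUT statement finishes; the names COL, COL_AFTER_VARIABLE, and COL_AFTER_STRING are hypothetical.

```sas
/* Sketch only: all variable names here are assumptions. */
data _null_;
   file log column=col;            /* COL tracks the current column pointer   */
   name = 'Ann';
   put name @;                     /* list output of a variable value         */
   col_after_variable = col;
   put 'x' @;                      /* list output of a character string       */
   col_after_string = col;
   put;                            /* release the held line                   */
   put col_after_variable= col_after_string=;
run;
```

If the assumption holds, the first saved value should sit two columns past the end of the NAME value and the second only one column past the string, matching the description above.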
When the Pointer Goes Past the End of a Line
SAS does not write an output line that is longer than the current output line length. The line length of the current output file is determined by
- the value of the LINESIZE= option in the current FILE statement
- the value of the LINESIZE= system option (for the SAS output window)
- the LRECL= option in the current FILE statement (for external files).
You can inadvertently position the pointer beyond the current line length with one or more of these specifications:
- a + pointer control with a value that moves the pointer to a column beyond the current line length
- a column range that exceeds the current line length (for example, PUT X 90-100 when the current line length is 80)
- a variable value or character string that does not fit in the space that remains on the current output line.
By default, when PUT attempts to write past the end of the current line, SAS withholds the entire item that overflows the current line, writes the current line, and then writes the overflow item on a new line, starting in column 1. See the FLOWOVER, DROPOVER, and STOPOVER options in “FILE Statement” on page 1454.
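The default overflow behavior can be tried with a short sketch. The line length of 64 and the text of the string are assumptions chosen so that the string cannot fit in the columns that remain.

```sas
/* Sketch only: LINESIZE=64 and the string are hypothetical choices. */
data _null_;
   file log linesize=64;               /* current line length is 64 columns     */
   put 'start of line' @60 'overflowing text';  /* 16 characters do not fit in columns 60-64 */
run;
```

Under the default behavior described above, SAS should write the first string on the current line and then write the second string on a new line starting in column 1.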
Arrays
You can use the PUT statement to write an array element. The subscript is any SAS expression that results in an integer when the PUT statement executes. You can use an array reference in a numeric-variable construction with a pointer control if you enclose the reference in parentheses, as shown here:
- @(array-name{i})
- +(array-name{i})
- #(array-name{i})
Use the array subscript asterisk (*) to write all elements of a previously defined array to an external location. SAS allows one-dimensional or multidimensional arrays, but it does not allow a _TEMPORARY_ array. Enclose the subscript in braces, brackets, or parentheses, and print the array using list, formatted, column, or named output. With list output, the form of this statement is
   PUT array-name{*};
With formatted output, the form of this statement is
   PUT array-name{*}(format|format.list)
The format in parentheses follows the array reference.
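The array forms above might be used as in the following sketch. The array name SCORE, its elements, and the initial values are assumptions for illustration.

```sas
/* Sketch only: SCORE, I, and the initial values are hypothetical. */
data _null_;
   array score{3} score1-score3 (11 32 76);
   i = 2;
   put score{i};             /* one element; subscript evaluated when PUT executes */
   put score{*};             /* list output of every element                       */
   put score{*} (4.);        /* formatted output; format list follows the array    */
   put @(score{1}) 'x';      /* array reference in a pointer control, in parentheses */
run;
```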
Comparisons
- The PUT statement writes variable values and character strings to the SAS log or to an external location while the INPUT statement reads raw data in external files or data lines entered instream.
- Both the INPUT and the PUT statements use the trailing @ and double trailing @ line-hold specifiers to hold the current line in the input or output buffer, respectively. In an INPUT statement, a double trailing @ holds a line in the input buffer from one iteration of the DATA step to the next. In a PUT statement, however, a trailing @ has the same effect as a double trailing @; both hold a line across iterations of the DATA step.
- Both the PUT and OUTPUT statements create output in a DATA step. The PUT statement uses an output buffer and writes output lines to an external location, the SAS log, or your monitor. The OUTPUT statement uses the program data vector and writes observations to a SAS data set.
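The difference between PUT and OUTPUT can be seen side by side in a small sketch. The data set name HEAVY, the threshold of 200, and the data values are assumptions.

```sas
/* Sketch only: HEAVY and the cutoff value are hypothetical. */
data heavy;
   input name $ startwght;
   if startwght > 200 then output;      /* OUTPUT: observation goes to data set HEAVY */
   else put 'Not kept: ' name;          /* PUT: line goes to the SAS log              */
   datalines;
David 180
Amelia 145
Alan 210
;
```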
Examples
Example 1: Using Multiple Output Styles in One PUT Statement
This example uses several output styles in a single PUT statement:
   options yearcutoff= 1920;
   data club1;
      input idno name $ startwght date : date7.;
      put name 'on ' date mmddyy8. ' weighs '
          startwght +(-1) '.' idno= 32-40;
      datalines;
   1032 David 180 25nov99
   1049 Amelia 145 25nov99
   1219 Alan 210 12nov99
   ;
The following table shows the output style used for each variable in the example:
   Variables          Output Style
   NAME, STARTWGHT    list output
   DATE               formatted output
   IDNO               named output
The PUT statement also uses pointer controls and specifies both character strings and variable names. The program writes the following lines to the SAS log:*
   ----+----1----+----2----+----3----+----4
   David on 11/25/99 weighs 180.  idno=1032
   Amelia on 11/25/99 weighs 145. idno=1049
   Alan on 11/12/99 weighs 210.   idno=1219
Blank spaces are inserted at the beginning and the end of the character strings to change the pointer position. These spaces separate the value of a variable from the character string. The +(-1) pointer control moves the pointer backward to remove the unwanted blank that occurs between the value of STARTWGHT and the period. For more information on how to position the pointer, see “Pointer Location After a Value Is Written” on page 1665.
Example 2: Moving the Pointer within a Page
These PUT statements show how to use column and line pointer controls to position the output pointer.
- To move the pointer to a specific column, use @ followed by the column number, variable, or expression whose value is that column number. For example, this statement moves the pointer to column 15 and writes the value of TOTALSALES using list output:
     put @15 totalsales;
  This PUT statement moves the pointer to the value that is specified in COLUMN and writes the value of TOTALSALES with the COMMA6. format:
     data _null_;
        set carsales;
        column=15;
        put @column totalsales comma6.;
     run;
- This program shows two techniques to move the pointer backward:
     data carsales;
        input item $10. jan : comma5.
              feb : comma5. mar : comma5.;
        saleqtr1=sum(jan,feb,mar);
        /* an expression moves pointer backward */
        put '1st qtr sales for ' item
            'is ' saleqtr1 : comma6. +(-1) '.';
        /* a numeric variable with a negative value moves pointer backward */
        x=-1;
        put '1st qtr sales for ' item
            'is ' saleqtr1 : comma5. +x '.';
        datalines;
     trucks    1,382  2,789  3,556
     vans      1,265  2,543  3,987
     sedans    2,391  3,011  3,658
     ;
  Because the value of SALEQTR1 is written with modified list output, SAS inserts a blank after the value and leaves the pointer in the second column after it. For more information, see “How Modified List Output and Formatted Output Differ” on page 1680. To remove the unwanted blank that occurs between the value and the period, move the pointer backward by one space. The program writes the following lines to the SAS log:*
     ----+----1----+----2----+----3----+----4
     1st qtr sales for trucks is 7,727.
     1st qtr sales for trucks is 7,727.
     1st qtr sales for vans is 7,795.
     1st qtr sales for vans is 7,795.
     1st qtr sales for sedans is 9,060.
     1st qtr sales for sedans is 9,060.
- This program uses a PUT statement with the / line pointer control to advance to the next output line:
     data _null_;
        set carsales end=lastrec;
        totalsales+saleqtr1;
        if lastrec then
           put @2 'Total Sales for 1st Qtr'
               / totalsales 10-15;
     run;
  After the DATA step calculates TOTALSALES using all the observations in the CARSALES data set, the PUT statement executes. It writes a character string beginning in column 2 and moves to the next line to write the value of TOTALSALES in columns 10 through 15:
     ----+----1----+----2----+----3
      Total Sales for 1st Qtr
               24582
Example 3: Moving the Pointer to a New Page
This example creates a data set called STATEPOP, which contains information from the 1990 U.S. census about the population of metropolitan and non-metropolitan areas. It executes the FORMAT procedure to group the 50 states and the District of Columbia into four regions. It then uses the IF and PUT statements to control the printed output.
   options pagesize=24 linesize=64 nodate pageno=1;
   title1;
   data statepop;
      input state $ cityp90 ncityp90 region @@;
      label cityp90= '1990 metropolitan population (million)'
            ncityp90='1990 nonmetropolitan population (million)'
            region= 'Geographic region';
      datalines;
   ME    .443   .785  1   NH    .659   .450  1
   VT    .152   .411  1   MA   5.788   .229  1
   RI    .938   .065  1   CT   3.148   .140  1
   NY  16.515  1.475  1   NJ   7.730    .A   1
   PA  10.083  1.799  1   DE    .553   .113  2
   MD   4.439   .343  2   DC    .607    .    2
   VA   4.773  1.414  2   WV    .748  1.045  2
   NC   4.376  2.253  2   SC   2.423  1.064  2
   GA   4.352  2.127  2   FL  12.023   .915  2
   KY   1.780  1.906  2   TN   3.298  1.579  2
   AL   2.710  1.331  2   MS    .776  1.798  2
   AR   1.040  1.311  2   LA   3.160  1.060  2
   OK   1.870  1.276  2   TX  14.166  2.821  2
   OH   8.826  2.021  3   IN   3.962  1.582  3
   IL   9.574  1.857  3   MI   7.698  1.598  3
   WI   3.331  1.561  3   MN   3.011  1.364  3
   IA   1.200  1.577  3   MO   3.491  1.626  3
   ND    .257   .381  3   SD    .221   .475  3
   NE    .787   .791  3   KS   1.333  1.145  3
   MT    .191   .608  4   ID    .296   .711  4
   WY    .134   .319  4   CO   2.686   .608  4
   NM    .842   .673  4   AZ   3.106   .559  4
   UT   1.336   .387  4   NV   1.014   .183  4
   WA   4.036   .830  4   OR   1.985   .858  4
   CA  28.799   .961  4   AK    .226   .324  4
   HI    .836   .272  4
   ;
   proc format;
      value regfmt 1='Northeast'
                   2='South'
                   3='Midwest'
                   4='West';
   run;
   data _null_;
      set statepop;
      by region;
      pop90=sum(cityp90,ncityp90);
      file print;
      put state 1-2 @5 pop90 7.3 ' million';
      if first.region then
         regioncitypop=0;   /* new region */
      regioncitypop+cityp90;
      if last.region then
         do;
            put // '1990 US CENSUS for ' region regfmt.
                 / 'Total Urban Population: '
                 regioncitypop ' million' _page_;
         end;
   run;
Output 6.27 PUT Statement Output for the Northeast Region
                                                                  1
   ME    1.228 million
   NH    1.109 million
   VT    0.563 million
   MA    6.017 million
   RI    1.003 million
   CT    3.288 million
   NY   17.990 million
   NJ    7.730 million
   PA   11.882 million

   1990 US CENSUS for Northeast
   Total Urban Population: 45.456  million
PUT _PAGE_ advances the pointer to line 1 of the new page when the value of LAST.REGION is 1. The example prints a footer message before exiting the page.
Example 4: Underlining Text
This example uses OVERPRINT to underscore a value written by a previous PUT statement:
   data _null_;
      input idno name $ startwght;
      file file-specification print;
      put name 1-10 @15 startwght 3.;
      if startwght > 200 then
         put overprint @15 '___';
      datalines;
   032 David 180
   049 Amelia 145
   219 Alan 210
   ;
The second PUT statement underlines weights above 200 on the output line that the first PUT statement prints. This PUT statement uses OVERPRINT with both a column pointer control and a line pointer control:
   put @5 name $8. overprint @5 8*'_'
       / @20 address;
The PUT statement writes a NAME value, underlines it by overprinting eight underscores, and moves the output pointer to the next line to write an ADDRESS value.
Example 5: Holding and Releasing Output Lines
This DATA step demonstrates how to hold and release an output line with a PUT statement:
   data _null_;
      input idno name $ startwght 3.;
      put name @;
      if startwght ne . then
         put @15 startwght;
      else put;
      datalines;
   032 David 180
   049 Amelia 145
   126 Monica
   219 Alan 210
   ;
In this example,
- the trailing @ in the first PUT statement holds the current output line after the value of NAME is written
- if the condition is met in the IF-THEN statement, the second PUT statement writes the value of STARTWGHT and releases the current output line
- if the condition is not met, the second PUT never executes. Instead, the ELSE PUT statement executes. The ELSE PUT statement releases the output line and positions the output pointer at column 1 in the output buffer.
The program writes the following lines to the SAS log:*
   ----+----1----+----2
   David         180
   Amelia        145
   Monica
   Alan          210
Example 6: Writing the Current Input Record to the Log
When a value for ID is less than or equal to 1000, PUT _INFILE_ executes and writes the current input record to the SAS log. The DELETE statement prevents the DATA step from writing the observation to the TEAM data set.
   data team;
      input id team $ score1 score2;
      if id le 1000 then
         do;
            put _infile_;
            delete;
         end;
      datalines;
   1032 red 180 165
   1049 yellow 145 124
   219 red 210 192
   ;
The program writes the following line to the SAS log:*
   ----+----1----+----2
   219 red 210 192
Example 7: Avoiding a Common Error When Writing a Character Constant Followed by a Variable
This example illustrates how to use a PUT statement to write character constants and variable values without causing them to be misinterpreted as SAS name literals. A SAS name literal is a name token that is expressed as a string within quotation marks, followed by the letter n. For more information about SAS name literals, see SAS Language Reference: Concepts.
In the program below, the PUT statement writes the constant 'n' followed by the value of the variable NVAR1, and then writes another constant 'n':
   data _null_;
      n=5;
      nvar1=1;
      var1=7;
      put @1 'n' nvar1 'n';
   run;
This program writes the following line to the SAS log:*
   ----+----1----+----2
   n1 n
If all the spaces between the constants and the variables are removed from the previous PUT statement, SAS interprets 'n' as a name literal instead of reading 'n' as a constant. The next variable is read as VAR1 instead of NVAR1. The final 'n' constant is interpreted correctly.
   put @1 'n'nvar1'n';
This PUT statement writes the following line to the SAS log:*
   ----+----1----+----2
   5 7 n
To print character constants and variable values without intervening spaces, and without potential misinterpretation, you can add spaces between them and use pointer controls where necessary. For example, the following PUT statement uses a pointer control to write the correct character constants and variable values but does not insert blank spaces. Note that +(-1) moves the PUT statement pointer backward by one space.
   put @1 'n' nvar1 +(-1) 'n';
This PUT statement writes the following line to the SAS log:*
   ----+----1----+----2
   n1n
See Also
Statements:
   “FILE Statement” on page 1454
   “PUT Statement, Column” on page 1673
   “PUT Statement, Formatted” on page 1675
   “PUT Statement, List” on page 1678
   “PUT Statement, Named” on page 1683
   PUT Statement for ODS
System Options:
   “LINESIZE= System Option” on page 1878
   “PAGESIZE= System Option” on page 1900
PUT Statement, Column
Writes variable values in the specified columns in the output line.
Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
PUT variable start-column <-end-column> <.decimal-places> <@ | @@>;
Arguments
variable
   specifies the variable whose value is written.
start-column
   specifies the first column of the field where the value is written in the output line.
-end-column
   specifies the last column of the field for the value.
   Tip: If the value occupies only one column in the output line, omit end-column.
   Example: Because end-column is omitted, the values for the character variable GENDER occupy only column 16:
      put name 1-10 gender 16;
.decimal-places
   specifies the number of digits to the right of the decimal point in a numeric value.
   Range: positive integer
   Tip: If you specify 0 for decimal-places or omit decimal-places, the value is written without a decimal point.
   Featured in: “Examples” on page 1674
@ | @@
   holds an output line for the execution of the next PUT statement even across iterations of the DATA step. These line-hold specifiers are called trailing @ and double trailing @.
   Requirement: The trailing @ or double trailing @ must be the last item in the PUT statement.
   See: “Using Line-Hold Specifiers” on page 1664
Details
With column output, the column numbers indicate the position that each variable value will occupy in the output line. If a value requires fewer columns than specified, a character variable is left-aligned in the specified columns, and a numeric variable is right-aligned in the specified columns.
There is no limit to the number of column specifications you can make in a single PUT statement. You can write anywhere in the output line, even if a value overwrites columns that were written earlier in the same statement. You can combine column output with any of the other output styles in a single PUT statement. For more information, see “Using Multiple Output Styles in a Single PUT Statement” on page 1663.
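The alignment and overwriting rules above can be illustrated with a short sketch. The variables NAME and AGE and the column ranges are assumptions chosen for the illustration.

```sas
/* Sketch only: NAME, AGE, and the column ranges are hypothetical. */
data _null_;
   name = 'Peterson';
   age  = 21;
   put name 1-10 age 12-15;    /* NAME left-aligned, AGE right-aligned in its field */
   put name 1-10 age 5-6;      /* AGE overwrites columns 5-6 of the NAME field      */
run;
```

In the second PUT statement, the later column specification is expected to replace the two characters of NAME that were already written in columns 5 and 6.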
Examples
Use column output in the PUT statement as shown here.
- This PUT statement uses column output:
     data _null_;
        input name $ 1-18 score1 score2 score3;
        put name 1-20 score1 23-25 score2 28-30
            score3 33-35;
        datalines;
     Joseph                11   32   76
     Mitchel               13   29   82
     Sue Ellen             14   27   74
     ;
  The program writes the following lines to the SAS log:*
     ----+----1----+----2----+----3----+----4
     Joseph                 11   32   76
     Mitchel                13   29   82
     Sue Ellen              14   27   74
  The values for the character variable NAME begin in column 1, the left boundary of the specified field (columns 1 through 20). The values for the numeric variables SCORE1 through SCORE3 appear flush with the right boundary of their field.
- This statement produces the same output lines, but writes the SCORE1 value first and the NAME value last:
     put score1 23-25 score2 28-30
         score3 33-35 name $ 1-20;
- This DATA step specifies decimal points with column output:
     data _null_;
        x=11;
        y=15;
        put x 10-18 .1 y 20-28 .1;
     run;
  This program writes the following line to the SAS log:*
     ----+----1----+----2----+----3----+----4
                   11.0      15.0
See Also
Statement:
   “PUT Statement” on page 1656
PUT Statement, Formatted
Writes variable values with the specified format in the output line.
Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
PUT <pointer-control> variable format. <@ | @@>;
PUT <pointer-control> (variable-list) (format-list) <@ | @@>;
Arguments
pointer-control
   moves the output pointer to a specified line or column.
   See: “Column Pointer Controls” on page 1659 and “Line Pointer Controls” on page 1660
   Featured in: Example 1 on page 1677
variable
   specifies the variable whose value is written.
(variable-list)
   specifies a list of variables whose values are written.
   Requirement: The (format-list) must follow the (variable-list).
   See: “How to Group Variables and Formats” on page 1676
   Featured in: Example 1 on page 1677
format.
   specifies a format to use when the variable values are written. To override the default alignment, you can add an alignment specification to a format:
   -L   left aligns the value.
   -C   centers the value.
   -R   right aligns the value.
   Tip: Ensure that the format width provides enough space to write the value and any commas, dollar signs, decimal points, or other special characters that the format includes.
   Example: This PUT statement uses the format dollar7.2 to write the value of X:
      put x dollar7.2;
   When X is 100, the formatted value uses seven columns:
      $100.00
   Featured in: Example 2 on page 1677
(format-list)
   specifies a list of formats to use when the values of the preceding list of variables are written. In a PUT statement, a format-list can include
   format.
      specifies the format to use to write the variable values.
      Tip: You can specify either a SAS format or a user-written format. See Chapter 3, “Formats,” on page 81.
   pointer-control
      specifies one of these pointer controls to use to position a value: @, #, /, +, and OVERPRINT.
      Example: Example 1 on page 1677
   character-string
      specifies one or more characters to place between formatted values.
      Example: This statement places a hyphen between the formatted values of CODE1, CODE2, and CODE3:
         put bldg $ (code1 code2 code3) (3. '-');
      See: Example 1 on page 1677
   n*
      specifies to repeat n times the next format in a format list.
      Example: This statement uses the 7.2 format to write GRADES1, GRADES2, and GRADES3 and the 5.2 format to write GRADES4 and GRADES5:
         put (grades1-grades5) (3*7.2, 2*5.2);
   Restriction: The (format-list) must follow (variable-list).
   See Also: “How to Group Variables and Formats” on page 1676
@ | @@
   holds an output line for the execution of the next PUT statement even across iterations of the DATA step. These line-hold specifiers are called trailing @ and double trailing @.
   Restriction: The trailing @ or double trailing @ must be the last item in the PUT statement.
   See: “Using Line-Hold Specifiers” on page 1664
Details
Using Formatted Output
Formatted output describes the output lines by listing the variable names and the formats to use to write the values. You can use a SAS format or a user-written format to control how SAS prints the variable values. For a complete description of the SAS formats, see “Definition of Formats” on page 84.
With formatted output, the PUT statement uses the format that follows the variable name to write each value. SAS does not automatically add blanks between values. If the value uses fewer columns than specified, character values are left-aligned and numeric values are right-aligned in the field that is specified by the format width.
Formatted output, combined with pointer controls, makes it possible to specify the exact line and column location to write each variable. For example, this PUT statement uses the dollar7.2 format and centers the value of X starting at column 12:
   put @12 x dollar7.2-c;
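The three alignment specifications can be compared with a brief sketch. The value of X, the format width of 10, and the closing '|' marker are assumptions that make the field boundaries easier to see.

```sas
/* Sketch only: X, the width of 10, and the '|' marker are hypothetical. */
data _null_;
   x = 100;
   put @12 x dollar10.2-l '|';    /* left-aligned in a 10-column field  */
   put @12 x dollar10.2-c '|';    /* centered in the field              */
   put @12 x dollar10.2-r '|';    /* right-aligned (numeric default)    */
run;
```

Because formatted output leaves the pointer in the first column after the field, the marker should appear in the same column on each line.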
How to Group Variables and Formats
When you want to write values in a pattern on the output lines, use format lists to shorten your coding time. A format list consists of the corresponding formats separated by either blanks or commas and enclosed in parentheses. It must follow the names of the variables enclosed in parentheses. For example, this statement uses a format list to write the five variables SCORE1 through SCORE5, one after another, using four columns for each value with no blanks in between:
   put (score1-score5) (4. 4. 4. 4. 4.);
A shorter version of the previous statement is
   put (score1-score5) (4.);
You can include any of the pointer controls (@, #, /, +, and OVERPRINT) in the list of formats, as well as n*, and a character string. You can use as many format lists as necessary in a PUT statement, but do not nest the format lists. After all the values in the variable list are written, the PUT statement ignores any directions that remain in the format list. For an example, see Example 3 on page 1678.
You can also specify a reference to all elements in an array as (array-name {*}), followed by a list of formats. You cannot, however, specify the elements in a _TEMPORARY_ array in this way. This PUT statement specifies an array name and a format list:
   put (array1{*}) (4.);
For more information on how to reference an array, see “Arrays” on page 1665.
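A format list that combines n*, a character string, and a recycled format might look like the following sketch. The array GRADES and its values are assumptions made for illustration.

```sas
/* Sketch only: GRADES and its initial values are hypothetical. */
data _null_;
   array grades{5} grades1-grades5 (90.5 85.25 77 6.5 8.75);
   put (grades1-grades5) (3*7.2, 2*5.2);   /* n* repeats the next format             */
   put (grades1-grades3) (5.1, ' | ');     /* the list is reused for each variable;  */
                                           /* leftover items are ignored at the end  */
run;
```

In the second PUT statement, the format list is shorter than the variable list, so it is expected to be applied again from the start until all three values are written.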
Examples
Example 1: Writing a Character between Formatted Values
This example formats some values and writes a - (hyphen) between the values of variables BLDG and ROOM:
   data _null_;
      input name & $15. bldg $ room;
      put name @20 (bldg room) ($1. "-" 3.);
      datalines;
   Bill Perkins  J 126
   Sydney Riley  C 219
   ;
These lines are written to the SAS log:
   Bill Perkins       J-126
   Sydney Riley       C-219
Example 2: Overriding the Default Alignment of Formatted Values
This example includes an alignment specification in the format:
   data _null_;
      input name $ 1-12 score1 score2 score3;
      put name $12.-r +3 score1 3. score2 3.
          score3 4.;
      datalines;
   Joseph       11 32 76
   Mitchel      13 29 82
   Sue Ellen    14 27 74
   ;
These lines are written to the log:
   ----+----1----+----2----+----3----+----4
         Joseph    11 32  76
        Mitchel    13 29  82
      Sue Ellen    14 27  74
The value of the character variable NAME is right-aligned in the formatted field. (Left alignment is the default for character variables.)
Example 3: Including More Format Specifications Than Necessary
This format list includes more specifications than are necessary when the PUT statement executes:
   data _null_;
      input x y z;
      put (x y z) (2.,+1);
      datalines;
   2 24 36
   0 20 30
   ;
The PUT statement writes the value of X using the 2. format. Then, the +1 column pointer control moves the pointer forward one column. Next, the value of Y is written with the 2. format. Again, the +1 column pointer moves the pointer forward one column. Then, the value of Z is written with the 2. format. For the third iteration, the PUT statement ignores the +1 pointer control. These lines are written to the SAS log:*
   ----+----1----+
    2 24 36
    0 20 30
See Also Statement: “PUT Statement” on page 1656
PUT Statement, List
Writes variable values and the specified character strings in the output line.
Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
PUT <pointer-control> variable <@ | @@>;
PUT <pointer-control> <n*>'character-string' <@ | @@>;
PUT <pointer-control> variable <: | ~> format. <@ | @@>;
Arguments pointer-control
moves the output pointer to a specified line or column. See: “Column Pointer Controls” on page 1659 and “Line Pointer Controls” on page 1660 Featured in: Example 2 on page 1682 variable
specifies the variable whose value is written. Featured in: Example 1 on page 1681 n*
specifies to repeat n times the subsequent character string. Example: This statement writes a line of 132 underscores: put 132*’_’;
’character-string’
specifies a string of text, enclosed in quotation marks, to write. Interaction: When insufficient space remains on the current line to write the entire text string, SAS withholds the entire string and writes the current line. Then it writes the text string on a new line, starting in column 1. For more information, see “When the Pointer Goes Past the End of a Line” on page 1665. Tip: To avoid misinterpretation, always put a space after a closing quotation mark in a PUT statement. Tip: If you follow a quotation mark with X, SAS interprets the text string as a hexadecimal constant. Tip: If you use single quotation (‘) or double quotes (“) together (with no space in between them) as the string of text, SAS will output a single quotation mark ( ’)or double quotation mark (“), respectively. See Also: “How List Output Is Spaced” on page 1680 Featured in: Example 2 on page 1682 :
enables you to specify a format that the PUT statement uses to write the variable value. All leading and trailing blanks are deleted, and each value is followed by a single blank. Requirement: You must specify a format. See: “How Modified List Output and Formatted Output Differ” on page 1680 Featured in: Example 3 on page 1682 ~
enables you to specify a format that the PUT statement uses to write the variable value. SAS displays the formatted value in quotation marks even if the formatted value does not contain the delimiter. SAS deletes all leading and trailing blanks, and each value is followed by a single blank. Missing values for character variables are written as a blank (" ") and, by default, missing values for numeric variables are written as a period (".").
1680
PUT Statement, List
4
Chapter 6
Requirement: Featured in:
You must specify the DSD option in the FILE statement. Example 4 on page 1682
format.
specifies a format to use when the data values are written. You can specify either a SAS format or a user-written format. See Chapter 3, “Formats,” on page 81.
Tip:
Featured in:
Example 3 on page 1682
@ | @@
holds an output line for the execution of the next PUT statement even across iterations of the DATA step. These line-hold specifiers are called trailing @ and double trailing @.
Restriction: The trailing @ or double trailing @ must be the last item in the PUT statement.
See: “Using Line-Hold Specifiers” on page 1664
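As a minimal sketch of the trailing @ with list output, the first PUT statement below holds the output line so that a later PUT statement in the same iteration can finish it. The data values and the 90-point threshold are invented for illustration.

```sas
data _null_;
   input name $ score;
   /* The trailing @ holds the current output line. */
   put name @;
   /* The next PUT continues on the held line and then releases it. */
   if score >= 90 then put 'honors';
   else put 'standard';
   datalines;
Ann 95
Bob 72
;
```

Each observation should produce one log line (for example, the name followed by honors or standard), because list output leaves the pointer one column past the value of NAME before the character string is written.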
Details
Using List Output
With list output, you list the names of the variables whose values you want written, or you specify a character string in quotation marks. The PUT statement writes a variable value, inserts a single blank, and then writes the next value. Missing values for numeric variables are written as a single period. Character values are left-aligned in the field; leading and trailing blanks are removed. To include blanks (in addition to the blank inserted after each value), use formatted or column output instead of list output. There are two types of list output:
• simple list output
• modified list output
Modified list output increases the versatility of the PUT statement because you can specify a format to control how the variable values are written. See Example 3 on page 1682.
How List Output Is Spaced
List output uses different spacing methods when it writes variable values and character strings. When a variable is written with list output, SAS automatically inserts a blank space. The output pointer stops at the second column that follows the variable value. However, when a character string is written, SAS does not automatically insert a blank space. The output pointer stops at the column that immediately follows the last character in the string. To avoid spacing problems when both character strings and variable values are written, you might want to use a blank space as the last character in a character string. When a character string that provides punctuation follows a variable value, you need to move the output pointer backward. Moving the output pointer backward prevents an unwanted space from appearing in the output line. See Example 2 on page 1682.
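The spacing rules above can be sketched as follows. The variable WEIGHT and the text strings are illustrative only; the three PUT statements differ only in how they handle the blank around the variable value.

```sas
data _null_;
   weight = 180;
   /* No blank at the end of the string: the value starts in the next column. */
   put 'Weight:' weight 'lb';
   /* A trailing blank in the string separates the text from the value. */
   put 'Weight: ' weight 'lb';
   /* +(-1) backs up over the blank that list output writes after the value. */
   put 'Weight: ' weight +(-1) '.';
run;
```

Following the rules described in this section, the first line should run the text and the value together, the second should separate them, and the third should place the period directly after the value.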
Comparisons
How Modified List Output and Formatted Output Differ
List output and formatted output use different methods to determine how far to move the pointer after a variable value is written. Therefore, modified list output, which uses formats, and formatted output produce different results in the output lines. Modified list output writes the value, inserts a blank space, and moves the pointer to the next column. Formatted output moves the pointer the length of the format, even if the value does not fill that length. The pointer moves to the next column; an intervening blank is not inserted. The following DATA step uses modified list output to write each output line:

data _null_;
   input x y;
   put x : comma10.2 y : 7.2;
   datalines;
2353.20 7.10
6231 121
;

These lines are written to the SAS log:

----+----1----+----2
2,353.20 7.10
6,231.00 121.00

In comparison, the following example uses formatted output:

put x comma10.2 y 7.2;

These lines are written to the SAS log, with the values aligned in columns:

----+----1----+----2
  2,353.20   7.10
  6,231.00 121.00
Examples
Example 1: Writing Values with List Output
This DATA step uses a PUT statement with list output to write variable values to the SAS log:

data _null_;
   input name $ 1-10 sex $ 12 age 15-16;
   put name sex age;
   datalines;
Joseph     M  13
Mitchel    M  14
Sue Ellen  F  11
;

These lines are written to the SAS log:

----+----1----+----2----+----3----+----4
Joseph M 13
Mitchel M 14
Sue Ellen F 11

By default, the values of the character variable NAME are left-aligned in the field.
Example 2: Writing Character Strings and Variable Values
This PUT statement adds a space to the end of a character string and moves the output pointer backward to prevent an unwanted space from appearing in the output line after the variable STARTWGHT:

data _null_;
   input idno name $ startwght;
   put name 'weighs ' startwght +(-1) '.';
   datalines;
032 David 180
049 Amelia 145
219 Alan 210
;

These lines are written to the SAS log:

David weighs 180.
Amelia weighs 145.
Alan weighs 210.

The blank space at the end of the character string changes the pointer position. This space separates the character string from the value of the variable that follows. The +(-1) pointer control moves the pointer backward to remove the unwanted blank that occurs between the value of STARTWGHT and the period.
Example 3: Writing Values with Modified List Output (:)
This DATA step uses modified list output to write several variable values in the output line using the : argument:

data _null_;
   input salesrep : $10. tot : comma6. date : date9.;
   put 'Week of ' date : worddate15.
       salesrep : $12. 'sales were '
       tot : dollar9. +(-1) '.';
   datalines;
Wong 15,300 12OCT2004
Hoffman 9,600 12OCT2004
;

These lines are written to the SAS log:

Week of Oct 12, 2004 Wong sales were $15,300.
Week of Oct 12, 2004 Hoffman sales were $9,600.
Example 4: Writing Values with Modified List Output and ~
This DATA step uses modified list output to write several variable values in the output line using the ~ argument:

data _null_;
   input salesrep : $10. tot : comma6. date : date9.;
   file log delimiter=" " dsd;
   put 'Week of ' date ~ worddate15.
       salesrep ~ $12. 'sales were '
       tot ~ dollar9. +(-1) '.';
   datalines;
Wong 15,300 12OCT2004
Hoffman 9,600 12OCT2004
;

These lines are written to the SAS log:

Week of "Oct 12, 2004" "Wong" sales were "$15,300".
Week of "Oct 12, 2004" "Hoffman" sales were "$9,600".
See Also
Statements:
“PUT Statement” on page 1656
“PUT Statement, Formatted” on page 1675

PUT Statement, Named
Writes variable values after the variable name and an equal sign.
Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
PUT <pointer-control> variable= <format.> <@ | @@>;
PUT variable= start-column <— end-column> <.decimal-places> <@ | @@>;
Arguments
pointer-control
moves the output pointer to a specified line or column in the output buffer.
See: “Column Pointer Controls” on page 1659 and “Line Pointer Controls” on page 1660
variable=
specifies the variable whose value is written by the PUT statement in the form
variable=value
format.
specifies a format to use when the variable values are written.
Tip: Ensure that the format width provides enough space to write the value and any commas, dollar signs, decimal points, or other special characters that the format includes.
Example: This PUT statement uses the format DOLLAR7.2 to write the value of X:
put x= dollar7.2;
When X=100, the formatted value uses seven columns:
X=$100.00
See: “Formatting Named Output” on page 1684
start-column
specifies the first column of the field where the variable name, equal sign, and value are to be written in the output line.
— end-column
determines the last column of the field for the value.
Tip: If the variable name, equal sign, and value require more space than the columns specified, PUT writes past the end column rather than truncating the value. You must leave enough space before beginning the next value.
.decimal-places
specifies the number of digits to the right of the decimal point in a numeric value. If you specify 0 for decimal-places or omit it, the value is written without a decimal point.
Range: positive integer
@ | @@
holds an output line for the execution of the next PUT statement even across iterations of the DATA step. These line-hold specifiers are called trailing @ and double trailing @.
Restriction: The trailing @ or double trailing @ must be the last item in the PUT statement.
See: “Using Line-Hold Specifiers” on page 1664
Details
Using Named Output
With named output, follow the variable name with an equal sign in the PUT statement. You can use list output, column output, or formatted output specifications to indicate how to position the variable names and values. To insert a blank space between each variable value automatically, use list output. To align the output in columns, use pointer controls or column specifications.
Formatting Named Output
You can specify either a SAS format or a user-written format to control how SAS prints the variable values. The width of the format does not include the columns required by the variable name and equal sign. To align a formatted value, SAS deletes leading blanks and writes the variable value immediately after the equal sign. SAS does not align on the right side of the formatted length, as in unnamed formatted output. For a complete description of the SAS formats, see “Definition of Formats” on page 84.
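A small sketch can make the alignment difference concrete. The variable AMOUNT and its value are illustrative; the same format is applied once with named output and once with unnamed formatted output.

```sas
data _null_;
   amount = 153.25;
   /* Named output: leading blanks are deleted, so the value */
   /* follows the equal sign immediately.                    */
   put amount= dollar12.2;
   /* Unnamed formatted output: the value is right-aligned   */
   /* within the 12 columns of the format width.             */
   put amount dollar12.2;
run;
```

Comparing the two log lines shows why the format width in named output does not need to account for the variable name and equal sign.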
Examples
Use named output in the PUT statement as shown here.
• This PUT statement combines named output with column pointer controls to align the output:

data _null_;
   input name $ 1-18 score1 score2 score3;
   put name= @20 score1= score3=;
   datalines;
Joseph             11 32 76
Mitchel            13 29 82
Sue Ellen          14 27 74
;

The program writes the following lines to the SAS log:

----+----1----+----2----+----3----+----4
NAME=Joseph        SCORE1=11 SCORE3=76
NAME=Mitchel       SCORE1=13 SCORE3=82
NAME=Sue Ellen     SCORE1=14 SCORE3=74

• This example specifies an output format for the variable AMOUNT:

put item= @25 amount= dollar12.2;

When the value of ITEM is binders and the value of AMOUNT is 153.25, this output line is produced:

----+----1----+----2----+----3----+----4
ITEM=binders            AMOUNT=$153.25
See Also
Statement:
“PUT Statement” on page 1656

PUTLOG Statement
Writes a message to the SAS log.
Valid: in a DATA step
Category: Action
Type: Executable
Syntax
PUTLOG 'message';
Arguments
message
specifies the message that you want to write to the SAS log. Message can include character literals (enclosed in quotation marks), variable names, formats, and pointer controls.
Tip: You can precede your message text with WARNING, MESSAGE, or NOTE to better identify the output in the log.
Details
The PUTLOG statement writes a message that you specify to the SAS log. The PUTLOG statement is also helpful when you use macro-generated code because you can send output to the SAS log without affecting the current file destination.
Comparisons
The PUTLOG statement is similar to the ERROR statement except that PUTLOG does not set _ERROR_ to 1.
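The following sketch contrasts the two statements, assuming invented score values and message text. PUT writes to the current FILE destination (here the procedure output), while PUTLOG and ERROR both write to the log.

```sas
data _null_;
   input score;
   file print;
   /* PUTLOG writes to the log and leaves _ERROR_ unchanged. */
   if score < 0 then putlog 'NOTE: negative score ' score= _error_=;
   /* ERROR also writes to the log, but it sets _ERROR_ to 1. */
   if score > 100 then error 'Score out of range ' score=;
   /* PUT still goes to the destination named in the FILE statement. */
   put score=;
   datalines;
-5
150
42
;
```

Because ERROR sets _ERROR_ to 1, the iteration that reads 150 should also cause SAS to write the contents of the program data vector to the log; the iteration that triggers only PUTLOG should not.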
Examples
Example 1: Writing Messages to the SAS Log Using the PUTLOG Statement
The following program creates the computeAverage92 macro, which computes the average score, validates input data, and uses the PUTLOG statement to write error messages to the SAS log. The DATA step uses the PUTLOG statement to write a warning message to the log.

data ExamScores;
   input Name $ 1-16 Score1 Score2 Score3;
   datalines;
Sullivan, James  86 92 88
Martinez, Maria  95 91 92
Guzik, Eugene    99 98 .
Schultz, John    90 87 93
van Dyke, Sylvia 98 . 91
Tan, Carol       93 85 85
;

options pageno=1 nodate linesize=80 pagesize=60;
filename outfile 'your-output-file';

/* Create a macro that computes the average score, validates  */
/* input data, and uses PUTLOG to write error messages to the */
/* SAS log.                                                   */
%macro computeAverage92(s1, s2, s3, avg);
   if &s1 < 0 or &s2 < 0 or &s3 < 0 then
      do;
         putlog 'ERROR: Invalid score data ' &s1= &s2= &s3=;
         &avg = .;
      end;
   else
      &avg = mean(&s1, &s2, &s3);
%mend;

data _null_;
   set ExamScores;
   file outfile;
   %computeAverage92(Score1, Score2, Score3, AverageScore);
   put name Score1 Score2 Score3 AverageScore;
   /* Use PUTLOG to write a warning message to the SAS log. */
   if AverageScore < 92 then
      putlog 'WARNING: Score below the minimum ' name= AverageScore= 5.2;
run;

proc print;
run;
The following lines are written to the SAS log.
Output 6.28 SAS Log Results from the PUTLOG Statement

WARNING: Score below the minimum Name=Sullivan, James AverageScore=88.67
ERROR: Invalid score data Score1=99 Score2=98 Score3=.
WARNING: Score below the minimum Name=Guzik, Eugene AverageScore=.
WARNING: Score below the minimum Name=Schultz, John AverageScore=90.00
ERROR: Invalid score data Score1=98 Score2=. Score3=91
WARNING: Score below the minimum Name=van Dyke, Sylvia AverageScore=.
WARNING: Score below the minimum Name=Tan, Carol AverageScore=87.67
SAS creates the following output file.
Output 6.29 Individual Examination Scores

Exam Scores                                              1

Obs    Name                Score1    Score2    Score3
  1    Sullivan, James         86        92        88
  2    Martinez, Maria         95        91        92
  3    Guzik, Eugene           99        98         .
  4    Schultz, John           90        87        93
  5    van Dyke, Sylvia        98         .        91
  6    Tan, Carol              93        85        85
See Also
Statement:
“ERROR Statement” on page 1452

REDIRECT Statement
Points to different input or output SAS data sets when you execute a stored program.
Valid: in a DATA step
Category: Action
Type: Executable
Requirement: You must specify the PGM= option in the DATA statement.
Syntax
REDIRECT INPUT | OUTPUT old-name-1 = new-name-1;
Arguments
INPUT | OUTPUT
specifies whether to redirect input or output data sets. When you specify INPUT, the REDIRECT statement associates the name of the input data set in the source program with the name of another SAS data set. When you specify OUTPUT, the REDIRECT statement associates the name of the output data set with the name of another SAS data set.
old-name
specifies the name of the input or output data set in the source program.
new-name
specifies the name of the input or output data set that you want SAS to process for the current execution.
Details
The REDIRECT statement is available only when you execute a stored program. For more information about stored programs, see “Stored Compiled DATA Step Programs” in SAS Language Reference: Concepts.
CAUTION: Use care when you redirect input data sets. The number and attributes of variables in the input data sets that you read with the REDIRECT statement should match the number and attributes of variables in the input data sets in the MERGE, SET, MODIFY, or UPDATE statements of the source code. If the variable type attributes differ, the stored program stops processing and an appropriate error message is written to the SAS log. If the variable length attributes differ, the length of the variable in the source code data set determines the length of the variable in the redirected data set. Extra variables in the redirected data sets cause the stored program to stop processing and an error message is written to the SAS log.
Tip: The DROP or KEEP data set options can be added in the stored program if the input data set that you read with the REDIRECT statement has more variables than are in the data set used to compile the program.
Comparison
The REDIRECT statement applies only to SAS data sets. To redirect input and output stored in external files, include a FILENAME statement to associate the fileref in the source program with different external files.
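As a rough sketch of the external-file case, suppose a stored program named STORED.RAWLOAD was compiled from source code whose INFILE statement refers to the fileref RAWIN. Both names are hypothetical and do not appear elsewhere in this section. Reassigning the fileref before execution points the stored program at a different external file without a REDIRECT statement.

```sas
/* Hypothetical names: STORED.RAWLOAD is assumed to have been  */
/* compiled with an INFILE statement that uses fileref RAWIN.  */
libname stored 'SAS-library';

/* Associate the fileref with a different external file. */
filename rawin 'your-new-input-file';

/* Execute the stored program; it reads through fileref RAWIN. */
data pgm=stored.rawload;
run;
```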
Examples
This example executes the stored program called STORED.SAMPLE. The REDIRECT statement specifies the source of the input data as BASE.SAMPLE. The output data set from this execution of the program is redirected and stored in a data set named SUMS.SAMPLE.

libname stored 'SAS-library';
libname base 'SAS-library';
libname sums 'SAS-library';

data pgm=stored.sample;
   redirect input in.sample=base.sample;
   redirect output out.sample=sums.sample;
run;
See Also
Statement:
“DATA Statement” on page 1416
“Stored Compiled DATA Step Programs” in SAS Language Reference: Concepts.

REMOVE Statement
Deletes an observation from a SAS data set.
Valid: in a DATA step
Category: Action
Type: Executable
Restriction: Use only with a MODIFY statement.
Syntax
REMOVE <data-set-name(s)>;
Without Arguments
If you specify no argument, the REMOVE statement deletes the current observation from all data sets that are named in the DATA statement.
Arguments
data-set-name
specifies the data set in which the observation is deleted.
Restriction: The data set name must also appear in the DATA statement and in one or more MODIFY statements.
Details
The deletion of an observation can be physical or logical, depending on the engine that maintains the data set. Using REMOVE overrides the default replacement of observations. If a DATA step contains a REMOVE statement, you must explicitly program all output for the step.
Comparisons
• Using an OUTPUT, REPLACE, or REMOVE statement overrides the default write action at the end of a DATA step. (OUTPUT is the default action; REPLACE becomes the default action when a MODIFY statement is used.) If you use any of these statements in a DATA step, you must explicitly program all output for new observations.
• The OUTPUT, REPLACE, and REMOVE statements are independent of each other. More than one statement can apply to the same observation, as long as the sequence is logical.
• If both an OUTPUT and a REPLACE or REMOVE statement execute on a given observation, perform the OUTPUT action last to keep the position of the observation pointer correct.
• Because the REMOVE statement can perform a physical or a logical deletion, REMOVE is available with the MODIFY statement for all SAS data set engines. Both the DELETE and subsetting IF statements perform only physical deletions; therefore, they are not available with the MODIFY statement for certain engines.
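The first point above can be sketched as follows. The data set WORK.ACCOUNTS, its values, and the 100-unit adjustment are invented for illustration; the point is that once REMOVE appears in the step, the updates to the remaining observations need an explicit REPLACE.

```sas
data work.accounts;
   input AcctNumber Credit;
   datalines;
1001 1500
1002 0
1003 3000
;

data work.accounts;
   modify work.accounts;
   /* Delete accounts that have no credit. */
   if Credit = 0 then remove;
   else do;
      Credit = Credit + 100;
      /* The default REPLACE is no longer automatic because the */
      /* step contains REMOVE, so it is written explicitly.     */
      replace;
   end;
run;
```

Without the explicit REPLACE, the changed CREDIT values would not be written back to the data set.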
Examples
This example removes one observation from a SAS data set.

libname perm 'SAS-library';

data perm.accounts;
   input AcctNumber Credit;
   datalines;
1001 1500
1002 4900
1003 3000
;

data perm.accounts;
   modify perm.accounts;
   if AcctNumber=1002 then remove;
run;

proc print data=perm.accounts;
   title 'Edited Data Set';
run;

Here are the results of the PROC PRINT statement:

Edited Data Set

         Acct
OBS     Number    Credit
  1       1001      1500
  3       1003      3000
See Also
Statements:
“DELETE Statement” on page 1436
“IF Statement, Subsetting” on page 1531
“MODIFY Statement” on page 1634
“OUTPUT Statement” on page 1653
“REPLACE Statement” on page 1692
RENAME Statement
Specifies new names for variables in output SAS data sets.
Valid: in a DATA step
Category: Information
Type: Declarative
Syntax
RENAME old-name-1=new-name-1 . . . ;
Arguments
old-name
specifies the name of a variable or variable list as it appears in the input data set, or in the current DATA step for newly created variables.
new-name
specifies the name or list to use in the output data set.
Details
The RENAME statement allows you to change the names of one or more variables, variables in a list, or a combination of variables and variable lists. The new variable names are written to the output data set only. Use the old variable names in programming statements for the current DATA step. RENAME applies to all output data sets.
Note: The RENAME statement has an effect on data sets opened in output mode only.
Comparisons
• RENAME cannot be used in PROC steps, but the RENAME= data set option can.
• The RENAME= data set option allows you to specify the variables you want to rename for each input or output data set. Use it in input data sets to rename variables before processing.
• If you use the RENAME= data set option in an output data set, you must continue to use the old variable names in programming statements for the current DATA step. After your output data is created, you can use the new variable names.
• The RENAME= data set option in the SET statement renames variables in the input data set. You can use the new names in programming statements for the current DATA step.
• To rename variables as a file management task, use the DATASETS procedure or access the variables through the SAS windowing interface. These methods are simpler and do not require DATA step processing.
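The difference in which name is visible to programming statements can be sketched with two DATA steps. The data set and variable names (WORK.SCORES, SCORE1, EXAM, BONUS) are illustrative only.

```sas
data work.scores;
   input id score1;
   datalines;
1 80
2 95
;

/* RENAME= on the input data set: the new name is used in the step. */
data work.option_in;
   set work.scores(rename=(score1=exam));
   bonus = exam + 5;
run;

/* RENAME statement: the old name is used in the step, and the */
/* new name appears only in the output data set.               */
data work.statement_out;
   set work.scores;
   rename score1=exam;
   bonus = score1 + 5;
run;
```

Both output data sets should contain the variables ID, EXAM, and BONUS; only the name that the assignment statement must use differs.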
Examples
• These examples show the correct syntax for renaming variables using the RENAME statement:

rename street=address;
rename time1=temp1 time2=temp2 time3=temp3;
rename name=Firstname score1-score3=Newscore1-Newscore3;
• This example uses the old name of the variable in program statements. The variable Olddept is named Newdept in the output data set, and the variable Oldaccount is named Newaccount.

rename Olddept=Newdept Oldaccount=Newaccount;
if Oldaccount>5000;
keep Olddept Oldaccount items volume;

• This example uses the old name OLDACCNT in the program statements. However, the new name NEWACCNT is used in the DATA statement because SAS applies the RENAME statement before it applies the KEEP= data set option.

data market(keep=newdept newaccnt items volume);
   rename olddept=newdept oldaccnt=newaccnt;
   set sales;
   if oldaccnt>5000;
run;

• The following example uses both a variable and a variable list to rename variables. New variable names appear in the output data set.

data temp;
   input (score1-score3) (2.,+1) name $;
   rename name=Firstname score1-score3=Newscore1-Newscore3;
   datalines;
12 24 36 Lisa
22 44 66 Fran
;
See Also
Data Set Option:
“RENAME= Data Set Option” on page 51

REPLACE Statement
Replaces an observation in the same location.
Valid: in a DATA step
Category: Action
Type: Executable
Restriction: Use only with a MODIFY statement.
Syntax
REPLACE <data-set-name-1> <. . . data-set-name-n>;
Without Arguments
If you specify no argument, the REPLACE statement writes the current observation to the same physical location from which it was read in all data sets that are named in the DATA statement.
Arguments
data-set-name
specifies the data set to which the observation is written.
Requirement: The data set name must also appear in the DATA statement and in one or more MODIFY statements.
Details
Using an explicit REPLACE statement overrides the default replacement of observations. If a DATA step contains a REPLACE statement, explicitly program all output for the step.
Comparisons
• Using an OUTPUT, REPLACE, or REMOVE statement overrides the default write action at the end of a DATA step. (OUTPUT is the default action; REPLACE becomes the default action when a MODIFY statement is used.) If you use any of these statements in a DATA step, you must explicitly program output of a new observation for the step.
• The OUTPUT, REPLACE, and REMOVE statements are independent of each other. More than one statement can apply to the same observation, as long as the sequence is logical.
• If both an OUTPUT and a REPLACE or REMOVE statement execute on a given observation, perform the OUTPUT action last to keep the position of the observation pointer correct.
• REPLACE writes the observation to the same physical location, while OUTPUT writes a new observation to the end of the data set.
• REPLACE can appear only in a DATA step that contains a MODIFY statement. You can use OUTPUT with or without MODIFY.
Examples
This example updates phone numbers in data set MASTER with values in data set TRANS. It also adds one new observation at the end of data set MASTER. The SYSRC autocall macro tests the value of _IORC_ for each attempted retrieval from MASTER. (SYSRC is part of the SAS autocall macro library.) The resulting SAS data set appears after the code:

data master;
   input FirstName $ id $ PhoneNumber;
   datalines;
Kevin ABCjkh 904
Sandi defsns 905
Terry ghitDP 951
Jason jklJWM 962
;

data trans;
   input FirstName $ id $ PhoneNumber;
   datalines;
. ABCjkh 2904
. defsns 2905
Madeline mnombt 2983
;

data master;
   modify master trans;
   by id;
      /* obs found in master  */
      /* change info, replace */
   if _iorc_ = %sysrc(_sok) then replace;
      /* obs not in master */
   else if _iorc_ = %sysrc(_dsenmr) then
      do;
            /* reset _error_ */
         _error_=0;
            /* reset _iorc_ */
         _iorc_=0;
            /* output obs to master */
         output;
      end;
run;

proc print data=master;
   title 'MASTER with New Phone Numbers';
run;

MASTER with New Phone Numbers

        First                   Phone
OBS     Name        id         Number
  1     Kevin       ABCjkh       2904
  2     Sandi       defsns       2905
  3     Terry       ghitDP        951
  4     Jason       jklJWM        962
  5     Madeline    mnombt       2983
See Also
Statements:
“MODIFY Statement” on page 1634
“OUTPUT Statement” on page 1653
“REMOVE Statement” on page 1689

RETAIN Statement
Causes a variable that is created by an INPUT or assignment statement to retain its value from one iteration of the DATA step to the next.
Valid: in a DATA step
Category: Information
Type: Declarative
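Syntax
RETAIN <element-list(s) <initial-value(s) | (initial-value-1) | (initial-value-list-1)> < . . . element-list-n <initial-value-n | (initial-value-n) | (initial-value-list-n)>>>;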
Without Arguments
If you do not specify an argument, the RETAIN statement causes the values of all variables that are created with INPUT or assignment statements to be retained from one iteration of the DATA step to the next.
Arguments
element-list
specifies variable names, variable lists, or array names whose values you want retained.
Tip: If you specify _ALL_, _CHAR_, or _NUMERIC_, only the variables that are defined before the RETAIN statement are affected.
Tip: If a variable name is specified only in the RETAIN statement and you do not specify an initial value, the variable is not written to the data set, and a note stating that the variable is uninitialized is written to the SAS log. If you specify an initial value, the variable is written to the data set.
initial-value
specifies an initial value, numeric or character, for one or more of the preceding elements.
Tip: If you omit initial-value, the initial value is missing. Initial-value is assigned to all the elements that precede it in the list. All members of a variable list, therefore, are given the same initial value.
See Also: (initial-value) and (initial-value-list)
(initial-value)
specifies an initial value, numeric or character, for a single preceding element or for the first in a list of preceding elements.
(initial-value-list)
specifies an initial value, numeric or character, for individual elements in the preceding list. SAS matches the first value in the list with the first variable in the list of elements, the second value with the second variable, and so on. Element values are enclosed in quotation marks. To specify one or more initial values directly, use the following format:
(initial-value(s))
To specify an iteration factor and nested sublists for the initial values, use the following format:
constant value | constant-sublist
Restriction: If you specify both an initial-value-list and an element-list, then element-list must be listed before initial-value-list in the RETAIN statement.
Tip: You can separate initial values by blank spaces or commas.
Tip: You can also use a shorthand notation for specifying a range of sequential integers. The increment is always +1.
Tip: You can assign initial values to both variables and temporary data elements.
Tip: If there are more variables than initial values, the remaining variables are assigned an initial value of missing and SAS issues a warning message.
Details
Default DATA Step Behavior
Without a RETAIN statement, SAS automatically sets variables that are assigned values by an INPUT or assignment statement to missing before each iteration of the DATA step.
Assigning Initial Values
Use a RETAIN statement to specify initial values for individual variables, a list of variables, or members of an array. If a value appears in a RETAIN statement, variables that appear before it in the list are set to that value initially. (If you assign different initial values to the same variable by naming it more than once in a RETAIN statement, SAS uses the last value.) You can also use RETAIN to assign an initial value other than the default value of 0 to a variable whose value is assigned by a sum statement.
Redundancy
It is redundant to name any of these items in a RETAIN statement, because their values are automatically retained from one iteration of the DATA step to the next:
• variables that are read with a SET, MERGE, MODIFY or UPDATE statement
• a variable whose value is assigned in a sum statement
• the automatic variables _N_, _ERROR_, _I_, _CMD_, and _MSG_
• variables that are created by the END= or IN= option in the SET, MERGE, MODIFY, or UPDATE statement or by options that create variables in the FILE and INFILE statements
• data elements that are specified in a temporary array
• array elements that are initialized in the ARRAY statement
• elements of an array that have assigned initial values to any or all of the elements on the ARRAY statement.
You can, however, use a RETAIN statement to assign an initial value to any of the previous items, with the exception of _N_ and _ERROR_.
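The behavior described in this section can be sketched with a running total. The variable names and data values are invented for illustration; TOTAL and BIGGEST need RETAIN because they are set by assignment statements, while COUNT does not because it is set by a sum statement.

```sas
data work.running;
   input amount;
   /* TOTAL starts at 0; BIGGEST has no initial value, so it starts missing. */
   retain total 0 biggest;
   total = total + amount;
   biggest = max(biggest, amount);
   /* Sum statement: COUNT is retained automatically and starts at 0. */
   count + 1;
   datalines;
10
20
30
;
```

Without the RETAIN statement, TOTAL and BIGGEST would be reset to missing before each iteration, so TOTAL would be missing in every observation and BIGGEST would simply equal AMOUNT.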
Comparisons
The RETAIN statement specifies variables whose values are not set to missing at the beginning of each iteration of the DATA step. The KEEP statement specifies variables that are to be included in any data set that is being created.
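A short sketch, using invented data set and variable names, shows the two statements doing unrelated jobs in the same step: RETAIN controls whether TOTAL survives between iterations, and KEEP controls which variables are written to the output data set.

```sas
data work.sales;
   input region $ amount;
   datalines;
East 10
East 20
West 5
;

data work.totals;
   set work.sales;
   /* RETAIN: TOTAL is not reset to missing at each iteration. */
   retain total 0;
   total = total + amount;
   /* KEEP: only REGION and TOTAL are written; AMOUNT is dropped. */
   keep region total;
run;
```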
Examples
Example 1: Basic Usage
• This RETAIN statement retains the values of variables MONTH1 through MONTH5 from one iteration of the DATA step to the next:

retain month1-month5;

• This RETAIN statement retains the values of nine variables and sets their initial values:

retain month1-month5 1 year 0 a b c 'XYZ';

The values of MONTH1 through MONTH5 are set initially to 1; YEAR is set to 0; variables A, B, and C are each set to the character value XYZ.
• This RETAIN statement assigns the initial value 1 to the variable MONTH1 only:

retain month1-month5 (1);

Variables MONTH2 through MONTH5 are set to missing initially.
• This RETAIN statement retains the values of all variables that are defined earlier in the DATA step but not the values that are defined afterwards:

retain _all_;

• All of these statements assign initial values of 1 through 4 to VAR1 through VAR4:

retain var1-var4 (1 2 3 4);
retain var1-var4 (1,2,3,4);
retain var1-var4(1:4);
Example 2: Overview of the RETAIN Operation
This example shows how to use variable names and array names as elements in the RETAIN statement and shows assignment of initial values with and without parentheses:

data _null_;
   array City{3} $ City1-City3;
   array cp{3} Citypop1-Citypop3;
   retain Year Taxyear 1999 City ' '
          cp (10000,50000,100000);
   file file-specification print;
   put 'Values at beginning of DATA step:'
       / @3 _all_ /;
   input Gain;
   do i=1 to 3;
      cp{i}=cp{i}+Gain;
   end;
   put 'Values after adding Gain to city populations:'
       / @3 _all_;
   datalines;
5000
10000
;

Here are the initial values assigned by RETAIN:
• Year and Taxyear are assigned the initial value 1999.
• City1, City2, and City3 are assigned missing values.
• Citypop1 is assigned the value 10000.
• Citypop2 is assigned 50000.
• Citypop3 is assigned 100000.
Here are the lines written by the PUT statements:

Values at beginning of DATA step:
  City1= City2= City3= Citypop1=10000 Citypop2=50000 Citypop3=100000
Year=1999 Taxyear=1999 Gain=. i=. _ERROR_=0 _N_=1

Values after adding Gain to city populations:
  City1= City2= City3= Citypop1=15000 Citypop2=55000 Citypop3=105000
Year=1999 Taxyear=1999 Gain=5000 i=4 _ERROR_=0 _N_=1
Values at beginning of DATA step:
  City1= City2= City3= Citypop1=15000 Citypop2=55000 Citypop3=105000
Year=1999 Taxyear=1999 Gain=. i=. _ERROR_=0 _N_=2

Values after adding Gain to city populations:
  City1= City2= City3= Citypop1=25000 Citypop2=65000 Citypop3=115000
Year=1999 Taxyear=1999 Gain=10000 i=4 _ERROR_=0 _N_=2
Values at beginning of DATA step:
  City1= City2= City3= Citypop1=25000 Citypop2=65000 Citypop3=115000
Year=1999 Taxyear=1999 Gain=. i=. _ERROR_=0 _N_=3
The first PUT statement is executed three times, while the second PUT statement is executed only twice. The DATA step ceases execution when the INPUT statement executes for the third time and reaches the end of the file.
Example 3: Selecting One Value from a Series of Observations
In this example, the data set ALLSCORES contains several observations for each identification number, which is stored in the variable ID. Different observations for a particular ID value might have different values of the variable GRADE. This example creates a new data set, CLASS.BESTSCORES, which contains one observation for each ID value. That observation holds the highest GRADE value of all observations for that ID in ALLSCORES.

libname class 'SAS-library';

proc sort data=class.allscores;
   by id;
run;

data class.bestscores;
   drop grade;
   set class.allscores;
   by id;
      /* Prevents HIGHEST from being reset */
      /* to missing for each iteration.    */
   retain highest;
      /* Sets HIGHEST to missing for each  */
      /* different ID value.               */
   if first.id then highest=.;
      /* Compares HIGHEST to GRADE in      */
      /* current iteration and resets      */
      /* value if GRADE is higher.         */
   highest=max(highest,grade);
   if last.id then output;
run;
See Also Statements: “Assignment Statement” on page 1399 “BY Statement” on page 1403 “INPUT Statement” on page 1567
RETURN Statement
Stops executing statements at the current point in the DATA step and returns to a predetermined point in the step.

Valid: in a DATA step
Category: Control
Type: Executable

Syntax
RETURN;

Without Arguments
The RETURN statement causes execution to stop at the current point in the DATA step, and returns control to a previous DATA step statement.

Details
The point to which SAS returns depends on the order in which statements are executed in the DATA step. The RETURN statement is often used with the
- GO TO statement
- HEADER= option in the FILE statement
- LINK statement.

When RETURN causes a return to the beginning of the DATA step, an implicit OUTPUT statement writes the current observation to any new data sets (unless the DATA step contains an explicit OUTPUT statement, or REMOVE or REPLACE statements with MODIFY statements). Every DATA step has an implied RETURN as its last executable statement.
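The pairing of RETURN with LINK described above can be sketched as follows. This is a minimal illustration rather than an example from the manual: the data set name WORK.LINK_DEMO, the label FLAG, and the variables X and Y are hypothetical.

```sas
data work.link_demo;
   input x;
   if x < 0 then link flag;  /* branch to the FLAG label                     */
   y = x * 2;                /* the RETURN under FLAG resumes execution here */
   return;                   /* returns to the top of the step; the implicit */
                             /* OUTPUT writes the current observation        */
flag:
   put 'Negative value found: ' x=;
   return;                   /* returns to the statement that follows LINK   */
   datalines;
5
-3
;
run;
```

The first RETURN keeps ordinary iterations from falling through into the labeled statements. The second RETURN sends control back to the statement after LINK, so both observations are still written to the data set.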
Examples
In this example, when the values of X and Y are the same, SAS executes the RETURN statement and adds the observation to the data set. When the values of X and Y are not equal, SAS executes the remaining statements and then adds the observation to the data set.

   data survey;
      input x y;
      if x=y then return;
      put x= y=;
      datalines;
   21 25
   20 20
   7 17
   ;
See Also Statements: “FILE Statement” on page 1454 “GO TO Statement” on page 1529 “LINK Statement” on page 1619
RUN Statement
Executes the previously entered SAS statements.

Valid: anywhere
Category: Program Control

Syntax
RUN <CANCEL>;

Without Arguments
Without arguments, the RUN statement executes the previously entered SAS statements.

Arguments
CANCEL
   terminates the current step without executing it. SAS prints a message that indicates that the step was not executed.
   CAUTION: The CANCEL option does not prevent execution of a DATA step that contains a DATALINES or DATALINES4 statement.
   CAUTION: The CANCEL option has no effect when you use the KILL option with PROC DATASETS.

Details
Although the RUN statement is not required between steps in a SAS program, using it creates a step boundary and can make the SAS log easier to read.
Examples
- This RUN statement marks a step boundary and executes this PROC PRINT step:

     proc print data=report;
        title 'Status Report';
     run;

- This example shows the usefulness of the CANCEL option in a line prompt mode session. The fourth statement in the DATA step contains an invalid value for PI (4.13 instead of 3.14). RUN with CANCEL ends the DATA step and prevents it from executing.

     data circle;
        infile file-specification;
        input radius;
        c=2*4.13*radius;
     run cancel;

  The following message is written to the SAS log:

     WARNING: DATA step not executed at user's request.
%RUN Statement
Ends source statements following a %INCLUDE * statement.

Valid: anywhere
Category: Program Control

Syntax
%RUN;

Without Arguments
The %RUN statement causes SAS to stop reading input from the keyboard (including subsequent SAS statements on the same line as %RUN) and resume reading from the previous input source.

Details
Using the %INCLUDE statement with an asterisk specifies that you enter source lines from the keyboard.
Note: The asterisk (*) cannot be used to specify keyboard entry if you use the Enhanced Editor in the Microsoft Windows operating environment.
Comparisons
The RUN statement executes previously entered DATA or PROC steps. The %RUN statement ends the prompting for source statements and returns program control to the original source program, when you use the %INCLUDE statement to allow data to be entered from the keyboard.

The type of prompt that you use depends on how you run the SAS session. The include operation is most useful in interactive line and noninteractive modes, but it can also be used in windowing and batch mode. When you are running SAS in batch mode, include the %RUN statement in the external file that is referenced by the SASTERM fileref.
Examples
- To request keyboard-entry source on a %INCLUDE statement, follow the statement with an asterisk:

     %include *;

  Note: The asterisk (*) cannot be used to specify keyboard entry if you use the Enhanced Editor in the Microsoft Windows operating environment.

- When it executes this statement, SAS prompts you to enter source lines from the keyboard. When you finish entering code from the keyboard, type the following statement to return processing to the program that contains the %INCLUDE statement.

     %run;
See Also Statements: “%INCLUDE Statement” on page 1534 “RUN Statement” on page 1700
SASFILE Statement
Opens a SAS data set and allocates enough buffers to hold the entire file in memory.

Valid: anywhere
Category: Program Control
Restriction: A SAS data set opened by the SASFILE statement can be used for subsequent input (read) or update processing but not for output or utility processing.
See: SASFILE Statement in the documentation for your operating environment.

Syntax
SASFILE <libref.>member-name<.member-type> <(password-option(s))> OPEN | LOAD | CLOSE;
Arguments
libref
   a name that is associated with a SAS library. The libref (library reference) must be a valid SAS name. The default libref is either USER (if assigned) or WORK (if USER not assigned).
   Restriction: The libref cannot represent a concatenation of SAS libraries that contain a library in sequential format.

member-name
   a valid SAS name that is a SAS data file (a SAS data set with the member type DATA) that is a member of the SAS library associated with the libref.
   Restriction: The SAS data set must have been created with the V7, V8, or V9 Base SAS engine.

member-type
   the type of SAS file to be opened. Valid value is DATA, which is the default.

password-option(s)
   specifies one or more of the following password options:
   READ=password
      enables the SASFILE statement to open a read-protected file. The password must be a valid SAS name.
   WRITE=password
      enables the SASFILE statement to use the write password to open a file that is both read-protected and write-protected. The password must be a valid SAS name.
   ALTER=password
      enables the SASFILE statement to use the alter password to open a file that is both read-protected and alter-protected. The password must be a valid SAS name.
   PW=password
      enables the SASFILE statement to use the password to open a file that is assigned for all levels of protection. The password must be a valid SAS name.
   Tip: When SASFILE is executed, SAS checks whether the file is read-protected. Therefore, if the file is read-protected, you must include the READ= password in the SASFILE statement. If the file is either write-protected or alter-protected, you can use a WRITE=, ALTER=, or PW= password. However, the file is opened only in input (read) mode. For subsequent processing, you must specify the necessary password or passwords. See Example 2 on page 1707.

OPEN
   opens the file, allocates the buffers, but defers reading the data into memory until a procedure, statement, or application is executed.

LOAD
   opens the file, allocates the buffers, and reads the data into memory.
   Note: If the total number of allowed buffers is less than the number of buffers required for the file based on the number of data set pages and index file pages, SAS issues a warning to tell you how many pages are read into memory.

CLOSE
   frees the buffers and closes the file.
Details
General Information
The SASFILE statement opens a SAS data set and allocates enough buffers to hold the entire file in memory. Once it is read, data is held in memory, available to subsequent DATA and PROC steps or applications, until either a second SASFILE statement closes the file and frees the buffers or the program ends, which automatically closes the file and frees the buffers.

Using the SASFILE statement can improve performance by
- reducing multiple open/close operations (including allocation and freeing of memory for buffers) to process a SAS data set to one open/close operation
- reducing I/O processing by holding the data in memory.

If your SAS program consists of steps that read a SAS data set multiple times and you have an adequate amount of memory so that the entire file can be held in real memory, the program should benefit from using the SASFILE statement. Also, SASFILE is especially useful as part of a program that starts a SAS server such as a SAS/SHARE server. However, as with most performance-improvement features, it is suggested that you set up a test in your environment to measure performance with and without the SASFILE statement.
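One way to set up the suggested test is sketched below. It is only an outline under stated assumptions: the libref MYDATA is already assigned, MYDATA.CENSUS is the data set used in the examples later in this entry, and the FULLSTIMER system option is used so that the log reports resource statistics for each step.

```sas
options fullstimer;              /* write detailed resource statistics to the log */

/* Baseline: each step opens, reads, and closes the file on its own. */
data _null_;
   set mydata.census;
run;

proc means data=mydata.census;
run;

/* The same steps with the file held in memory by SASFILE. */
sasfile mydata.census load;

data _null_;
   set mydata.census;
run;

proc means data=mydata.census;
run;

sasfile mydata.census close;
```

Comparing the real time and I/O figures in the log for the two pairs of steps gives a rough indication of the benefit. Operating system file caching can blur the comparison, so repeating the test several times is advisable.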
Processing a SAS Data Set Opened with SASFILE
When the SASFILE statement executes, SAS opens the specified file. Then when subsequent DATA and PROC steps execute, SAS does not have to open the file for each request; the file remains open until a second SASFILE statement closes it or the program or session ends.

When a SAS data set is opened by the SASFILE statement, the file is opened for input processing and can be used for subsequent input or update processing. However, the file cannot be used for subsequent utility or output processing, because utility and output processing requires exclusive access to the file (member-level locking). For example, you cannot replace the file or rename its variables.

Table 6.11 on page 1704 provides a list of some SAS procedures and statements and specifies whether they are allowed if the file is opened by the SASFILE statement:

Table 6.11 Processing Requests for a File Opened by SASFILE

Processing Request                        Open Mode                     Allowed
----------------------------------------  ----------------------------  -------
APPEND procedure                          update                        Yes
DATA step that creates or replaces the    output                        No
  file
DATASETS procedure to rename or add a     utility                       No
  variable, add or change a label, or
  add or remove integrity constraints
  or indexes
DATASETS procedure with AGE, CHANGE, or   does not open the file but    No
  DELETE statements                         requires exclusive access
FSEDIT procedure                          update                        Yes
PRINT procedure                           input                         Yes
SORT procedure that replaces original     output                        No
  data set with sorted one
SQL procedure to modify, add, or delete   update                        Yes
  observations
SQL procedure with CREATE TABLE or        output                        No
  CREATE VIEW statement
SQL procedure to create or remove         utility                       No
  integrity constraints or indexes
Buffer Allocation
A buffer is a reserved area of memory that holds a segment of data while it is processed. The number of allocated buffers determines how much data can be held in memory at one time.

The number of buffers is not a permanent attribute of a SAS file; that is, it is valid only for the current SAS session or job. When a SAS file is opened, a default number of buffers for processing the file is set. The default depends on the operating environment but typically is a small number, for example, one buffer. To specify a different number of buffers, you can use the BUFNO= data set option or system option.

When the SASFILE statement is executed, SAS automatically allocates the number of buffers based on the number of data set pages and index file pages (if an index file exists). For example:
- If the number of data set pages is five and there is not an index file, SAS allocates five buffers.
- If the number of data set pages is 500 and the number of index file pages is 200, SAS allocates 700 buffers.

If a file that is held in memory increases in size during processing, the number of allocated buffers increases to accommodate the file. Note that if SASFILE is executed for a SAS data set, the BUFNO= option is ignored.
I/O Processing
An I/O (input/output) request reads a segment of data from a storage device (such as disk) and transfers the data to memory, or conversely transfers the data from memory and writes it to the storage device. When a SAS data set is opened by the SASFILE statement, data is read once and held in memory, which should reduce the number of I/O requests.

CAUTION: I/O processing can be reduced only if there is sufficient real memory. If the SAS data set is very large, you might not have sufficient real memory to hold the entire file. If insufficient memory exists, your operating environment can simulate more memory than actually exists, which is virtual memory. If virtual memory occurs, data access I/O requests are replaced with swapping I/O requests, which could result in no performance improvement. In addition, both SAS and your operating environment have a maximum amount of memory that can be allocated, which could be exceeded by the needs of your program. If your program needs exceed the memory that is available, the number of allocated buffers might be decreased to the default allocation in order to free memory.

Tip: To determine how much memory a SAS data set requires, execute the CONTENTS procedure for the file to list its page size, the number of data set pages, the index file size, and the number of index file pages.
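The tip above can be sketched as follows. MYDATA.CENSUS is the data set from the examples in this entry, and the page size and page count in the second step are made-up figures used only to show the arithmetic; substitute the values that PROC CONTENTS reports for your file.

```sas
/* List the data set attributes, including page size and number of pages. */
proc contents data=mydata.census;
run;

/* Rough memory estimate for the data pages, using hypothetical values. */
data _null_;
   page_size  = 16384;                 /* hypothetical page size in bytes      */
   data_pages = 500;                   /* hypothetical number of data set pages */
   bytes      = page_size * data_pages;
   put 'Approximate bytes needed for data pages: ' bytes comma15.;
run;
```

If the file has an index, the index file pages add to this total in the same way.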
Using the SASFILE Statement in a SAS/SHARE Environment
The following are considerations for using the SASFILE statement with SAS/SHARE software:
- You must execute the SASFILE statement before you execute the PROC SERVER statement.
- If the client (the computer on which you use a SAS session to access a SAS/SHARE server) executes the SASFILE statement, it is rejected.
- Once the SASFILE statement is executed, all users who subsequently open the file will access the data held in memory instead of data that is stored on the disk.
- Once the SASFILE statement is executed, you cannot close the file and free the buffers until the SAS/SHARE server is terminated.
- You can use the ALLOCATE SASFILE command for the PROC SERVER statement as an alternative that brings part of the file into memory (controlled by the BUFNO= option).
- If the SASFILE statement is executed and you execute ALLOCATE SASFILE specifying a value for BUFNO= that is a larger number of buffers than allocated by SASFILE, performance will not be improved.
Comparisons
- Use the BUFNO= system option or data set option to specify a specific number of buffers.
- With SAS/SHARE software, you can use the ALLOCATE SASFILE command for the PROC SERVER statement to bring part of the file into memory (controlled by the BUFNO= option).
Examples
Example 1: Using SASFILE in a Program with Multiple Steps
The following SAS program illustrates the process of opening a SAS data set, transferring its data to memory, and reading that data held in memory for multiple tasks. The program consists of steps that read the file multiple times.

   libname mydata 'SAS-library';

   sasfile mydata.census.data open;         /* 1 */

   data test1;
      set mydata.census;                    /* 2 */
   run;

   data test2;
      set mydata.census;                    /* 3 */
   run;

   proc summary data=mydata.census print;   /* 4 */
   run;

   data mydata.census;                      /* 5 */
      modify mydata.census;
      .
      . (statements to modify data)
      .
   run;

   sasfile mydata.census close;             /* 6 */

1 Opens SAS data set MYDATA.CENSUS, and allocates the number of buffers based on the number of data set pages and index file pages.
2 Reads all pages of MYDATA.CENSUS, and transfers all data from disk to memory.
3 Reads MYDATA.CENSUS a second time, but this time from memory without additional I/O requests.
4 Reads MYDATA.CENSUS a third time, again from memory without additional I/O requests.
5 Reads MYDATA.CENSUS a fourth time, again from memory without additional I/O requests. If the MODIFY statement successfully changes data in memory, the changed data is transferred from memory to disk at the end of the DATA step.
6 Closes MYDATA.CENSUS, and frees allocated buffers.
Example 2: Specifying Passwords with the SASFILE Statement
The following SAS program illustrates using the SASFILE statement and specifying passwords for a SAS data set that is both read-protected and alter-protected:

   libname mydata 'SAS-data-library';

   sasfile mydata.census (read=gizmo) open;       /* 1 */

   proc print data=mydata.census (read=gizmo);    /* 2 */
   run;

   data mydata.census;
      modify mydata.census (alter=luke);          /* 3 */
      .
      . (statements to modify data)
      .
   run;

1 The SASFILE statement specifies the read password, which is sufficient to open the file.
2 In the PRINT procedure, the read password must be specified again.
3 The alter password is used in the MODIFY statement, because the data set is being updated.

Note: It is acceptable to use the higher-level alter password instead of the read password in the above example.
See Also Data Set Option: “BUFNO= Data Set Option” on page 15 System Option: “BUFNO= System Option” on page 1796 “The SERVER Procedure” in SAS/SHARE User’s Guide.
SELECT Statement
Executes one of several statements or groups of statements.

Valid: in a DATA step
Category: Control
Type: Executable

Syntax
SELECT <(select-expression)>;
   WHEN-1 (when-expression-1 <..., when-expression-n>) statement;
   <... WHEN-n (when-expression-1 <..., when-expression-n>) statement;>
   <OTHERWISE statement;>
END;
Arguments
(select-expression)
   specifies any SAS expression that evaluates to a single value.
   See: “Evaluating the when-expression When a select-expression Is Included” on page 1708

(when-expression)
   specifies any SAS expression, including a compound expression. SELECT requires you to specify at least one when-expression.
   Tip: Separating multiple when-expressions with a comma is equivalent to separating them with the logical operator OR.
   Tip: The way a when-expression is used depends on whether a select-expression is present.
   See: “Evaluating the when-expression When a select-expression Is Not Included” on page 1709

statement
   can be any executable SAS statement, including DO, SELECT, and null statements. You must specify the statement argument.
Details
Using WHEN Statements in a SELECT Group
The SELECT statement begins a SELECT group. SELECT groups contain WHEN statements that identify SAS statements that are executed when a particular condition is true. Use at least one WHEN statement in a SELECT group. An optional OTHERWISE statement specifies a statement to be executed if no WHEN condition is met. An END statement ends a SELECT group.

Null statements that are used in WHEN statements cause SAS to recognize a condition as true without taking further action. Null statements that are used in OTHERWISE statements prevent SAS from issuing an error message when all WHEN conditions are false.

Evaluating the when-expression When a select-expression Is Included
If the select-expression is present, SAS evaluates the select-expression and when-expression. SAS compares the two for equality and returns a value of true or false. If the comparison is true, statement is executed. If the comparison is false, execution proceeds either to the next when-expression in the current WHEN statement, or to the next WHEN statement if no more expressions are present. If no WHEN statements remain, execution proceeds to the OTHERWISE statement, if one is present. If the result of all SELECT-WHEN comparisons is false and no OTHERWISE statement is present, SAS issues an error message and stops executing the DATA step.

Evaluating the when-expression When a select-expression Is Not Included
If no select-expression is present, the when-expression is evaluated to produce a result of true or false. If the result is true, statement is executed. If the result is false, SAS proceeds to the next when-expression in the current WHEN statement, or to the next WHEN statement if no more expressions are present, or to the OTHERWISE statement if one is present. (That is, SAS performs the action that is indicated in the first true WHEN statement.) If the result of all when-expressions is false and no OTHERWISE statement is present, SAS issues an error message.

If more than one WHEN statement has a true when-expression, only the first WHEN statement is used; once a when-expression is true, no other when-expressions are evaluated.

Processing Large Amounts of Data with %INCLUDE Files
One way to process large amounts of data is to use %INCLUDE statements in your DATA step. Using %INCLUDE statements enables you to perform complex processing while keeping your main program manageable. The %INCLUDE files that you use in your main program can contain WHEN statements and other SAS statements to process your data. See Example 5 on page 1710 for an example.
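The evaluation order described above can be traced with a small DATA step. This is an illustrative sketch, not an example from the manual; the variable A and the message texts are hypothetical.

```sas
data _null_;
   do a = 1 to 6;
      select (a);                 /* each WHEN value is compared with A for equality */
         when (1)       put a= 'matched the first WHEN';
         when (3, 4, 5) put a= 'matched one of 3, 4, 5';   /* comma acts like OR */
         otherwise      put a= 'no WHEN matched';
      end;
   end;
run;
```

Values 2 and 6 fall through to OTHERWISE. Removing the OTHERWISE statement would cause SAS to issue an error message and stop the step when A is 2, as the text above explains.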
Comparisons Use IF-THEN/ELSE statements for programs with few statements. Use subsetting IF statements without a THEN clause to continue processing only those observations or records that meet the condition that is specified in the IF clause.
Examples

Example 1: Using Statements

   select (a);
      when (1) x=x*10;
      when (2);
      when (3,4,5) x=x*100;
      otherwise;
   end;

Example 2: Using DO Groups

   select (payclass);
      when ('monthly') amt=salary;
      when ('hourly')
         do;
            amt=hrlywage*min(hrs,40);
            if hrs>40 then put 'CHECK TIMECARD';
         end;   /* end of do */
      otherwise put 'PROBLEM OBSERVATION';
   end;   /* end of select */
Example 3: Using a Compound Expression

   select;
      when (mon in ('JUN', 'JUL', 'AUG') and temp>70) put 'SUMMER ' mon=;
      when (mon in ('MAR', 'APR', 'MAY')) put 'SPRING ' mon=;
      otherwise put 'FALL OR WINTER ' mon=;
   end;
Example 4: Making Comparisons for Equality

      /* INCORRECT usage to select value of 2 */
   select (x);
         /* evaluates T/F and compares for */
         /* equality with x                */
      when (x=2) put 'two';
   end;

      /* correct usage */
   select(x);
         /* compares 2 to x for equality */
      when (2) put 'two';
   end;

      /* correct usage */
   select;
         /* compares 2 to x for equality */
      when (x=2) put 'two';
   end;
Example 5: Processing Large Amounts of Data
In the following example, the %INCLUDE statements contain code that includes WHEN statements to process new and old items in the inventory. The main program shows the overall logic of the DATA step.

   data test (keep=ItemNumber);
      set ItemList;
      select;
         %include NewItems;
         %include OldItems;
         otherwise put 'Item ' ItemNumber ' is not in the inventory.';
      end;
   run;
See Also Statements: “DO Statement” on page 1441 “IF Statement, Subsetting” on page 1531 “IF-THEN/ELSE Statement” on page 1532
SET Statement
Reads an observation from one or more SAS data sets.

Valid: in a DATA step
Category: File-handling
Type: Executable

Syntax
SET <SAS-data-set(s) <(data-set-options(s))>> <options>;

Without Arguments
When you do not specify an argument, the SET statement reads an observation from the most recently created data set.
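A minimal sketch of the no-argument form follows. The data set names WORK.FIRST and WORK.SECOND and the variables X and Y are hypothetical.

```sas
data work.first;
   x = 1;
run;

data work.second;
   set;            /* no argument: reads the most recently created data set, WORK.FIRST */
   y = x + 1;
run;
```

Naming the input data set explicitly is usually easier to maintain, because the no-argument form changes meaning whenever another step is added in between.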
Arguments
SAS-data-set(s)
   specifies a one-level name, a two-level name, or one of the special SAS data set names.
   Tip: You can specify data set lists. For more information, see “Using Data Set Lists with SET” on page 1715.
   See Also: See “SAS Data Sets” in SAS Language Reference: Concepts for a description of the levels of SAS data set names and when to use each level.
   Featured in: Example 13 on page 1719

(data-set-options)
   specifies actions SAS is to take when it reads variables or observations into the program data vector for processing.
   Tip: Data set options that apply to a data set list apply to all of the data sets in the list.
   See: Refer to “Definition of Data Set Options” on page 10 for a list of the data set options to use with input data sets.
Options
END=variable
   creates and names a temporary variable that contains an end-of-file indicator. The variable, which is initialized to zero, is set to 1 when SET reads the last observation of the last data set listed. This variable is not added to any new data set.
   Restriction: END= cannot be used with POINT=. When random access is used, the END= variable is never set to 1.
   Interaction: If you use a BY statement, END= is set to 1 when the SET statement reads the last observation of the interleaved data set. For more information, see “BY-Group Processing with SET” on page 1716.
   Featured in: Example 11 on page 1719
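The END= option is commonly used to act once, after the last observation has been read. The following sketch writes a single grand total; the data set names WORK.AMOUNTS and WORK.TOTAL and the variables AMOUNT, TOTAL, and LAST are hypothetical.

```sas
data work.amounts;
   input amount;
   datalines;
10
20
30
;
run;

data work.total(keep=total);
   set work.amounts end=last;   /* LAST stays 0 until the final observation is read */
   total + amount;              /* sum statement: TOTAL is retained across iterations */
   if last then output;         /* write one observation that holds the grand total  */
run;
```

Because LAST is a temporary variable, it does not appear in WORK.TOTAL even without the KEEP= option.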
KEY=index
   provides nonsequential access to observations in a SAS data set, which are based on the value of an index variable or a key.
   Range: Specify the name of a simple or a composite index of the data set that is being read.
   Restriction: KEY= cannot be used with POINT=.
   Tip: Using the _IORC_ automatic variable in conjunction with the SYSRC autocall macro provides you with more error-handling information than was previously available. When you use the SET statement with the KEY= option, the new automatic variable _IORC_ is created. This automatic variable is set to a return code that shows the status of the most recent I/O operation that is performed on an observation in a SAS data set. If the KEY= value is not found, the _IORC_ variable returns a value that corresponds to the SYSRC autocall macro’s mnemonic _DSENOM and the automatic variable _ERROR_ is set to 1.
   Featured in: Example 7 on page 1718 and Example 8 on page 1718.
   See Also: For more information, see the description of the autocall macro SYSRC in SAS Macro Language: Reference.
   See Also: UNIQUE option on page 1714
   CAUTION: Continuous loops can occur when you use the KEY= option. If you use the KEY= option without specifying the primary data set, you must include either a STOP statement to stop DATA step processing, or programming logic that uses the _IORC_ automatic variable in conjunction with the SYSRC autocall macro and checks for an invalid value of the _IORC_ variable, or both.

INDSNAME=variable
   creates and names a variable that stores the name of the SAS data set from which the current observation is read. The stored name can be a data set name or a physical name. The physical name is the name by which the operating environment recognizes the file.
   Tip: For data set names, SAS will add the library name to the variable value (for example, WORK.PRICE) and convert the two-level name to uppercase.
   Tip: Unless previously defined, the length of the variable is set to 41 characters. Use a LENGTH statement to make the variable length long enough to contain the value of the physical filename if it is longer than 41 characters. If the variable is previously defined as a character variable with a specific length, that length is not changed. If the value placed into the INDSNAME variable is longer than that length, then the value is truncated. If the variable is previously defined as a numeric variable, an error will occur.
   Featured in: Example 12 on page 1719
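The INDSNAME= option can be sketched as follows. The data set names WORK.EAST, WORK.WEST, and WORK.BOTH and the variables CITY, DSN, and SOURCE are hypothetical. The value of DSN is copied into SOURCE on the assumption that, like the END= and NOBS= variables, the INDSNAME= variable itself is not written to the output data set.

```sas
data work.east;
   city = 'Boston';
run;

data work.west;
   city = 'Denver';
run;

data work.both;
   length source $41;                     /* matches the default INDSNAME= length   */
   set work.east work.west indsname=dsn;  /* DSN holds the name of the contributing */
                                          /* data set, for example WORK.EAST        */
   source = dsn;                          /* keep the name in an ordinary variable  */
run;
```

Each observation in WORK.BOTH then carries the two-level, uppercase name of the data set that it came from.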
NOBS=variable
   creates and names a temporary variable whose value is usually the total number of observations in the input data set or data sets. If more than one data set is listed in the SET statement, NOBS= is the total number of observations in the data sets that are listed. The number of observations includes those observations that are marked for deletion but are not yet deleted.
   Restriction: For certain SAS views, SAS cannot determine the number of observations. In these cases, SAS sets the value of the NOBS= variable to the largest positive integer value that is available in your operating environment.
   Tip: At compilation time, SAS reads the descriptor portion of each data set and assigns the value of the NOBS= variable automatically. Thus, you can refer to the NOBS= variable before the SET statement. The variable is available in the DATA step but is not added to any output data set.
   Interaction: The NOBS= and POINT= options are independent of each other.
   Featured in: Example 10 on page 1718

OPEN=(IMMEDIATE | DEFER)
   allows you to delay the opening of any concatenated SAS data sets until they are ready to be processed.
   IMMEDIATE
      during the compilation phase, opens all data sets that are listed in the SET statement.
      Restriction: When you use the IMMEDIATE option, KEY=, POINT=, and BY statement processing are mutually exclusive.
      Tip: If a variable on a subsequent data set is of a different type (character versus numeric, for example) than the type of the same-named variable on the first data set, the DATA step will stop processing and produce an error message.
   DEFER
      opens the first data set during the compilation phase, and opens subsequent data sets during the execution phase. When the DATA step reads and processes all observations in a data set, it closes the data set and opens the next data set in the list.
      Restriction: When you specify the DEFER option, you cannot use the KEY= statement option, the POINT= statement option, or the BY statement. These constructs imply either random processing or interleaving of observations from the data sets, which is not possible unless all data sets are open.
      Requirement: You can use the DROP=, KEEP=, or RENAME= data set options to process a set of variables, but the set of variables that are processed for each data set must be identical. In most cases, if the set of variables defined by any subsequent data set differs from the variables defined by the first data set, SAS prints a warning message to the log but does not stop execution. SAS stops execution for some conditions:
      1 If a variable on a subsequent data set is of a different type (character versus numeric, for example) than the type of the same-named variable on the first data set, the DATA step will stop processing and produce an error message.
      2 If a variable on a subsequent data set was not defined by the first data set in the SET statement, but was defined previously in the DATA step program, the DATA step will stop processing and produce an error message. In this case, the value of the variable in previous iterations might be incorrect because the semantic behavior of SET requires this variable to be set to missing when processing the first observation of the first data set.
   Default: IMMEDIATE

POINT=variable
   specifies a temporary variable whose numeric value determines which observation is read. POINT= causes the SET statement to use random (direct) access to read a SAS data set.
   Requirement: a STOP statement
   Restriction: You cannot use POINT= with a BY statement, a WHERE statement, or a WHERE= data set option. In addition, you cannot use it with transport format data sets, data sets in sequential format on tape or disk, and SAS/ACCESS views or the SQL procedure views that read data from external files.
   Restriction: You cannot use POINT= with KEY=.
   Tip: You must supply the values of the POINT= variable. For example, you can use the POINT= variable as the index variable in some form of the DO statement.
   Tip: The POINT= variable is available anywhere in the DATA step, but it is not added to any new SAS data set.
   Featured in: Example 6 on page 1718 and Example 9 on page 1718
   CAUTION: Continuous loops can occur when you use the POINT= option. When you use the POINT= option, you must include a STOP statement to stop DATA step processing, programming logic that checks for an invalid value of the POINT= variable, or both. Because POINT= reads only those observations that are specified in the DO statement, SAS cannot read an end-of-file indicator as it would if the file were being read sequentially. Because reading an end-of-file indicator ends a DATA step automatically, failure to substitute another means of ending the DATA step when you use POINT= can cause the DATA step to go into a continuous loop. If SAS reads an invalid value of the POINT= variable, it sets the automatic variable _ERROR_ to 1. Use this information to check for conditions that cause continuous DO-loop processing, or include a STOP statement at the end of the DATA step, or both.

UNIQUE
   causes a KEY= search always to begin at the top of the index for the data set that is being read.
   Restriction: UNIQUE can appear only with the KEY= argument and must be preceded by a slash.
   Explanation: By default, SET begins searching at the top of the index only when the KEY= value changes. If the KEY= value does not change on successive executions of the SET statement, the search begins by following the most recently retrieved observation. In other words, when consecutive duplicate KEY= values appear, the SET statement attempts a one-to-one match with duplicate indexed values in the data set that is being read. If more consecutive duplicate KEY= values are specified than exist in the data set that is being read, the extra duplicates are treated as not found. When KEY= is a unique value, only the first attempt to read an observation with that key value succeeds; subsequent attempts to read the observation with that value of the key will fail. The _IORC_ variable returns a value that corresponds to the SYSRC autocall macro’s mnemonic _DSENOM. If you add the /UNIQUE option, subsequent attempts to read the observation with the unique KEY= value will succeed. The _IORC_ variable returns a 0.
   Featured in: Example 8 on page 1718
   See Also: For extensive examples, see Combining and Modifying SAS Data Sets: Examples.
Details

What SET Does

Each time the SET statement is executed, SAS reads one observation into the program data vector. SET reads all variables and all observations from the input data sets unless you tell SAS to do otherwise. A SET statement can contain multiple data sets; a DATA step can contain multiple SET statements. See Combining and Modifying SAS Data Sets: Examples.
Uses
The SET statement is flexible and has a variety of uses in SAS programming. These uses are determined by the options and statements that you use with the SET statement:
- reading observations and variables from existing SAS data sets for further processing in the DATA step
- concatenating and interleaving data sets, and performing one-to-one reading of data sets
- reading SAS data sets by using direct access methods.
Using Data Set Lists with SET
You can use data set lists with the SET statement. Data set lists provide a quick way to reference existing groups of data sets. These data set lists must be either name prefix lists or numbered range lists.

Name prefix lists refer to all data sets that begin with a specified character string. For example, the following statement tells SAS to read all data sets starting with "SALES1" such as SALES1, SALES10, SALES11, and SALES12:

   set SALES1:;

Numbered range lists require you to have a series of data sets with the same name, except for the last character or characters, which are consecutive numbers. In a numbered range list, you can begin with any number and end with any number. For example, these lists refer to the same data sets:

   sales1 sales2 sales3 sales4
   sales1-sales4
Note: If the numeric suffix of the first data set name contains leading zeros, the number of digits in the numeric suffix of the last data set name must be greater than or equal to the number of digits in the first data set name; otherwise, an error will occur. For example, the data set lists sales001-sales99 and sales01-sales9 will cause an error. The data set list sales001-sales999 is valid. If the numeric suffix of the first data set name does not contain leading zeros, the numeric suffixes of the first and last data set names do not need to have the same number of digits. For example, the data set list sales1-sales999 is valid.

Some other rules to consider when using numbered data set lists are as follows:

- You can specify groups of ranges.

   set cost1-cost4 cost11-cost14 cost21-cost24;

- You can mix numbered range lists with name prefix lists.

   set cost1-cost4 cost2: cost33-cost37;

- You can mix single data sets with data set lists.

   set cost1 cost10-cost20 cost30;

- Quotation marks around data set lists are ignored.

   /* these two lines are the same */
   set sales1 - sales4;
   set 'sales1'n - 'sales4'n;

- Spaces in data set names are invalid. If quotation marks are used, trailing blanks are ignored.

   /* blanks in these statements will cause errors */
   set sales 1 - sales 4;
   set 'sales 1'n - 'sales 4'n;

   /* trailing blanks in this statement will be ignored */
   set 'sales1 'n - 'sales4 'n;

- The maximum numeric suffix is 2147483647.

   /* this suffix will cause an error */
   set prod2000000000-prod2934850239;

- Physical pathnames are not allowed.

   /* physical pathnames will cause an error */
   %let work_path = %sysfunc(pathname(WORK));
   set "&work_path\dept.sas7bdat";
BY-Group Processing with SET
Only one BY statement can accompany each SET statement in a DATA step. The BY statement should immediately follow the SET statement to which it applies. The data sets that are listed in the SET statement must be sorted by the values of the variables that are listed in the BY statement, or they must have an appropriate index. SET, when it is used with a BY statement, interleaves data sets. The observations in the new data set are arranged by the values of the BY variable or variables, and within each BY group, by the order of the data sets in which they occur. See Example 2 on page 1717 for an example of BY-group processing with the SET statement.
Combining SAS Data Sets

Use a single SET statement with multiple data sets to concatenate the specified data sets. That is, the number of observations in the new data set is the sum of the number of observations in the original data sets, and the order of the observations is all the observations from the first data set followed by all the observations from the second data set, and so on. See Example 1 on page 1717 for an example of concatenating data sets.

Use a single SET statement with a BY statement to interleave the specified data sets. The observations in the new data set are arranged by the values of the BY variable or variables, and within each BY group, by the order of the data sets in which they occur. See Example 2 on page 1717 for an example of interleaving data sets.

Use multiple SET statements to perform one-to-one reading (also called one-to-one matching) of the specified data sets. The new data set contains all the variables from all the input data sets. The number of observations in the new data set is the number of observations in the smallest original data set. If the data sets contain common variables, the values that are read in from the last data set replace the values that were read in from earlier ones. See Example 6 on page 1718, Example 7 on page 1718, and Example 8 on page 1718 for examples of one-to-one reading of data sets.

For extensive examples, see Combining and Modifying SAS Data Sets: Examples. For more information about how to prepare your data sets, see “Combining SAS Data Sets: Basic Concepts” in SAS Language Reference: Concepts.
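The difference between concatenating and one-to-one reading can be illustrated with a small sketch. The data sets A and B and their variables are hypothetical and exist only for this illustration; they are not part of the examples in this section.

```sas
/* Hypothetical input: A has three observations, B has two */
data a;
   input id x;
   datalines;
1 10
2 20
3 30
;
run;

data b;
   input id y;
   datalines;
1 100
2 200
;
run;

/* Concatenation: one SET statement, two data sets.       */
/* STACKED has 3 + 2 = 5 observations; Y is missing for   */
/* the observations from A and X is missing for B.        */
data stacked;
   set a b;
run;

/* One-to-one reading: two SET statements.                */
/* ONETOONE has 2 observations, because the step ends     */
/* when the second SET statement reaches the end of B.    */
/* The common variable ID takes its value from B, the     */
/* last data set that is read.                            */
data onetoone;
   set a;
   set b;
run;
```

Because the DATA step stops as soon as any SET statement encounters an end-of-file indicator, the third observation of A is read but never written to ONETOONE.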
Comparisons

- SET reads an observation from an existing SAS data set. INPUT reads raw data from an external file or from in-stream data lines in order to create SAS variables and observations.
- Using the KEY= option with SET enables you to access observations nonsequentially in a SAS data set according to a value. Using the POINT= option with SET enables you to access observations nonsequentially in a SAS data set according to the observation number.
Examples
Example 1: Concatenating SAS Data Sets
If more than one data set name appears in the SET statement, the resulting output data set is a concatenation of all the data sets that are listed. SAS reads all observations from the first data set, then all from the second data set, and so on, until all observations from all the data sets have been read. This example concatenates the three SAS data sets into one output data set named FITNESS:

   data fitness;
      set health exercise well;
   run;

Example 2: Interleaving SAS Data Sets

To interleave two or more SAS data sets, use a BY statement after the SET statement:

   data april;
      set payable recvable;
      by account;
   run;
Example 3: Reading a SAS Data Set

In this DATA step, each observation in the data set NC.MEMBERS is read into the program data vector. Only those observations whose value of CITY is Raleigh are output to the new data set RALEIGH.MEMBERS:

   data raleigh.members;
      set nc.members;
      if city='Raleigh';
   run;

Example 4: Merging a Single Observation with All Observations in a SAS Data Set

An observation to be merged into an existing data set can be one that is created by a SAS procedure or another DATA step. In this example, the data set AVGSALES has only one observation:

   data national;
      if _n_=1 then set avgsales;
      set totsales;
   run;

Example 5: Reading from the Same Data Set More Than Once

In this example, SAS treats each SET statement independently; that is, it reads from one data set as if it were reading from two separate data sets:

   data drugxyz;
      set trial5(keep=sample);
      if sample>2;
      set trial5;
   run;

For each iteration of the DATA step, the first SET statement reads one observation. The next time the first SET statement is executed, it reads the next observation. Each SET statement can read a different observation within the same iteration of the DATA step.
Example 6: Combining One Observation with Many
You can subset observations from one data set and combine them with observations from another data set by using direct access methods, as follows:

   data south;
      set revenue;
      if region=4;
      set expense point=_n_;
   run;

Example 7: Performing a Table Lookup

This example illustrates using the KEY= option to perform a table lookup. The DATA step reads a primary data set that is named INVTORY and a lookup data set that is named PARTCODE. It uses the index PARTNO to read PARTCODE nonsequentially, by looking for a match between the PARTNO value in each data set. The purpose is to obtain the appropriate description, which is available only in the variable DESC in the lookup data set, for each part that is listed in the primary data set:

   data combine;
      set invtory(keep=partno instock price);
      set partcode(keep=partno desc) key=partno;
   run;
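The description of the UNIQUE option notes that a failed KEY= lookup sets _IORC_ to the value that corresponds to the SYSRC mnemonic _DSENOM. The following variation of the lookup is a sketch, not part of the original example, that shows one way to test for that condition. It assumes that PARTCODE is indexed on PARTNO and that a blank DESC is an acceptable marker for a part that has no match.

```sas
data combine;
   set invtory(keep=partno instock price);
   set partcode(keep=partno desc) key=partno;

   /* _IORC_ is 0 when the lookup succeeds. When no match  */
   /* is found, it equals the value of %SYSRC(_DSENOM).    */
   if _iorc_ = %sysrc(_dsenom) then do;
      /* Clear DESC so that the value retained from the    */
      /* previous successful lookup is not carried over.   */
      desc = ' ';
      /* A failed lookup sets _ERROR_ to 1. Reset it so    */
      /* that the observation is not dumped to the log.    */
      _error_ = 0;
   end;
run;
```

Without the check, an observation from INVTORY that has no match keeps the DESC value that was read for an earlier part, which is easy to overlook in the output.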
Example 8: Performing a Table Lookup When the Master File Contains Duplicate Observations

This example uses the KEY= option to perform a table lookup. The DATA step reads a primary data set that is named INVTORY, which is indexed on PARTNO, and a lookup data set named PARTCODE. PARTCODE contains quantities of new stock (variable NEW_STK). The UNIQUE option ensures that, if there are any duplicate observations in INVTORY, values of NEW_STK are added only to the first observation of the group:

   data combine;
      set partcode(keep=partno new_stk);
      set invtory(keep=partno instock price) key=partno/unique;
      instock=instock+new_stk;
   run;

Example 9: Reading a Subset by Using Direct Access

These statements select a subset of 50 observations from the data set DRUGTEST by using the POINT= option to access observations directly by number:

   data sample;
      do obsnum=1 to 100 by 2;
         set drugtest point=obsnum;
         if _error_ then abort;
         output;
      end;
      stop;
   run;
Example 10: Performing a Function Until the Last Observation Is Reached

These statements use NOBS= to set the termination value for DO-loop processing. The value of the temporary variable LAST is the total number of observations in SURVEY1 and SURVEY2:

   do obsnum=1 to last by 100;
      set survey1 survey2 point=obsnum nobs=last;
      output;
   end;
   stop;
Example 11: Writing an Observation Only After All Observations Have Been Read
This example uses the END= variable LAST to tell SAS to assign a value to the variable REVENUE and write an observation only after the last observation of RENTAL has been read:

   set rental end=last;
   totdays + days;
   if last then
      do;
         revenue=totdays*65.78;
         output;
      end;
Example 12: Retrieving the Name of the Data Set from Which the Current Observation Is Read

This example creates three data sets and stores the data set name in a variable named dsn. The name is split into three parts and the example prints out the results.

   /* Create some data sets to read */
   data gas_price_option;
      value=395;
   run;
   data gas_rbid_option;
      value=840;
   run;
   data gas_price_forward;
      value=275;
   run;

   /* Create a data set D */
   data d;
      set gas_price_option gas_rbid_option gas_price_forward indsname=dsn;
      /* split the data set names into 3 parts */
      commodity = scan (dsn, 2, "._");
      type = scan (dsn, 3, "._");
      instrument = scan (dsn, 4, "._");
   run;

   proc print data=d;
   run;
Output 6.30 Data Set Name Split into Three Parts

                     The SAS System                     1

   Obs    value    commodity    type     instrument
    1      395     GAS          PRICE    OPTION
    2      840     GAS          RBID     OPTION
    3      275     GAS          PRICE    FORWARD

Example 13: Using Data Set Lists

This example uses a numbered range list to input the data sets.

   data dept008; emp=13; run;
   data dept009; emp=9; run;
   data dept010; emp=4; run;
   data dept011; emp=33; run;

   data _null_;
      set dept008-dept010;
      put _all_;
   run;
The following lines are written to the SAS log.
Output 6.31
Using a Data Set List with the SET Statement
   1    data dept008; emp=13; run;

   NOTE: The data set WORK.DEPT008 has 1 observations and 1 variables.
   NOTE: DATA statement used (Total process time):
         real time           0.06 seconds
         cpu time            0.03 seconds

   2    data dept009; emp=9; run;

   NOTE: The data set WORK.DEPT009 has 1 observations and 1 variables.
   NOTE: DATA statement used (Total process time):
         real time           0.00 seconds
         cpu time            0.00 seconds

   3    data dept010; emp=4; run;

   NOTE: The data set WORK.DEPT010 has 1 observations and 1 variables.
   NOTE: DATA statement used (Total process time):
         real time           0.00 seconds
         cpu time            0.00 seconds

   4    data dept011; emp=33; run;

   NOTE: The data set WORK.DEPT011 has 1 observations and 1 variables.
   NOTE: DATA statement used (Total process time):
         real time           0.00 seconds
         cpu time            0.00 seconds

   5
   6    data _null_;
   7       set dept008-dept010;
   8       put _all_;
   9    run;

   emp=13 _ERROR_=0 _N_=1
   emp=9 _ERROR_=0 _N_=2
   emp=4 _ERROR_=0 _N_=3
   NOTE: There were 1 observations read from the data set WORK.DEPT008.
   NOTE: There were 1 observations read from the data set WORK.DEPT009.
   NOTE: There were 1 observations read from the data set WORK.DEPT010.
   NOTE: DATA statement used (Total process time):
         real time           0.00 seconds
         cpu time            0.00 seconds
In addition, you could use data set lists to find missing data sets. This example uses a numbered range list to locate the missing data sets. An error occurs for each data set that does not exist. Once you know which data sets are missing, you can correct the SET statement to reflect the data sets that actually exist.

   data dept008; emp=13; run;
   data dept009; emp=9; run;
   data dept011; emp=4; run;
   data dept014; emp=33; run;

   data _null_;
      set dept008-dept014;
      put _all_;
   run;
The following lines are written to the SAS log.
Output 6.32
Finding Missing Data Sets Using the SET Statement
   1    data dept008; emp=13; run;

   NOTE: The data set WORK.DEPT008 has 1 observations and 1 variables.
   NOTE: DATA statement used (Total process time):
         real time           0.04 seconds
         cpu time            0.04 seconds

   2    data dept009; emp=9; run;

   NOTE: The data set WORK.DEPT009 has 1 observations and 1 variables.
   NOTE: DATA statement used (Total process time):
         real time           0.00 seconds
         cpu time            0.00 seconds

   3    data dept011; emp=4; run;

   NOTE: The data set WORK.DEPT011 has 1 observations and 1 variables.
   NOTE: DATA statement used (Total process time):
         real time           0.03 seconds
         cpu time            0.01 seconds

   4    data dept014; emp=33; run;

   NOTE: The data set WORK.DEPT014 has 1 observations and 1 variables.
   NOTE: DATA statement used (Total process time):
         real time           0.00 seconds
         cpu time            0.00 seconds

   5    data _null_;
   6       set dept008-dept014;
   ERROR: File WORK.DEPT010.DATA does not exist.
   ERROR: File WORK.DEPT012.DATA does not exist.
   ERROR: File WORK.DEPT013.DATA does not exist.
   7       put _all_;
   8    run;

   NOTE: The SAS System stopped processing this step because of errors.
   NOTE: DATA statement used (Total process time):
         real time           0.00 seconds
         cpu time            0.00 seconds
See Also

Statements:
   “BY Statement” on page 1403
   “DO Statement” on page 1441
   “INPUT Statement” on page 1567
   “MERGE Statement” on page 1629
   “STOP Statement” on page 1722
   “UPDATE Statement” on page 1733
“Rules for Words and Names” in SAS Language Reference: Concepts
“Reading, Modifying, and Combining SAS Data Sets” in SAS Language Reference: Concepts
“Definition of Data Set Options” on page 10
SAS Macro Language: Reference
Combining and Modifying SAS Data Sets: Examples
SKIP Statement

Creates a blank line in the SAS log.

Valid: Anywhere
Category: Log Control

Syntax

SKIP <n>;

Without Arguments

Using SKIP without arguments causes SAS to create one blank line in the log.

Arguments

n
   specifies the number of blank lines that you want to create in the log.

   Tip: If the number specified is greater than the number of lines that remain on the page, SAS goes to the top of the next page.

Details

The SKIP statement itself does not appear in the log. You can use this statement in all methods of operation.

See Also

Statement:
   “PAGE Statement” on page 1656
System Options:
   “LINESIZE= System Option” on page 1878
   “PAGESIZE= System Option” on page 1900
STOP Statement

Stops execution of the current DATA step.

Valid: in a DATA step
Category: Action
Type: Executable

Syntax

STOP;

Without Arguments

The STOP statement causes SAS to stop processing the current DATA step immediately and resume processing statements after the end of the current DATA step.

Details

SAS outputs a data set for the current DATA step. However, the observation being processed when STOP executes is not added. The STOP statement can be used alone or in an IF-THEN statement or SELECT group.

Use STOP with any features that read SAS data sets using random access methods, such as the POINT= option in the SET statement. Because SAS does not detect an end-of-file with this access method, you must include program statements to prevent continuous processing of the DATA step.
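The statement that the observation being processed is not added can be seen in a short sketch. The data sets NUMBERS and FIRSTTHREE and the variable N are hypothetical names used only for this illustration.

```sas
/* Hypothetical input: ten observations with N = 1 to 10 */
data numbers;
   do n = 1 to 10;
      output;
   end;
run;

/* STOP executes during the fourth iteration, before the  */
/* implicit output at the end of the step, so the         */
/* observation with N = 4 is not written.                 */
data firstthree;
   set numbers;
   if n = 4 then stop;
run;
```

FIRSTTHREE contains three observations (N = 1, 2, and 3). To keep the fourth observation as well, an explicit OUTPUT statement would have to run before the STOP statement.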
Comparisons

- When you use a windowing environment or other interactive methods of operation, the ABORT statement and the STOP statement both stop processing. The ABORT statement sets the value of the automatic variable _ERROR_ to 1, but the STOP statement does not.
- In batch or noninteractive mode, the two statements also have different effects. Use the STOP statement in batch or noninteractive mode to continue processing with the next DATA or PROC step.
Examples

Example 1: Basic Usage

- stop;

- if idcode=9999 then stop;

- select (a);
     when (0) output;
     otherwise stop;
  end;
Example 2: Avoiding an Infinite Loop

This example shows how to use STOP to avoid an infinite loop within a DATA step when you are using random access methods:

   data sample;
      do sampleobs=1 to totalobs by 10;
         set master.research point=sampleobs nobs=totalobs;
         output;
      end;
      stop;
   run;
See Also

Statements:
   “ABORT Statement” on page 1388
   POINT= option in the SET statement on page 1713
Sum Statement

Adds the result of an expression to an accumulator variable.

Valid: in a DATA step
Category: Action
Type: Executable

Syntax

variable+expression;
Arguments

variable
   specifies the name of the accumulator variable, which contains a numeric value.

   Tip: The variable is automatically set to 0 before SAS reads the first observation. The variable’s value is retained from one iteration to the next, as if it had appeared in a RETAIN statement.

   Tip: To initialize a sum variable to a value other than 0, include it in a RETAIN statement with an initial value.

expression
   is any SAS expression.

   Tip: The expression is evaluated and the result added to the accumulator variable.

   Tip: SAS treats an expression that produces a missing value as zero.
Comparisons

The sum statement is equivalent to using the SUM function and the RETAIN statement, as shown here:

   retain variable 0;
   variable=sum(variable,expression);
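A brief sketch can make the equivalence concrete, including the treatment of a missing value as zero. The data set RUNNING and the variables X, TOTAL, and TOTAL2 are hypothetical names chosen for this illustration.

```sas
data running;
   input x;

   /* Sum statement: TOTAL starts at 0 and is retained */
   total + x;

   /* Equivalent form with RETAIN and the SUM function */
   retain total2 0;
   total2 = sum(total2, x);

   datalines;
5
.
7
;
run;

proc print data=running;
run;
```

Both TOTAL and TOTAL2 hold 5, 5, and 12 across the three observations: the missing value of X in the second observation leaves the running total unchanged instead of turning it into a missing value.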
Examples

Here are examples of sum statements that illustrate various expressions:

- balance+(-debit);

- sumxsq+x*x;

- nx+(x ne .);

- if status='ready' then OK+1;
See Also

Function:
   “SUM Function” on page 1108
Statement:
   “RETAIN Statement” on page 1694
SYSECHO Statement

Fires a global statement complete event and passes a text string back to the IOM client.

Valid: anywhere
Category: Program Control
Restriction: Has an effect only in objectserver mode

Syntax

SYSECHO <"text">;

Without Arguments

Using SYSECHO without arguments sends a global statement complete event to the IOM client.

Arguments

"text"
   specifies a text string that is passed back to the IOM client.

   Range: 1–64 characters

   Requirement: The text string must be enclosed in double quotation marks.

Details

The SYSECHO statement enables IOM clients to manually track the progress of a segment of a submitted SAS program. When the SYSECHO statement is executed, a global statement complete event is generated and, if specified, the text string is passed back to the IOM client.
TITLE Statement

Specifies title lines for SAS output.

Valid: anywhere
Category: Output Control
See: TITLE Statement in the documentation for your operating environment.

Syntax

TITLE<n> <ods-format-options> <'text' | "text">;

Without Arguments

Using TITLE without arguments cancels all existing titles.
Arguments

n
   specifies the relative line that contains the title line.

   Range: 1 - 10

   Tip: The title line with the highest number appears on the bottom line. If you omit n, SAS assumes a value of 1. Therefore, you can specify TITLE or TITLE1 for the first title line.

   Tip: You can create titles that contain blank lines between the lines of text. For example, if you specify text with a TITLE statement and a TITLE3 statement, there will be a blank line between the two lines of text.

ods-format-options
   specifies formatting options for the ODS HTML, RTF, and PRINTER destinations.

   BOLD
      specifies that the title text is in bold font weight.
      ODS Destinations: HTML, RTF, PRINTER

   COLOR=color
      specifies the title text color.
      Alias: C
      ODS Destinations: HTML, RTF, PRINTER
      Featured in: Example 3 on page 1729

   BCOLOR=color
      specifies the background color of the title block.
      ODS Destinations: HTML, RTF, PRINTER

   FONT=font-face
      specifies the font to use. If you supply multiple fonts, then the destination device uses the first one that is installed on your system.
      Alias: F
      ODS Destinations: HTML, RTF, PRINTER

   HEIGHT=size
      specifies the point size.
      Alias: H
      ODS Destinations: HTML, RTF, PRINTER
      Featured in: Example 3 on page 1729

   ITALIC
      specifies that the title text is in italic style.
      ODS Destinations: HTML, RTF, PRINTER

   JUSTIFY= CENTER | LEFT | RIGHT
      specifies justification.
      CENTER
         specifies center justification.
         Alias: C
      LEFT
         specifies left justification.
         Alias: L
      RIGHT
         specifies right justification.
         Alias: R
      Alias: J
      ODS Destinations: HTML, RTF, PRINTER
      Featured in: Example 3 on page 1729

   LINK='url'
      specifies a hyperlink.
      Tip: The visual properties for LINK= always come from the current style.
      ODS Destinations: HTML, RTF, PRINTER

   UNDERLIN= 0 | 1 | 2 | 3
      specifies whether the subsequent text is underlined. 0 indicates no underlining. 1, 2, and 3 indicate underlining.
      Alias: U
      Tip: ODS generates the same type of underline for values 1, 2, and 3. However, SAS/GRAPH uses values 1, 2, and 3 to generate increasingly thicker underlines.
      ODS Destinations: HTML, RTF, PRINTER

   Note: The defaults for how ODS renders the TITLE statement come from style elements relating to system titles in the current style. The TITLE statement syntax with ods-format-options is a way to override the settings provided by the current style. The current style varies according to the ODS destination. For more information on how to determine the current style, see “What Are Style Definitions, Style Elements, and Style Attributes?” and “Concepts: Style Definitions and the TEMPLATE Procedure” in the SAS Output Delivery System: User’s Guide.

   Tip: You can specify these options by letter, word, or words by preceding each letter or word of the text by the option. For example, this code will make the title “Red, White, and Blue” appear in different colors.

      title color=red "Red," color=white "White, and" color=blue "Blue";
'text' | "text"
   specifies text that is enclosed in single or double quotation marks. You can customize titles by inserting BY variable values (#BYVALn), BY variable names (#BYVARn), or BY lines (#BYLINE) in titles that are specified in PROC steps. Embed the items in the specified title text string at the position where you want the substitution text to appear.

   #BYVALn | #BYVAL(variable-name)
      substitutes the current value of the specified BY variable for #BYVAL in the text string and displays the value in the title. Follow these rules when you use #BYVAL in the TITLE statement of a PROC step:
      - Specify the variable that is used by #BYVAL in the BY statement.
      - Insert #BYVAL in the specified title text string at the position where you want the substitution text to appear.
      - Follow #BYVAL with a delimiting character, either a space or other nonalphanumeric character (for example, a quotation mark) that ends the text string.
      - If you want the #BYVAL substitution to be followed immediately by other text, with no delimiter, use a trailing dot (as with macro variables).

      Specify the variable with one of the following:

      n
         specifies which variable in the BY statement #BYVAL should use. The value of n indicates the position of the variable in the BY statement.
         Example: #BYVAL2 specifies the second variable in the BY statement.

      variable-name
         names the BY variable.
         Example: #BYVAL(YEAR) specifies the BY variable, YEAR.
         Tip: Variable-name is not case sensitive.
   #BYVARn | #BYVAR(variable-name)
      substitutes the name of the BY variable or label that is associated with the variable (whatever the BY line would normally display) for #BYVAR in the text string and displays the name or label in the title. Follow these rules when you use #BYVAR in the TITLE statement of a PROC step:
      - Specify the variable that is used by #BYVAR in the BY statement.
      - Insert #BYVAR in the specified title text string at the position where you want the substitution text to appear.
      - Follow #BYVAR with a delimiting character, either a space or other nonalphanumeric character (for example, a quotation mark) that ends the text string.
      - If you want the #BYVAR substitution to be followed immediately by other text, with no delimiter, use a trailing dot (as with macro variables).

      Specify the variable with one of the following:

      n
         specifies which variable in the BY statement #BYVAR should use. The value of n indicates the position of the variable in the BY statement.
         Example: #BYVAR2 specifies the second variable in the BY statement.

      variable-name
         names the BY variable.
         Example: #BYVAR(SITES) specifies the BY variable SITES.
         Tip: Variable-name is not case sensitive.
   #BYLINE
      substitutes the entire BY line without leading or trailing blanks for #BYLINE in the text string and displays the BY line in the title.

      Tip: #BYLINE produces output that contains a BY line at the top of the page unless you suppress it by using NOBYLINE in an OPTIONS statement.

      See Also: For more information on NOBYLINE, see “BYLINE System Option” on page 1800.

   Tip: For compatibility with previous releases, SAS accepts some text without quotation marks. When writing new programs or updating existing programs, always enclose text in quotation marks.

   Tip: If you use two single quotation marks ('') or two double quotation marks ("") together (with no space in between them) as the string of text, SAS will output a single quotation mark (') or a double quotation mark ("), respectively.

   Tip: If you use an automatic macro variable in the title text, you must enclose the title text in double quotation marks. The SAS macro facility will resolve the macro variable only if the text is in double quotation marks.

   See Also: For more information about including quotation marks as part of the title, see “Expressions” in SAS Language Reference: Concepts.
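The effect of the quotation mark type on macro variable resolution can be checked with a short sketch. The data set DEMO is a hypothetical one-observation data set created only so that the PROC PRINT step has something to display.

```sas
/* Hypothetical one-observation data set for the demonstration */
data demo;
   x = 1;
run;

/* Double quotation marks: &SYSDATE9 is resolved to the date */
/* on which the SAS session started                          */
title1 "Report created on &sysdate9";

/* Single quotation marks: the text &sysdate9 appears as is  */
title2 'Report created on &sysdate9';

proc print data=demo;
run;

/* Cancel all titles so that they do not affect later output */
title;
```

The first title line shows a date, and the second shows the literal characters &sysdate9, because the macro facility does not scan text inside single quotation marks.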
Details

In a DATA Step or PROC Step

A TITLE statement takes effect when the step or RUN group with which it is associated executes. Once you specify a title for a line, it is used for all subsequent output until you cancel the title or define another title for that line. A TITLE statement for a given line cancels the previous TITLE statement for that line and for all lines with larger n numbers.

Operating Environment Information: The maximum title length that is allowed depends on your operating environment and the value of the LINESIZE= system option. Refer to the SAS documentation for your operating environment for more information.

Comparisons

You can also create titles with the TITLES window.
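The rule that a TITLE statement for a given line also cancels all higher-numbered lines is easy to miss. The following sketch illustrates it; the title text and the data set DEMO are hypothetical.

```sas
/* Hypothetical one-observation data set for the demonstration */
data demo;
   x = 1;
run;

title1 'Annual Report';
title2 'Region: East';
title3 'Draft';

/* First report: three title lines */
proc print data=demo;
run;

/* Redefining TITLE2 replaces line 2 and cancels line 3. */
/* TITLE1 is still in effect.                            */
title2 'Region: West';

/* Second report: two title lines */
proc print data=demo;
run;

/* TITLE without arguments cancels all remaining titles */
title;
```

If the third line is still wanted after line 2 changes, it has to be specified again after the new TITLE2 statement.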
Examples

Example 1: Using the TITLE Statement

The following examples show how you can use the TITLE statement:

- This statement suppresses a title on line n and all lines after it:

   titlen;

- These code lines are examples of TITLE statements:

   title 'First Draft';

   title2 "Year's End Report";

   title2 'Year''s End Report';
Example 2: Customizing Titles by Using BY Variable Values

You can customize titles by inserting BY variable values in the titles that you specify in PROC steps. The following examples show how to use #BYVALn, #BYVARn, and #BYLINE:

- title 'Quarterly Sales for #byval(site)';

- title 'Annual Costs for #byvar2';

- title 'Data Group #byline';
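The statements above only define the titles. As a sketch of how #BYVAL behaves in a complete PROC step, the following program prints one page per BY group with the BY value in the title. The data set SALES and its variables are hypothetical, and the NOBYLINE option is used to suppress the default BY line so that the value appears only in the title.

```sas
/* Hypothetical data, already sorted by SITE */
data sales;
   input site $ quarter amount;
   datalines;
Cary 1 100
Cary 2 120
Rome 1 90
Rome 2 95
;
run;

/* Suppress the default BY line */
options nobyline;

/* #BYVAL(site) is replaced by the current value of SITE */
title 'Quarterly Sales for #byval(site)';

proc print data=sales;
   by site;
run;

/* Restore the defaults */
options byline;
title;
```

The variable that #BYVAL names must appear in the BY statement of the step, as the rules for #BYVAL state; otherwise there is no BY value to substitute.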
Example 3: Customizing Titles and Footnotes by Using the Output Delivery System

You can customize titles and footnotes with ODS. The following example shows you how to use PROC TEMPLATE to change the color, justification, and size of the text for the title and footnote.

   /*********************************************
    * The following program creates the data set
    * grain_production and the $cntry format.
    *********************************************/
   data grain_production;
      length Country $ 3 Type $ 5;
      input Year country $ type $ Kilotons;
      datalines;
   1995 BRZ Wheat 1516
   1995 BRZ Rice 11236
   1995 BRZ Corn 36276
   1995 CHN Wheat 102207
   1995 CHN Rice 185226
   1995 CHN Corn 112331
   1995 IND Wheat 63007
   1995 IND Rice 122372
   1995 IND Corn 9800
   1995 INS Wheat .
   1995 INS Rice 49860
   1995 INS Corn 8223
   1995 USA Wheat 59494
   1995 USA Rice 7888
   1995 USA Corn 187300
   1996 BRZ Wheat 3302
   1996 BRZ Rice 10035
   1996 BRZ Corn 31975
   1996 CHN Wheat 109000
   1996 CHN Rice 190100
   1996 CHN Corn 119350
   1996 IND Wheat 62620
   1996 IND Rice 120012
   1996 IND Corn 8660
   1996 INS Wheat .
   1996 INS Rice 51165
   1996 INS Corn 8925
   1996 USA Wheat 62099
   1996 USA Rice 7771
   1996 USA Corn 236064
   ;
   run;

   proc format;
      value $cntry 'BRZ'='Brazil'
                   'CHN'='China'
                   'IND'='India'
                   'INS'='Indonesia'
                   'USA'='United States';
   run;

   /*****************************************
    * This PROC TEMPLATE step creates the
    * table definition TABLE1 that is used
    * in the DATA step.
    *****************************************/
   proc template;
      define table table1;
         mvar sysdate9;
         dynamic colhd;
         classlevels=on;
         define column char_var;
            generic=on;
            blank_dups=on;
            header=colhd;
            style=cellcontents;
         end;
         define column num_var;
            generic=on;
            header=colhd;
            style=cellcontents;
         end;
         define footer table_footer;
         end;
      end;
   run;

   /***********************************************************************
    * The ODS LISTING CLOSE statement closes the Listing
    * destination to conserve resources.
    *
    * The ODS HTML statement creates HTML output created with
    * the style definition D3D.
    *
    * The TITLE statement specifies the text for the first title
    * and the attributes that ODS uses to modify it.
    * The J= style attribute left-justifies the title.
    * The COLOR= style attributes change the color of the title text
    * "Leading Grain" to blue and "Producers in" to green.
    *
    * The TITLE2 statement specifies the text for the second title
    * and the attributes that ODS uses to modify it.
    * The J= style attribute center justifies the title.
    * The COLOR= attribute changes the color of the title text "1996"
    * to red.
    * The HEIGHT= attributes change the size of each
    * individual number in "1996".
    *
    * The FOOTNOTE statement specifies the text for the first footnote
    * and the attributes that ODS uses to modify it.
    * The J=left style attribute left-justifies the footnote.
    * The HEIGHT=20 style attribute changes the font size to 20pt.
    * The COLOR= style attributes change the color of the footnote text
    * "Prepared" to red and "on" to orange (#FF9900).
    *
    * The FOOTNOTE2 statement specifies the text for the second footnote
    * and the attributes that ODS uses to modify it.
    * The J= style attribute centers the footnote.
    * The COLOR= attribute changes the color of the date
    * to blue.
    * The HEIGHT= attribute changes the font size
    * of the date specified by the SYSDATE9 macro variable.
    ***********************************************************************/
   ods listing close;
   ods html body='newstyle-body.htm' style=d3d;

   title j=left
         font='Times New Roman' color=blue bcolor=red "Leading Grain "
         c=green bold italic "Producers in";

   title2 j=center color=red underlin=1
          height=28pt "1"
          height=24pt "9"
          height=20pt "9"
          height=16pt "6";

   footnote j=left height=20pt
            color=red "Prepared "
            c='#FF9900' "on";

   footnote2 j=center color=blue
             height=24pt "&sysdate9";
   footnote3 link='http://support.sas.com' "SAS";

   /***********************************************************
    * This step uses the DATA step and ODS to produce
    * an HTML report. It uses the default table definition
    * (template) for the DATA step and writes an output object
    * to the HTML destination.
    ***********************************************************/
   data _null_;
      set grain_production;
      where type in ('Rice', 'Corn') and year=1996;
      file print ods=(
           template='table1'
           columns=(
              char_var=country(generic=on format=$cntry.
                       dynamic=(colhd='Country'))
              char_var=type(generic dynamic=(colhd='Year'))
              num_var=kilotons(generic=on format=comma12.
                      dynamic=(colhd='Kilotons'))
              )
           );
      put _ods_;
   run;

   ods html close;
   ods listing;
Display 6.1 Output with Customized Titles and Footnotes
See Also
Statement: “FOOTNOTE Statement” on page 1523
System Option: “LINESIZE= System Option” on page 1878
“The TEMPLATE Procedure” in the SAS Output Delivery System: User’s Guide
UPDATE Statement
Updates a master file by applying transactions.
Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
UPDATE master-data-set<(data-set-options)> transaction-data-set<(data-set-options)> <END=variable> <UPDATEMODE=MISSINGCHECK | NOMISSINGCHECK>;
BY by-variable;
Arguments

master-data-set
   specifies the SAS data set used as the master file.
   Range: The name can be a one-level name (for example, FITNESS), a two-level name (for example, IN.FITNESS), or one of the special SAS data set names.
   See Also: “SAS Names and Words” in SAS Language Reference: Concepts.

(data-set-options)
   specifies actions SAS is to take when it reads variables into the DATA step for processing.
   Requirements: Data-set-options must appear within parentheses and follow a SAS data set name.
   Tip: Dropping, keeping, and renaming variables is often useful when you update a data set. Renaming like-named variables prevents the second value that is read from overwriting the first one. By renaming one variable, you make the values of both of them available for processing, such as comparing.
   Featured in: Example 2 on page 1736
   See Also: A list of data set options to use with input data sets in “Data Set Options by Category” on page 12.

transaction-data-set
   specifies the SAS data set that contains the changes to be applied to the master data set.
   Range: The name can be a one-level name (for example, HEALTH), a two-level name (for example, IN.HEALTH), or one of the special SAS data set names.

END=variable
   creates and names a temporary variable that contains an end-of-file indicator. This variable is initialized to 0 and is set to 1 when UPDATE processes the last observation. This variable is not added to any data set.

UPDATEMODE=MISSINGCHECK
UPDATEMODE=NOMISSINGCHECK
   specifies whether missing variable values in a transaction data set are to be allowed to replace existing variable values in a master data set.
   MISSINGCHECK prevents missing variable values in a transaction data set from replacing values in a master data set.
   NOMISSINGCHECK allows missing variable values in a transaction data set to replace values in a master data set.
   Default: MISSINGCHECK
   Tip: Special missing values, however, are the exception and will replace values in the master data set even when MISSINGCHECK (the default) is in effect.
Details

Requirements
• The UPDATE statement must be accompanied by a BY statement that specifies the variables by which observations are matched.
• The BY statement should immediately follow the UPDATE statement to which it applies.
• The data sets listed in the UPDATE statement must be sorted by the values of the variables listed in the BY statement, or they must have an appropriate index.
• Each observation in the master data set should have a unique value of the BY variable or BY variables. If more than one master observation has the same BY value, only the first observation with that value is updated. The transaction data set can contain more than one observation with the same BY value. (Multiple transaction observations are all applied to the master observation before it is written to the output file.)
For more information, see “How to Prepare Your Data Sets” in SAS Language Reference: Concepts.
Transaction Data Sets
Usually, the master data set and the transaction data set contain the same variables. However, to reduce processing time, you can create a transaction data set that contains only those variables that are being updated. The transaction data set can also contain new variables to be added to the output data set.
The output data set contains one observation for each observation in the master data set. If any transaction observations do not match master observations, they become new observations in the output data set. Observations that are not to be updated can be omitted from the transaction data set. See “Reading, Combining, and Modifying SAS Data Sets” in SAS Language Reference: Concepts.

Missing Values
By default the UPDATEMODE=MISSINGCHECK option is in effect, so missing values in the transaction data set do not replace existing values in the master data set. Therefore, if you want to update some but not all variables and if the variables you want to update differ from one observation to the next, set to missing those variables that are not changing. If you want missing values in the transaction data set to replace existing values in the master data set, use UPDATEMODE=NOMISSINGCHECK. Even when UPDATEMODE=MISSINGCHECK is in effect, you can replace existing values with missing values by using special missing value characters in the transaction data set. To create the transaction data set, use the MISSING statement in the DATA step. If you define one of the special missing values A through Z for the transaction data set, SAS updates numeric variables in the master data set to that value. If you want the resulting value in the master data set to be a regular missing value, use a single underscore (_) to represent missing values in the transaction data set. The resulting value in the master data set will be a period (.) for missing numeric values and a blank for missing character values. For more information about defining and using special missing value characters, see “MISSING Statement” on page 1632.
Comparisons
• Both UPDATE and MERGE can update observations in a SAS data set.
• MERGE automatically replaces existing values in the first data set with missing values in the second data set. UPDATE, however, does not do so by default. To cause UPDATE to overwrite existing values in the master data set with missing ones in the transaction data set, you must use UPDATEMODE=NOMISSINGCHECK.
• UPDATE changes or updates the values of selected observations in a master file by applying transactions. UPDATE can also add new observations.
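The effect of UPDATEMODE= on ordinary missing values can be sketched with two small data sets. The data set names (MASTER, TRANS, KEEP_VALUES, ALLOW_MISSING) and the values are hypothetical and are used only for illustration.

```sas
/* Hypothetical master and transaction data, already in ID order */
data master;
   input id salary;
   datalines;
1 100
2 200
;

data trans;
   input id salary;
   datalines;
1 .
2 250
;

/* Default (MISSINGCHECK): the missing SALARY for ID 1 is ignored, */
/* so the master value 100 is kept                                 */
data keep_values;
   update master trans;
   by id;
run;

/* NOMISSINGCHECK: the missing SALARY for ID 1 replaces 100 */
data allow_missing;
   update master trans updatemode=nomissingcheck;
   by id;
run;
```

In both output data sets, the salary for ID 2 becomes 250; only the treatment of the missing transaction value for ID 1 differs.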
Examples

Example 1: Basic Updating
These program statements create a new data set (OHIO.QTR1) by applying transactions to a master data set (OHIO.JAN). The BY variable STORE must appear in both OHIO.JAN and OHIO.WEEK4, and its values in the master data set should be unique:

data ohio.qtr1;
   update ohio.jan ohio.week4;
   by store;
run;
Example 2: Updating By Renaming Variables
This example shows renaming a variable in the FITNESS data set so that it will not overwrite the value of the same variable in the program data vector. Also, the WEIGHT variable is renamed in each data set and a new WEIGHT variable is calculated. The master data set and the transaction data set are listed before the code that performs the update:

Master Data Set HEALTH

OBS    ID      NAME     TEAM      WEIGHT
1      1114    sally    blue      125
2      1441    sue      green     145
3      1750    joey     red       189
4      1994    mark     yellow    165
5      2304    joe      red       170

Transaction Data Set FITNESS

OBS    ID      NAME     TEAM      WEIGHT
1      1114    sally    blue      119
2      1994    mark     yellow    174
3      2304    joe      red       170
options nodate pageno=1 linesize=80 pagesize=60;

/* Sort both data sets by ID */
proc sort data=health;
   by id;
run;

proc sort data=fitness;
   by id;
run;

/* Update Master with Transaction */
data health2;
   length STATUS $11;
   update health(rename=(weight=ORIG) in=a)
          fitness(drop=name team in=b);
   by id;
   if a and b then
      do;
         CHANGE=abs(orig - weight);
         if weight<orig then status='loss';
         else if weight>orig then status='gain';
         else status='same';
      end;
   else status='no weigh in';
run;

options nodate ls=78;

proc print data=health2;
   title 'Weekly Weigh-in Report';
run;
Output 6.33 Updating By Renaming Variables

Weekly Weigh-in Report                                                    1

OBS    STATUS         ID      NAME     TEAM      ORIG    WEIGHT    CHANGE
1      loss           1114    sally    blue      125     119       6
2      no weigh in    1441    sue      green     145     .         .
3      no weigh in    1750    joey     red       189     .         .
4      gain           1994    mark     yellow    165     174       9
5      same           2304    joe      red       170     170       0
Example 3: Updating with Missing Values
This example illustrates the DATA steps used to create a master data set PAYROLL and a transaction data set INCREASE that contains regular and special missing values:

options nodate pageno=1 linesize=80 pagesize=60;

/* Create the Master Data Set */
data payroll;
   input ID SALARY;
   datalines;
1011 245
1026 269
1028 374
1034 333
1057 582
;

/* Create the Transaction Data Set */
data increase;
   input ID SALARY;
   missing A _;
   datalines;
1011 376
1026 .
1028 374
1034 A
1057 _
;

/* Update Master with Transaction */
data newpay;
   update payroll increase;
   by id;
run;

proc print data=newpay;
   title 'Updating with Missing Values';
run;
Output 6.34 Updating With Missing Values

Updating with Missing Values                                              1

OBS    ID      SALARY
1      1011    376
2      1026    269
3      1028    374
4      1034    A
5      1057    .
=’01jan1999’d and time>=’9:00’t; 3 where state=’Mississippi’;
As in other SAS expressions, the names of numeric variables can stand alone. SAS treats values of 0 or missing as false; other values are true. These examples are WHERE expressions that contain the numeric variables EMPNUM and SSN:
• where empnum;
• where empnum and ssn;
Character literals or the names of character variables can also stand alone in WHERE expressions. If you use the name of a character variable by itself as a WHERE expression, SAS selects observations where the value of the character variable is not blank.
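As a minimal sketch of a numeric variable standing alone in a WHERE expression, the following step uses a hypothetical data set STAFF whose EMPNUM values are nonzero, zero, and missing. The data set and its values are assumptions made for illustration.

```sas
/* Hypothetical data: EMPNUM is nonzero, zero, and missing in turn */
data staff;
   input name $ empnum;
   datalines;
Ann 101
Bob 0
Cy .
;

/* A bare numeric variable is true only when it is neither 0 nor missing, */
/* so only the observation for Ann is selected                            */
proc print data=staff;
   where empnum;
run;
```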
Operators Used in the WHERE Expression
You can include both SAS operators and special WHERE-expression operators in the WHERE statement. For a complete list of the operators, see Table 6.12 on page 1741. For the rules SAS follows when it evaluates WHERE expressions, see “WHERE-Expression Processing” in SAS Language Reference: Concepts.

Table 6.12 WHERE Statement Operators

Operator Type    Symbol or Mnemonic     Description
Arithmetic       *                      multiplication
                 /                      division
                 +                      addition
                 −                      subtraction
                 **                     exponentiation
Comparison       = or EQ                equal to
                 ^=, ¬=, ~=, or NE      not equal to
                 > or GT                greater than
                 < or LT                less than
                 >= or GE               greater than or equal to
1991; ...more SAS statements... where same and year1991 and year
underscore (_)      ?-
vertical bar (|)    ?/
Examples
This statement produces the output [TEST TITLE]:

title '?<TEST TITLE?>';
CLEANUP System Option
For an out-of-resource condition, specifies whether to perform an automatic cleanup or a user-specified cleanup.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
See: CLEANUP System Option in the documentation for your operating environment.

Syntax
CLEANUP | NOCLEANUP
Syntax Description

CLEANUP
   specifies that during the entire session, SAS attempts to perform automatic, continuous clean-up of resources that are not essential for execution. Nonessential resources include resources that are not visible to the user (for example, cache memory) and resources that are visible to the user (for example, the KEYS windows).
   When CLEANUP is in effect and an out-of-resource condition occurs (except for a disk-full condition), a dialog box is not displayed, and no intervention is required by the user. When CLEANUP is in effect and a disk-full condition occurs, a dialog box displays that allows the user to decide how to proceed.

NOCLEANUP
   specifies that SAS allow the user to choose how to handle an out-of-resource condition. When NOCLEANUP is in effect and SAS cannot execute because of a lack of resources, SAS automatically attempts to clean up resources that are not visible to the user (for example, cache memory). However, resources that are visible to the user (for example, windows) are not automatically cleaned up. Instead, a dialog box appears that allows the user to choose how to proceed.
Details
This table lists the dialog box choices:

Dialog Box Choice
   Action

Free windows
   clears all windows not essential for execution.
Clear paste buffers
   deletes paste buffer contents.
Deassign inactive librefs
   prompts user for librefs to delete.
Delete definitions of all SAS macros and macro variables
   deletes all macro definitions and variables.
Delete SAS files
   allows user to select files to delete.
Clear Log window
   erases Log window contents.
Clear Output window
   erases Output window contents.
Clear Program Editor window
   erases Program Editor window contents.
Clear source spooling/DMS recall buffers
   erases recall buffers.
More items to clean up
   displays a list of other resources that can be cleaned up.
Clean up everything
   cleans up all other options that are shown on the requestor window. This selection only applies to the current clean-up request, not to the entire SAS session.
Continuous clean up
   performs automatic, continuous clean-up. When continuous clean up is selected, SAS cleans up as many resources as possible in order to continue execution, and it ceases to display the requestor window. Selecting continuous clean-up has the same effect as specifying CLEANUP. This selection applies to the current clean-up request and to the remainder of the SAS session.

Operating Environment Information: Some operating environments might also include these choices in the dialog box:

Dialog Box Choice
   Action

Execute X command
   enables the user to erase files and perform other clean-up operations.
Do nothing
   halts the clean-up request and returns to the SAS session. This selection only applies to the current clean-up request, not to the entire SAS session.

If an out-of-resource condition cannot be resolved, the dialog box continues to display. In that case, see the SAS documentation for your operating environment for instructions on terminating the SAS session.
When running in modes other than a windowing environment, the operation of CLEANUP depends on your operating environment. For details, see the SAS documentation for your operating environment.
CMPLIB= System Option
Specifies one or more SAS data sets that contain compiler subroutines to include during program compilation.
Valid in: configuration file, SAS invocation, OPTIONS statement, System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES

Syntax
CMPLIB=libref.data-set | (libref.data-set-1 … libref.data-set-n) | (libref.data-set-n – libref.data-set-m)
Syntax Description

libref.data-set
   specifies the libref and the data set of the compiler subroutines that are to be included during program compilation. The libref and data-set must be valid SAS names.

libref.data-set-n – libref.data-set-m
   specifies a range of compiler subroutines that are to be included during program compilation. The name of the libref and the data set must be valid SAS names that contain a numeric suffix.
Details
SAS procedures, DATA steps, and macro programs that perform non-linear statistical modeling or optimization use a SAS language compiler subsystem that compiles and executes your SAS programs. The compiler subsystem generates machine language code for the computer on which SAS is running. The SAS procedures that use the SAS language compiler are CALIS, COMPILE, GA, GENMOD, MODEL, NLIN, NLMIXED, NLP, PHREG, Risk Dimensions procedures, and SQL.
The subroutines that you want to include must already have been compiled. All the subroutines in libref.data-set are included.
You can specify a single libref.data-set, a list of libref.data-set names, or a range of libref.data-set names with numeric suffixes. When you specify more than one libref.data-set name, separate the names with a space and enclose the names in parentheses.
After SAS starts, you can use the APPEND or INSERT system options to add additional data sets.
Examples

Number of Libraries      OPTIONS Statement
One library              options cmplib=sasuser.cmpl;
Two or more libraries    options cmplib=(sasuser.cmpl sasuser.cmplA sasuser.cmpl3);
A range of libraries     options cmplib=(sasuser.cmpl1 - sasuser.cmpl6);
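The following sketch shows one way a compiled subroutine might reach a program through CMPLIB=. It assumes that the subroutine is compiled with the FCMP procedure, which this section does not itself name. The data set WORK.FUNCS, the package name DEMO, and the function DOUBLE_IT are hypothetical.

```sas
/* Compile a hypothetical function into the data set WORK.FUNCS */
proc fcmp outlib=work.funcs.demo;
   function double_it(x);
      return (2 * x);
   endsub;
run;

/* Tell SAS where to look for compiled subroutines */
options cmplib=work.funcs;

/* Call the function; the PUT statement writes y=42 to the log */
data _null_;
   y = double_it(21);
   put y=;
run;
```

Note that OUTLIB= takes a three-level name (library, data set, package), whereas CMPLIB= names only the library and the data set.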
See Also
System options:
“APPEND= System Option” on page 1789
“INSERT= System Option” on page 1871
CMPMODEL= System Option
Specifies the output model type for the MODEL procedure.
Valid in: configuration file, SAS invocation, OPTIONS statement, System Options window
Category: System administration: Performance
PROC OPTIONS GROUP= PERFORMANCE

Syntax
CMPMODEL=BOTH | CATALOG | XML
Syntax Description

BOTH
   specifies that the MODEL procedure create two output types for a model, one as a SAS catalog entry and the other as an XML file. This is the default.

CATALOG
   specifies that the output model type is an entry in a SAS catalog.

XML
   specifies that the output model type is an XML file.

See Also
The MODEL Procedure in SAS/ETS User’s Guide
CMPOPT= System Option
Specifies the type of code generation optimizations to use in the SAS language compiler.
Valid in: configuration file, SAS invocation, OPTIONS statement, System Options window
Category: System administration: Performance
PROC OPTIONS GROUP= PERFORMANCE

Syntax
CMPOPT=optimization-value | (optimization-value-1 ... optimization-value-n) | "optimization-value-1 ... optimization-value-n" | ALL | NONE
NOCMPOPT
Syntax Description

optimization-value
   specifies the type of optimization that the SAS compiler is to use. Valid values are

   EXTRAMATH | NOEXTRAMATH
      specifies whether to keep or remove mathematical operations that do not affect the outcome of a statement. When you specify EXTRAMATH, the compiler retains the extra mathematical operations. When you specify NOEXTRAMATH, the extra mathematical operations are removed.

   FUNCDIFFERENCING | NOFUNCDIFFERENCING
      specifies whether analytic derivatives are computed for user-defined functions. When you specify NOFUNCDIFFERENCING, analytic derivatives are computed for user-defined functions. When you specify FUNCDIFFERENCING, numeric differencing is used to calculate derivatives for user-defined functions. The default is NOFUNCDIFFERENCING.

   GUARDCHECK | NOGUARDCHECK
      specifies whether to check for array boundary problems. When you specify GUARDCHECK, the compiler checks for array boundary problems. When you specify NOGUARDCHECK, the compiler does not check for array boundary problems.
      Interaction: NOGUARDCHECK is set when CMPOPT is set to ALL and when CMPOPT is set to NONE.

   MISSCHECK | NOMISSCHECK
      specifies whether to check for missing values in the data. If the data contains a significant amount of missing data, then you can optimize the compilation by specifying MISSCHECK. If the data rarely contains missing values, then you can optimize the compilation by specifying NOMISSCHECK.

   PRECISE | NOPRECISE
      specifies whether to handle exceptions at an operation boundary or at a statement boundary. When you specify PRECISE, exceptions are handled at the operation boundary. When you specify NOPRECISE, exceptions are handled at the statement boundary.

   Tip: EXTRAMATH, MISSCHECK, PRECISE, GUARDCHECK, and FUNCDIFFERENCING can be specified in any combination when you specify one or more values.

ALL
   specifies that the compiler is to optimize the machine language code by using the (NOEXTRAMATH NOMISSCHECK NOPRECISE NOGUARDCHECK NOFUNCDIFFERENCING) optimization values. This is the default.
   Restriction: ALL cannot be specified in combination with any other values.

NONE
   specifies that the compiler is not set to optimize the machine language code by using the (EXTRAMATH MISSCHECK PRECISE NOGUARDCHECK FUNCDIFFERENCING) optimization values.
   Restriction: NONE cannot be specified in combination with any other values.

NOCMPOPT
   specifies to set the value of CMPOPT to ALL. The compiler is to optimize the machine language code by using the (NOEXTRAMATH NOMISSCHECK NOPRECISE NOGUARDCHECK NOFUNCDIFFERENCING) optimization values.
   Restriction: NOCMPOPT cannot be specified in combination with values for the CMPOPT option.
Details
SAS procedures that perform non-linear statistical modeling or optimization employ a SAS language compiler subsystem that compiles and executes your SAS programs. The compiler subsystem generates machine language code for the computer on which SAS is running. By specifying values with the CMPOPT option, the machine language code can be optimized for efficient execution. The SAS procedures that use the SAS language compiler are CALIS, COMPILE, GENMOD, MODEL, PHREG, NLIN, NLMIXED, NLP, RISK, and SYLK.
To specify multiple optimization values, the values must be enclosed in either parentheses, single quotation marks, or double quotation marks. When CMPOPT is set to multiple values, the parentheses or quotation marks are retained as part of the value. They are not retained as part of the value when CMPOPT is set to a single value.
If a value is entered more than once, then the last setting is used. For example, if you specify CMPOPT=(PRECISE NOEXTRAMATH NOPRECISE), then the values that are set are NOEXTRAMATH and NOPRECISE. All leading, trailing, and embedded blanks are removed.
When you specify EXTRAMATH or NOEXTRAMATH, some of the mathematical operations that are either included or excluded in the machine language code are

x * 1      x * –1     x ÷ 1      x ÷ –1     x + 0
x – 0      x – x      x ÷ x      – –x       any operation on two literal constants
Examples

OPTIONS Statement                                 Result
options cmpopt=(extramath);                       extramath
options cmpopt="extramath misscheck precise";     "precise extramath extramath"
options nocmpopt;                                 (noextramath nomisscheck noprecise noguardcheck nofuncdifferencing)
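To see how a CMPOPT= setting is stored, you can display the option after setting it. The following is a minimal sketch; the combination of values is chosen only for illustration, and the exact text written to the log can differ by release.

```sas
/* Set two optimization values; the parentheses are needed for a list */
options cmpopt=(noextramath precise);

/* Write the current CMPOPT= value to the SAS log */
proc options option=cmpopt;
run;
```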
COLLATE System Option
Specifies whether to collate multiple copies of printed output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT

Syntax
COLLATE | NOCOLLATE
Syntax Description

COLLATE
   specifies to collate multiple copies of printed output.

NOCOLLATE
   specifies not to collate multiple copies of printed output. This is the default.
Details
When you send a print job to the printer and you want multiple copies of multiple pages, the COLLATE option controls how the pages are ordered:
• COLLATE causes the pages to print consecutively: 123, 123, 123...
• NOCOLLATE causes the same-numbered pages to print together: 111, 222, 333...
Note: You can also control collation with the SAS windowing environment Page Setup window, invoked with the DMPAGESETUP command.
Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings and option values for some SAS system options might vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
For additional information on declaring an ODS printer destination, see ODS statements in SAS Output Delivery System: User’s Guide. For additional information on the SAS universal print facility, see “Printing with SAS” in SAS Language Reference: Concepts.
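A minimal sketch of requesting collated copies follows. It assumes that a default printer is available to the ODS PRINTER destination and uses the SASHELP.CLASS sample data set; whether the request is honored depends on the printer and the operating environment.

```sas
/* Request three copies, printed as complete sets: 123, 123, 123 */
options copies=3 collate;

ods printer;                     /* open the ODS printer destination */
proc print data=sashelp.class;
run;
ods printer close;               /* send the job to the printer */

/* Restore the default page ordering */
options nocollate;
```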
See Also
System Option: “COPIES= System Option” on page 1818
COLORPRINTING System Option
Specifies whether to print in color if color printing is supported.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT

Syntax
COLORPRINTING | NOCOLORPRINTING
Syntax Description

COLORPRINTING
   specifies to attempt to print in color.

NOCOLORPRINTING
   specifies not to print in color.
Details
Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings and option values for some SAS system options might vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
For additional information on declaring an ODS printer destination, see ODS statements in SAS Output Delivery System: User’s Guide. For additional information on the SAS universal print facility, see “Printing with SAS” in SAS Language Reference: Concepts.
COMPRESS= System Option
Specifies the type of compression of observations to use for output SAS data sets.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
          System administration: Performance
PROC OPTIONS GROUP= SASFILES
                    PERFORMANCE
Restriction: The TAPE engine does not support the COMPRESS= system option.

Syntax
COMPRESS=NO | YES | CHAR | BINARY
Syntax Description

NO
   specifies that the observations in a newly created SAS data set are uncompressed (fixed-length records).
   Alias: N | OFF

YES | CHAR
   specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RLE (Run Length Encoding). RLE compresses observations by reducing repeated consecutive characters (including blanks) to two-byte or three-byte representations.
   Alias: Y | ON
   Tip: Use this compression algorithm for character data.
   Note: COMPRESS=CHAR is accepted by Version 7 and later versions.

BINARY
   specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RDC (Ross Data Compression). RDC combines run-length encoding and sliding-window compression to compress the file.
   Tip: This method is highly effective for compressing medium to large (several hundred bytes or larger) blocks of binary data (numeric variables). Because the compression function operates on a single record at a time, the record length needs to be several hundred bytes or larger for effective compression.

Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
Details
Compressing a file is a process that reduces the number of bytes required to represent each observation. Advantages of compressing a file include reduced storage requirements for the file and fewer I/O operations necessary to read or write to the data during processing. However, more CPU resources are required to read a compressed file (because of the overhead of uncompressing each observation), and there are situations when the resulting file size might increase rather than decrease.
Use the COMPRESS= system option to compress all output data sets that are created during a SAS session. Use the option only when you are creating SAS data files (member type DATA). You cannot compress SAS views, because they contain no data.
Once a file is compressed, the setting is a permanent attribute of the file, which means that to change the setting, you must re-create the file. That is, to uncompress a file, specify COMPRESS=NO for a DATA step that copies the compressed file.
Note: For the COPY procedure, the default value CLONE uses the compression attribute from the input data set for the output data set. If the engine for the input data set does not support the compression attribute, then PROC COPY uses the current value of the COMPRESS= system option. For more information about CLONE and NOCLONE, see COPY statement in the DATASETS procedure in the Base SAS Procedures Guide. This interaction does not apply when using SAS/SHARE or SAS/CONNECT.
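The following sketch illustrates the session-wide setting and the re-creation step described above. The data set names WORK.NOTES_C and WORK.NOTES_U and their contents are hypothetical.

```sas
/* Compress every data set created from this point in the session */
options compress=yes;

/* Hypothetical data: a long character variable that is mostly blank */
data work.notes_c;
   length text $200;
   do i = 1 to 1000;
      text = 'short value';
      output;
   end;
run;

/* Compression is a permanent attribute, so to uncompress the data */
/* you re-create the file with COMPRESS=NO                         */
data work.notes_u(compress=no);
   set work.notes_c;
run;

/* Return to uncompressed output for the rest of the session */
options compress=no;
```

When a compressed data set is created, SAS writes a note to the log that reports the change in size, which is a quick way to check whether compression helped for a particular file.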
Comparisons
The COMPRESS= system option can be overridden by the COMPRESS= option on the LIBNAME statement and the COMPRESS= data set option.
The data set option POINTOBS=YES, which is the default, determines that a compressed data set can be processed with random access (by observation number) rather than sequential access. With random access, you can specify an observation number in the FSEDIT procedure and the POINT= option in the SET and MODIFY statements.
When you create a compressed file, you can also specify REUSE=YES (as a data set option or system option) in order to track and reuse space. With REUSE=YES, new observations are inserted in space freed when other observations are updated or deleted. When the default REUSE=NO is in effect, new observations are appended to the existing file.
POINTOBS=YES and REUSE=YES are mutually exclusive; that is, they cannot be used together. REUSE=YES takes precedence over POINTOBS=YES; that is, if you set REUSE=YES, SAS automatically sets POINTOBS=NO.
The TAPE engine does not support the COMPRESS= system option, but the engine does support the COMPRESS= data set option.
The XPORT engine does not support compression.
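As a sketch of the override and of the REUSE= interaction, the following step leaves the system option at NO and requests compression for one data set only. The data set WORK.READINGS and its variables are hypothetical.

```sas
/* The system option says NO, but the data set option takes precedence */
options compress=no;

/* Hypothetical numeric data: 50 variables per observation */
data work.readings(compress=binary reuse=yes);
   array v{50} v1-v50;
   do obs_id = 1 to 100;
      do j = 1 to 50;
         v{j} = obs_id * j;
      end;
      output;
   end;
   drop j;
run;

/* PROC CONTENTS reports the compression, reuse-space, and */
/* point-to-observations attributes of the new file        */
proc contents data=work.readings;
run;
```

Because REUSE=YES is specified, the text above implies that POINTOBS=NO is in effect for this data set, so it would not be a candidate for POINT= processing.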
See Also
Data Set Options:
“COMPRESS= Data Set Option” on page 19
“POINTOBS= Data Set Option” on page 47
“REUSE= Data Set Option” on page 55
Statements: “LIBNAME Statement” on page 1606
System Option: “REUSE= System Option” on page 1925
“Compressing Data Files” in SAS Language Reference: Concepts
COPIES= System Option
Specifies the number of copies to print.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT

Syntax
COPIES=n
Syntax Description

n
   specifies the number of copies.

Operating Environment Information: Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings and option values for some SAS system options might vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
For additional information on declaring an ODS printer destination, see ODS statements in SAS Output Delivery System: User’s Guide. For additional information on the SAS universal print facility, see “Printing with SAS” in SAS Language Reference: Concepts.
See Also
System Option: “COLLATE System Option” on page 1814
CPUCOUNT= System Option
Specifies the number of processors that the thread-enabled applications should assume will be available for concurrent processing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: System administration: Performance
PROC OPTIONS GROUP= PERFORMANCE
Default: Under Windows, OpenVMS, and z/OS, the default is ACTUAL. Under UNIX, the default is either ACTUAL or 4 for systems that have more than four processors.
Interaction: If the THREADS system option is set to NOTHREADS, the CPUCOUNT= option has no effect.

Syntax
CPUCOUNT= 1 - 1024 | ACTUAL
Syntax Description

1-1024
   is the number of CPUs that SAS will assume are available for use by thread-enabled applications.
   Tip: The value is typically set to the actual number of CPUs available to the current process by your configuration.
   Tip: Setting CPUCOUNT= to a number greater than the actual number of available CPUs might result in reduced overall performance of SAS.

ACTUAL
   returns the number of physical processors that are associated with the operating system where SAS is executing. If the operating system is executing in a partition, the value of the CPUCOUNT= system option is the number of physical processors that are associated with the operating system in that partition.
   Tip: This number can be less than the physical number of CPUs if the SAS process has been restricted by system administration tools.
   Tip: Setting CPUCOUNT= to ACTUAL at any time causes the option to be reset to the number of physical processors that are associated with the operating system at that time. If the operating system is executing in a partition, the value of the CPUCOUNT= system option is the number of physical processors that are associated with the operating system in that partition.
   Tip: If your system supports Simultaneous Multi-Threading (SMT), hyperthreading, or Chip Multi-Threading (CMT), the value of the CPUCOUNT= option represents the number of such threads on the system.
Details
Certain procedures have been modified to take advantage of multiple CPUs by threading the procedure processing. The Base SAS engine also uses threading to create an index. The CPUCOUNT= option provides the information that is needed to make decisions about the allocation of threads.
Changing the value of CPUCOUNT= affects the degree of parallelism each thread-enabled process attempts to achieve. Setting CPUCOUNT= to a number greater than the actual number of available CPUs might result in reduced overall performance of SAS.
Comparisons When the related system option THREADS is in effect, threading will be active where available. The value of the CPUCOUNT= option affects the performance of THREADS by suggesting how many system CPUs are available for use by thread-enabled SAS procedures.
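As a minimal sketch of how THREADS and CPUCOUNT= might be set together in an OPTIONS statement, the following statements enable threading, set an assumed processor count, and then reset the count to the value that SAS detects. The value 4 is only an illustrative assumption, not a recommendation.

```sas
/* Enable threading and tell thread-enabled procedures to assume 4 CPUs. */
/* The value 4 is an illustrative assumption.                            */
options threads cpucount=4;

/* Write the current setting to the SAS log. */
%put NOTE: CPUCOUNT is now %sysfunc(getoption(cpucount));

/* Reset to the number of processors that the operating system reports. */
options cpucount=actual;
%put NOTE: CPUCOUNT is now %sysfunc(getoption(cpucount));

/* List the options in the PERFORMANCE group, which includes CPUCOUNT=. */
proc options group=performance;
run;
```

Checking the value with GETOPTION after setting ACTUAL is one way to see what processor count SAS will assume on a given host.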
See Also
System Options:
“THREADS System Option” on page 1978
“UTILLOC= System Option” on page 1984
“Support for Parallel Processing” in SAS Language Reference: Concepts
CPUID System Option
Specifies whether the CPU identification number is written to the SAS log.
Valid in: configuration file, SAS invocation
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
Syntax
CPUID | NOCPUID
Syntax Description
CPUID
specifies that the CPU identification number is printed at the top of the SAS log after the licensing information.
NOCPUID
specifies that the CPU identification number is not written to the SAS log.
See Also
“The SAS Log” in SAS Language Reference: Concepts
DATASTMTCHK= System Option
Specifies which SAS statement keywords are prohibited from being specified as a one-level DATA step name to protect against overwriting an input data set.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax
DATASTMTCHK=COREKEYWORDS | ALLKEYWORDS | NONE
Syntax Description
COREKEYWORDS
prohibits certain words as one-level SAS data set names in the DATA statement. They can appear as two-level names. The following keywords cannot appear as one-level SAS data set names: MERGE, RETAIN, SET, UPDATE. For example, SET is not acceptable in the DATA statement, but SAVE.SET and WORK.SET are acceptable. COREKEYWORDS is the default.
ALLKEYWORDS
prohibits any keyword that can begin a statement in the DATA step (for example, ABORT, ARRAY, INFILE) as a one-level data set name in the DATA statement.
NONE
provides no protection against overwriting SAS data sets.
Details
If you omit a semicolon on the DATA statement, you can overwrite an input data set if the next statement is SET, MERGE, or UPDATE. Different, but significant, problems arise when the next statement is RETAIN. DATASTMTCHK= enables you to protect yourself against overwriting the input data set.
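The kind of mistake that this option guards against can be sketched as follows. The data set names SCORES and SUMMARY are hypothetical. Because the semicolon after SUMMARY is missing, the DATA statement is read as naming three one-level data sets (SUMMARY, SET, and SCORES). With COREKEYWORDS in effect, SET is not accepted as a one-level data set name, so the step should stop with an error rather than replace SCORES.

```sas
options datastmtchk=corekeywords;

/* Hypothetical input data set. */
data scores;
   x = 1;
run;

/* The semicolon after SUMMARY is missing, so SAS reads */
/* "data summary set scores;" as one DATA statement.    */
data summary
   set scores;
run;
```

With DATASTMTCHK=NONE, the same step would treat SET and SCORES as output data sets, which is how the input data set can be overwritten.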
DATE System Option
Specifies whether to print the date and time that a SAS program started.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log and procedure output
Log and procedure output control: SAS log
Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LOG_LISTCONTROL
LISTCONTROL
LOGCONTROL
Syntax
DATE | NODATE
Syntax Description
DATE
specifies that the date and the time that the SAS program started are printed at the top of each page of the SAS log and any output that is created by SAS.
Note: In an interactive SAS session, the date and time are noted only in the output window.
NODATE
specifies that the date and the time are not printed.
See Also
“The SAS Log” in SAS Language Reference: Concepts
DATESTYLE= System Option
Specifies the sequence of month, day, and year when ANYDTDTE, ANYDTDTM, or ANYDTTME informat data is ambiguous.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Language control
Input control: Data Processing
PROC OPTIONS GROUP= INPUTCONTROL
LANGUAGECONTROL
Syntax
DATESTYLE= MDY | MYD | YMD | YDM | DMY | DYM | LOCALE
Syntax Description
MDY
specifies that SAS set the order as month, day, year.
MYD
specifies that SAS set the order as month, year, day.
YMD
specifies that SAS set the order as year, month, day.
YDM
specifies that SAS set the order as year, day, month.
DMY
specifies that SAS set the order as day, month, year.
DYM
specifies that SAS set the order as day, year, month.
LOCALE
specifies that SAS set the order based on the value that corresponds to the LOCALE= system option value and is one of the following: MDY | MYD | YMD | YDM | DMY | DYM.
Details
The DATESTYLE= system option identifies the order of month, day, and year. The default value is LOCALE. The default LOCALE= system option value is English; therefore, the default DATESTYLE= order is MDY.
Operating Environment Information: See “Locale Values” in SAS National Language Support (NLS): Reference Guide to get the default settings for each locale option value.
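As a brief illustration of how the setting changes the interpretation of an ambiguous value, the following sketch reads the same string twice with the ANYDTDTE informat. The string 01/02/2009 is an arbitrary example value.

```sas
/* With MDY, the ambiguous value is read as month 01, day 02. */
options datestyle=mdy;
data _null_;
   d = input('01/02/2009', anydtdte10.);
   put 'MDY: ' d date9.;   /* expected to be read as 02JAN2009 */
run;

/* With DMY, the same value is read as day 01, month 02. */
options datestyle=dmy;
data _null_;
   d = input('01/02/2009', anydtdte10.);
   put 'DMY: ' d date9.;   /* expected to be read as 01FEB2009 */
run;

/* Return to the locale-based order. */
options datestyle=locale;
```

A value such as 25/02/2009 is not ambiguous, because 25 cannot be a month, so DATESTYLE= matters only when more than one ordering is plausible.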
See Also
System Option:
“LOCALE System Option: UNIX, Windows, OpenVMS, and z/OS” in SAS National Language Support (NLS): Reference Guide
Informats:
“ANYDTDTEw. Informat” on page 1257
“ANYDTDTMw. Informat” on page 1259
“ANYDTTMEw. Informat” on page 1262
DEFLATION= System Option
Specifies the level of compression for device drivers that support the Deflate compression algorithm.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Alias: DEFLATE
Requirement: The UPRINTCOMPRESSION system option must be set in order to compress files.
Syntax
DEFLATION=n | MIN | MAX
Syntax Description
n
specifies the level of compression. The larger the number, the greater the compression. For example, n=0 is the minimum compression level (completely uncompressed), and n=9 is the maximum compression level.
Default: 6
Range: 0–9
MIN
specifies the minimum compression level of 0.
MAX
specifies the maximum compression level of 9.
Details
The DEFLATION= system option controls the level of compression for device drivers that support Deflate compression. The PRINTERPATH= system option must be set to one of the following SAS device drivers that support Deflate compression: the PDF device driver or the SVG Universal Printer drivers. The ODS PRINTER statement option, COMPRESS=, takes precedence over the DEFLATION= system option.
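The options described above might be combined as in the following sketch, which assumes that a Universal Printer named PDF is available and uses a hypothetical output file name, report.pdf.

```sas
/* Route Universal Printing output to the PDF device driver, */
/* enable compression, and request the maximum Deflate level. */
options printerpath=pdf uprintcompression deflation=max;

ods printer file='report.pdf';   /* hypothetical output file name */
proc print data=sashelp.class;
run;
ods printer close;
```

If COMPRESS= were also specified in the ODS PRINTER statement, that value would take precedence over DEFLATION=, as noted in the Details.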
See Also
System options:
“PRINTERPATH= System Option” on page 1920
“UPRINTCOMPRESSION System Option” on page 1982
Statements:
“ODS PRINTER Statement” in the SAS Output Delivery System: User’s Guide
DETAILS System Option
Specifies whether to include additional information when files are listed in a SAS library.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log and procedure output
Log and procedure output control: SAS log
Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LOG_LISTCONTROL
LISTCONTROL
LOGCONTROL
Syntax
DETAILS | NODETAILS
Syntax Description
DETAILS
includes additional information when some SAS procedures and windows display a listing of files in a SAS library.
NODETAILS
does not include additional information.
Details
The DETAILS specification sets the default display for these components of SAS:
- the CONTENTS procedure
- the DATASETS procedure.
The type and amount of additional information that displays depends on which procedure or window you use.
See Also
“The SAS Log” in SAS Language Reference: Concepts
DEVICE= System Option
Specifies the device driver to which SAS/GRAPH sends procedure output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Graphics: Driver settings
PROC OPTIONS GROUP= GRAPHICS
Alias: DEV=
See: DEVICE= System Option in the documentation for your operating environment.
Syntax
DEVICE=device-driver-specification
Syntax Description
device-driver-specification
specifies the name of a device driver.
Details
If you omit the device-driver name, you are prompted to enter a driver name when you execute a procedure that produces graphics.
Operating Environment Information: The syntax that is shown applies to the OPTIONS statement. However, when you specify DEVICE= either on the command line or in a configuration file, the syntax is specific to your operating environment and might include additional or alternate punctuation.
See Also
“Device Drivers” in SAS/GRAPH: Reference
DKRICOND= System Option
Specifies the level of error detection to report when a variable is missing from an input data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax
DKRICOND=ERROR | WARN | WARNING | NOWARN | NOWARNING
Syntax Description
ERROR
sets the error flag and writes an error message to the SAS log when a variable is missing from an input data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
WARN | WARNING
writes a warning message to the SAS log when a variable is missing from an input data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
NOWARN | NOWARNING
does not write a warning message to the SAS log when a variable is missing from an input data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
Examples
In the following statements, if the variable X is not in data set B and DKRICOND=ERROR, SAS sets the error flag to 1 and displays error messages:
   data a;
      set b(drop=x);
   run;
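To compare the reporting levels, the same step can be run under different settings. The following is a small sketch under the assumption that data set B contains only a variable Y, so that X is missing on input.

```sas
/* Build an input data set that does not contain X. */
data b;
   y = 1;
run;

/* WARN: a warning is written to the log and the step still runs. */
options dkricond=warn;
data a;
   set b(drop=x);
run;

/* NOWARN: no message about the missing variable is written. */
options dkricond=nowarn;
data a;
   set b(drop=x);
run;

/* Return to the strictest level of checking. */
options dkricond=error;
```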
See Also
System Option:
“DKROCOND= System Option” on page 1827
DKROCOND= System Option
Specifies the level of error detection to report when a variable is missing for an output data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax
DKROCOND=ERROR | WARN | WARNING | NOWARN | NOWARNING
Syntax Description
ERROR
sets the error flag and writes an error message to the SAS log when a variable is missing for an output data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
WARN | WARNING
writes a warning message to the SAS log when a variable is missing for an output data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
NOWARN | NOWARNING
does not write a warning message to the SAS log when a variable is missing for an output data set during the processing of a DROP=, KEEP=, or RENAME= data set option.
Examples
In the following statements, if the variable X is not in data set A and DKROCOND=ERROR, SAS sets the error flag to 1 and displays error messages:
   data a;
      drop x;
   run;
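As a sketch of the least strict setting, the following variation of the example above assigns a variable Y (an assumption added for illustration) and suppresses the message about X.

```sas
/* NOWARN: the reference to the nonexistent variable X is not reported. */
options dkrocond=nowarn;
data a;
   y = 1;
   drop x;
run;

/* WARN: the same step writes a warning to the log instead. */
options dkrocond=warn;
data a;
   y = 1;
   drop x;
run;
```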
See Also
System Option:
“DKRICOND= System Option” on page 1826
DLDMGACTION= System Option
Specifies the type of action to take when a SAS data set or a SAS catalog is detected as damaged.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax
DLDMGACTION=FAIL | ABORT | REPAIR | NOINDEX | PROMPT
Syntax Description
FAIL
stops the step and issues an error message to the log immediately. This is the default for batch mode.
ABORT
terminates the step, issues an error message to the log, and ends the SAS session.
REPAIR
For data files, automatically repairs and rebuilds indexes and integrity constraints, unless the data file is truncated. You use the REPAIR statement to restore the truncated data file. REPAIR issues a warning message to the log. This is the default for interactive mode.
For catalogs, automatically deletes catalog entries for which an error occurs during the repair process.
NOINDEX
For data files, automatically repairs the data file without the indexes and integrity constraints, deletes the index file, updates the data file to reflect the disabled indexes and integrity constraints, and limits the data file to be opened only in INPUT mode. A warning is written to the SAS log instructing you to execute the PROC DATASETS REBUILD statement to correct or delete the disabled indexes and integrity constraints. For more information, see the “REBUILD Statement” in the “DATASETS Procedure” in Base SAS Procedures Guide and “Recovering Disabled Indexes and Integrity Constraints” in SAS Language Reference: Concepts.
Restriction: NOINDEX does not apply to damaged catalogs or libraries, only data files.
PROMPT
For data sets, displays a dialog box where you can specify either FAIL, ABORT, REPAIR, or NOINDEX. For a damaged catalog or library, PROMPT displays a dialog box where you can specify either FAIL, ABORT, or REPAIR.
DMR System Option
Specifies whether to enable SAS to invoke a server session for use with a SAS/CONNECT client.
Valid in: configuration file, SAS invocation
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax
DMR | NODMR
Syntax Description
DMR
enables you to invoke a remote SAS session in order to connect with a SAS/CONNECT client.
NODMR
prevents you from invoking a remote SAS session.
Details
You normally invoke the remote SAS session from a local session by including DMR with the SAS command in a script that contains a TYPE statement. (A script is a text file that contains statements to establish or terminate the SAS/CONNECT link between the local and the remote SAS sessions.)
The following SAS execution mode invocation option has precedence over this option:
- OBJECTSERVER
DMR overrides all other SAS execution mode invocation options. See “Order of Precedence” on page 1774 for more information on invocation option precedence.
See Also
DMR information in SAS/CONNECT User’s Guide
DMS System Option
Specifies whether to invoke the SAS windowing environment and display the Log, Editor, and Output windows.
Valid in: configuration file, SAS invocation
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax
DMS | NODMS
Syntax Description
DMS
invokes the SAS windowing environment and displays the Log, Editor, and Output windows.
NODMS
invokes an interactive line mode SAS session.
Details
When you invoke SAS and you are using a configuration file or the command line to control your system option settings, it is possible to create a situation where some system option settings conflict with other system option settings. The following invocation system options, in order, have precedence over the DMS invocation system option:
1. OBJECTSERVER
2. DMR
3. SYSIN
If you specify DMS while using another invocation option of equal precedence to invoke SAS, SAS uses the last option that is specified. See “Order of Precedence” on page 1774 for more information about invocation option precedence.
See Also
System Options:
“DMR System Option” on page 1829
“DMSEXP System Option” on page 1831
“EXPLORER System Option” on page 1849
DMSEXP System Option
Specifies whether to invoke the SAS windowing environment and display the Explorer, Editor, Log, Output, and Results windows.
Valid in: configuration file, SAS invocation
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax
DMSEXP | NODMSEXP
Syntax Description
DMSEXP
invokes SAS with the Explorer, Editor, Log, Output, and Results windows active.
NODMSEXP
invokes SAS with the Editor, Log, and Output windows active.
Details
In order to set DMSEXP or NODMSEXP, the DMS option must be set. The following SAS execution mode invocation options, in order, have precedence over this option:
1. OBJECTSERVER
2. DMR
3. SYSIN
If you specify DMSEXP with another execution mode invocation option of equal precedence, SAS uses only the last option listed. See “Order of Precedence” on page 1774 for more information on invocation option precedence.
See Also
System Options:
“DMS System Option” on page 1830
“DMR System Option” on page 1829
“EXPLORER System Option” on page 1849
DMSLOGSIZE= System Option
Specifies the maximum number of rows that the SAS Log window can display.
Valid in: configuration file, SAS invocation
Category: Environment control: Display
Log and procedure output control: SAS log
PROC OPTIONS GROUP= ENVDISPLAY
LOGCONTROL
Restriction: This option is valid only in the SAS windowing environment.
Syntax
DMSLOGSIZE= n | nK | hexX | MIN | MAX
Syntax Description
n | nK
specifies the maximum number of rows that can be displayed in the SAS windowing environment Log window in multiples of 1 (n) or 1,024 (nK). For example, a value of 800 specifies 800 rows, and a value of 3K specifies 3,072 rows. Valid values range from 500 to 999999. The default is 99999.
hexX
specifies the maximum number of rows that can be displayed in the SAS windowing environment Log window as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, 2ffx specifies 767 rows and 0A00x specifies 2,560 rows.
MIN
sets the maximum number of rows that can be displayed in the SAS windowing environment Log window to 500.
MAX
sets the maximum number of rows that can be displayed in the SAS windowing environment Log window to 999999.
Details
When the maximum number of rows have been displayed in the Log window, SAS prompts you to either file, print, save, or clear the Log window.
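The row counts quoted for the nK and hexX forms can be checked with a short DATA step. This sketch only reproduces the arithmetic by using SAS hexadecimal numeric constants; it does not set DMSLOGSIZE=, which is valid only in a configuration file or at SAS invocation.

```sas
data _null_;
   rows_3k    = 3 * 1024;   /* 3K    */
   rows_2ffx  = 2ffx;       /* 2ffx  */
   rows_0a00x = 0A00x;      /* 0A00x */
   put rows_3k= rows_2ffx= rows_0a00x=;
   /* writes rows_3k=3072 rows_2ffx=767 rows_0a00x=2560 */
run;
```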
See Also
System Option:
“DMSOUTSIZE= System Option” on page 1833
“The SAS Log” in SAS Language Reference: Concepts
DMSOUTSIZE= System Option
Specifies the maximum number of rows that the SAS Output window can display.
Valid in: configuration file, SAS invocation
Category: Environment control: Display
PROC OPTIONS GROUP= ENVDISPLAY
Restriction: This option is valid only in the SAS windowing environment.
Syntax
DMSOUTSIZE= n | nK | hexX | MIN | MAX
Syntax Description
n | nK
specifies the maximum number of rows that can be displayed in the SAS windowing environment Output window in multiples of 1 (n) or 1,024 (nK). For example, a value of 800 specifies 800 rows, and a value of 3K specifies 3,072 rows. Valid values range from 500 to 999999. The default is 99999.
hexX
specifies the maximum number of rows that can be displayed in the SAS windowing environment Output window as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, 2ffx specifies 767 rows and 0A00x specifies 2,560 rows.
MIN
sets the maximum number of rows that can be displayed in the SAS windowing environment Output window to 500.
MAX
sets the maximum number of rows that can be displayed in the SAS windowing environment Output window to 999999.
Details
When the maximum number of rows have been displayed in the Output window, SAS prompts you to either file, print, save, or clear the Output window.
See Also
System Option:
“DMSLOGSIZE= System Option” on page 1832
DMSPGMLINESIZE= System Option
Specifies the maximum number of characters in a Program Editor line.
Valid in: configuration file, SAS invocation
Category: Environment control: Display
PROC OPTIONS GROUP= ENVDISPLAY
Syntax
DMSPGMLINESIZE= n
Syntax Description
n
specifies the maximum number of characters in a Program Editor line.
Default: 136
Range: 136–960
DMSSYNCHK System Option
In the SAS windowing environment, specifies whether to enable syntax check mode for DATA step and PROC step processing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax
DMSSYNCHK | NODMSSYNCHK
Syntax Description
DMSSYNCHK
enables syntax check mode for statements that are submitted within the SAS windowing environment.
NODMSSYNCHK
does not enable syntax check mode for statements that are submitted within the SAS windowing environment.
Details
If a syntax or semantic error occurs in a DATA step after the DMSSYNCHK option is set, then SAS enters syntax check mode, which remains in effect from the point where SAS encountered the error to the end of the code that was submitted. After SAS enters syntax check mode, all subsequent DATA step statements and PROC step statements are validated.
While in syntax check mode, only limited processing is performed. For a detailed explanation of syntax check mode, see “Syntax Check Mode” in the “Error Processing in SAS” section of SAS Language Reference: Concepts.
CAUTION: Place the OPTIONS statement that enables DMSSYNCHK before the step for which you want it to take effect. If you place the OPTIONS statement inside a step, then DMSSYNCHK will not take effect until the beginning of the next step.
If NODMSSYNCHK is in effect, SAS processes the remaining steps even if an error occurs in the previous step.
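The placement rule in the caution can be sketched as follows, assuming that the code is submitted from the SAS windowing environment. The data set name ONE and the deliberate error are illustrative assumptions.

```sas
/* Enable syntax check mode before the step that it should cover. */
options dmssynchk;

/* The missing semicolon after "x = 1" is a deliberate syntax error. */
data one;
   x = 1
   y = 2;
run;

/* Submitted together with the step above, this step is expected to be */
/* validated with only limited processing, because SAS has entered     */
/* syntax check mode.                                                  */
proc means data=sashelp.class;
run;
```

With NODMSSYNCHK in effect, the PROC MEANS step would be processed normally even though the preceding DATA step failed.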
Comparisons
You use the DMSSYNCHK system option to validate syntax in an interactive session by using the SAS windowing environment. You use the SYNTAXCHECK system option to validate syntax in a non-interactive or batch SAS session. You can use the ERRORCHECK= option to specify the syntax check mode for the LIBNAME statement, the FILENAME statement, the %INCLUDE statement, and the LOCK statement in SAS/SHARE.
See Also
System options:
“ERRORCHECK= System Option” on page 1848
“SYNTAXCHECK System Option” on page 1971
“Error Processing” in SAS Language Reference: Concepts
DSNFERR System Option
When a SAS data set cannot be found, specifies whether SAS issues an error message.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax
DSNFERR | NODSNFERR
Syntax Description
DSNFERR
specifies that SAS issue an error message and stop processing if a reference is made to a SAS data set that does not exist.
NODSNFERR
specifies that SAS ignore the error message and continue processing if a reference is made to a SAS data set that does not exist. The data set reference is treated as if _NULL_ had been specified.
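A minimal sketch of the two settings follows. The data set name WORK.DOES_NOT_EXIST is hypothetical and is assumed not to exist in the session.

```sas
/* NODSNFERR: the missing data set is treated as if _NULL_ had been */
/* specified, so the step continues instead of stopping.            */
options nodsnferr;
data result;
   set work.does_not_exist;
run;

/* DSNFERR: the same reference produces an error message and */
/* processing of the step stops.                              */
options dsnferr;
data result;
   set work.does_not_exist;
run;
```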
Comparisons
- DSNFERR is similar to the BYERR system option, which issues an error message and stops processing if the SORT procedure attempts to sort a _NULL_ data set.
- DSNFERR is similar to the VNFERR system option, which sets the error flag for a missing variable when a _NULL_ data set is used.
See Also
System Options:
“BYERR System Option” on page 1799
“VNFERR System Option” on page 1994
DTRESET System Option
Specifies whether to update the date and time in the SAS log and in the procedure output file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log and procedure output
Log and procedure output control: SAS log
Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LOG_LISTCONTROL
LISTCONTROL
LOGCONTROL
Syntax
DTRESET | NODTRESET
Syntax Description
DTRESET
specifies that SAS update the date and time in the titles of the SAS log and the procedure output file.
NODTRESET
specifies that SAS not update the date and time in the titles of the SAS log and the procedure output file.
Details
The DTRESET system option updates the date and time in the titles of the SAS log and the procedure output file. This update occurs when the page is being written. The smallest time increment that is reflected is minutes.
The DTRESET option is especially helpful in obtaining a more accurate date and time stamp when you run long SAS jobs.
When you use NODTRESET, SAS displays the date and time that the job originally started.
See Also
“The SAS Log” in SAS Language Reference: Concepts
DUPLEX System Option
Specifies whether duplex (two-sided) printing is enabled.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Restriction: This option is ignored if the printer does not support duplex (two-sided) printing.
Syntax
DUPLEX | NODUPLEX
Syntax Description
DUPLEX
specifies that duplex (two-sided) printing is enabled.
Interaction: When DUPLEX is selected, the setting of the BINDING= option determines how the paper is oriented before output is printed on the second side.
NODUPLEX
specifies that duplex (two-sided) printing is not enabled. This is the default.
Details
Note that duplex (two-sided) printing can be used only on printers that support duplex output.
Operating Environment Information: Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings for some SAS system options might vary both by operating environment and by site. Option values might also vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
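The interaction between DUPLEX and BINDING= might be expressed as in the following sketch. The choice of LONGEDGE is an illustrative assumption, and the setting has an effect only if the printer supports duplex output.

```sas
/* Request two-sided printing with the pages bound along the long edge. */
options duplex binding=longedge;

/* Confirm the current settings in the SAS log. */
%put NOTE: DUPLEX setting is %sysfunc(getoption(duplex));
%put NOTE: BINDING setting is %sysfunc(getoption(binding));

/* Return to one-sided printing. */
options noduplex;
```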
See Also
System Option:
“BINDING= System Option” on page 1794
For information about declaring an ODS printer destination, see the ODS PRINTER Statement in SAS Output Delivery System: User’s Guide.
For information about SAS Universal Printing, see Printing with SAS in SAS Language Reference: Concepts.
ECHOAUTO System Option
Specifies whether the statements in the autoexec file are written to the SAS log as they are executed.
Valid in: configuration file, SAS invocation
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
Syntax
ECHOAUTO | NOECHOAUTO
Syntax Description
ECHOAUTO
specifies that the SAS statements in the autoexec file are written to the SAS log as they are executed.
Requirement: To print autoexec file statements in the SAS log, the SOURCE system option must be set.
NOECHOAUTO
specifies that SAS statements in the autoexec file are not written in the SAS log, even though they are executed.
Details
Regardless of the setting of this option, messages that result from errors in the autoexec files are printed in the SAS log.
See Also
System Option:
“SOURCE System Option” on page 1944
“The SAS Log” in SAS Language Reference: Concepts
EMAILAUTHPROTOCOL= System Option
Specifies the authentication protocol for SMTP e-mail.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Communications: Email
PROC OPTIONS GROUP= EMAIL
Syntax
EMAILAUTHPROTOCOL= NONE | LOGIN
Syntax Description
LOGIN
specifies that the LOGIN authentication protocol is used. For more information about the order of authentication, see “Sending E-Mail through SMTP” in SAS Language Reference: Concepts.
Note: When you specify LOGIN, you might also need to specify EMAILID= and EMAILPW=. If you omit EMAILID=, SAS will look up your user ID and use it. If you omit EMAILPW=, no password is used.
NONE
specifies that no authentication protocol is used.
Comparisons
For the SMTP access method, use this option in conjunction with the EMAILID=, EMAILPW=, EMAILPORT, and EMAILHOST= system options. EMAILID= provides the user name, EMAILPW= provides the password, EMAILPORT specifies the port to which the SMTP server is attached, EMAILHOST= specifies the SMTP server that supports e-mail access for your site, and EMAILAUTHPROTOCOL= provides the protocol.
See Also
System Options:
“EMAILHOST= System Option” on page 1841
“EMAILID= System Option” on page 1842
“EMAILPORT System Option” on page 1843
“EMAILPW= System Option” on page 1844
EMAILFROM System Option
When sending e-mail by using SMTP, specifies whether the e-mail option FROM is required in either the FILE or FILENAME statement.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Communications: Email
PROC OPTIONS GROUP= EMAIL
Syntax
EMAILFROM | NOEMAILFROM
Syntax Description
EMAILFROM
specifies that the FROM e-mail option is required when sending e-mail by using either the FILE or FILENAME statements.
NOEMAILFROM
specifies that the FROM e-mail option is not required when sending e-mail by using either the FILE or FILENAME statements.
See Also
Statements:
“FILE Statement” on page 1454
“FILENAME Statement, EMAIL (SMTP) Access Method” on page 1482
EMAILHOST= System Option
Specifies one or more SMTP servers that support e-mail access.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Communications: Email
PROC OPTIONS GROUP= EMAIL
Syntax
EMAILHOST= server
EMAILHOST=( 'server-1' 'server-2' )
Syntax Description
server
specifies one or more Simple Mail Transfer Protocol (SMTP) server domain names for your site.
Note: The system administrator for your site will provide this information.
Range: The maximum number of characters that can be specified for SMTP servers is 1,024.
Requirement: When more than one server name is specified, the list must be enclosed in parentheses and each server name must be enclosed in single or double quotation marks.
Details
When more than one SMTP server is specified, SAS attempts to connect to e-mail servers in the order that they are specified. E-mail is delivered to the first server that SAS connects to. If SAS is not able to connect to any of the specified servers, the attempt to deliver e-mail fails and SAS returns an error.
Operating Environment Information: To enable the SMTP interface that SAS provides, you must also specify the EMAILSYS=SMTP system option. For information about EMAILSYS, see the documentation for your operating environment.
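The SMTP-related options are typically set together. The following is a sketch only: it assumes that SAS was started with EMAILSYS=SMTP, and the server names, port number, addresses, password, and fileref (OUTBOX) are all hypothetical placeholders.

```sas
/* Assumes that SAS was invoked with EMAILSYS=SMTP.                 */
/* SAS tries smtp1 first, then smtp2, in the order they are listed. */
options emailhost=('smtp1.example.com' 'smtp2.example.com')
        emailport=25
        emailauthprotocol=login
        emailid="sender@example.com"
        emailpw="hypothetical-password";

/* Send a short message by using the EMAIL access method. */
filename outbox email to='recipient@example.com'
                      subject='Test message';

data _null_;
   file outbox;
   put 'This message was sent through the SMTP access method.';
run;

filename outbox clear;
```

If neither server can be reached, the attempt to deliver the message fails and SAS returns an error, as described in the Details.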
Comparisons
For the SMTP access method, use this option in conjunction with the EMAILID=, EMAILAUTHPROTOCOL=, EMAILPORT, and EMAILPW= system options. EMAILID= provides the user name, EMAILPW= provides the password, EMAILPORT specifies the port to which the SMTP server is attached, EMAILHOST= specifies the SMTP servers that support e-mail access for your site, and EMAILAUTHPROTOCOL= provides the protocol.
See Also
System Option:
“EMAILAUTHPROTOCOL= System Option” on page 1839
“EMAILID= System Option” on page 1842
“EMAILPORT System Option” on page 1843
“EMAILPW= System Option” on page 1844
EMAILID= System Option
Identifies an e-mail sender by specifying either a logon ID, an e-mail profile, or an e-mail address.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Communications: Email
PROC OPTIONS GROUP= EMAIL
Syntax
EMAILID= logonid | profile | email-address
Syntax Description
logonid
specifies the logon ID for the user running SAS.
Maximum: The maximum number of characters is 32,000.
profile
specifies the profile name. See the documentation for your e-mail system to determine the profile name.
email-address
specifies the fully qualified e-mail address of the user running SAS.
Requirement: The e-mail address is valid only when SMTP is enabled.
Requirement: If the value of email-address contains a space, you must enclose it in double quotation marks.
Details
The EMAILID= system option specifies the logon ID, profile, or e-mail address to use with your e-mail system.
Comparisons
For the SMTP access method, use this option in conjunction with the EMAILAUTHPROTOCOL=, EMAILPW=, EMAILPORT, and EMAILHOST= system options. EMAILID= provides the user name, EMAILPW= provides the password, EMAILPORT specifies the port to which the SMTP server is attached, EMAILHOST= specifies the SMTP server that supports e-mail access for your site, and EMAILAUTHPROTOCOL= provides the protocol.
See Also
System Options:
“EMAILAUTHPROTOCOL= System Option” on page 1839
“EMAILHOST= System Option” on page 1841
“EMAILPORT System Option” on page 1843
“EMAILPW= System Option” on page 1844
EMAILPORT System Option
Specifies the port that the SMTP server is attached to.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Communications: Email
PROC OPTIONS GROUP= EMAIL
Syntax EMAILPORT <=port-number>
Syntax Description
port-number
specifies the port number that is used by the SMTP server that you specified on the EMAILHOST option.
Note: The system administrator for your site will provide this information.
Details Operating Environment Information: If you use the SMTP protocol that SAS provides, you must also specify the EMAILSYS=SMTP system option. For information about EMAILSYS, see the documentation for your operating environment.
Comparisons For the SMTP access method, use this option in conjunction with the EMAILID=, EMAILAUTHPROTOCOL=, EMAILPW=, and EMAILHOST system options. EMAILID= provides the user name, EMAILPW= provides the password, EMAILPORT specifies the port to which the SMTP server is attached, EMAILHOST specifies the SMTP server that supports e-mail access for your site, and EMAILAUTHPROTOCOL= provides the protocol.
See Also System Option: “EMAILAUTHPROTOCOL= System Option” on page 1839 “EMAILHOST= System Option” on page 1841 “EMAILID= System Option” on page 1842 “EMAILPW= System Option” on page 1844
EMAILPW= System Option
Specifies an e-mail logon password.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Communications: Email
PROC OPTIONS GROUP= EMAIL
Syntax EMAILPW="password"
Syntax Description
"password"
specifies the logon password for your logon name.
Restriction: If “password” contains a space, you must enclose the value in double quotation marks.
Details If you do not specify the EMAILID and EMAILPW system options at invocation (and you are not otherwise logged in to your e-mail system), SAS will prompt you for them when you initiate your e-mail.
Comparisons For the SMTP access method, use this option in conjunction with the EMAILID=, EMAILAUTHPROTOCOL=, EMAILPORT, and EMAILHOST system options. EMAILID= provides the user name, EMAILPW= provides the password, EMAILPORT specifies the port to which the SMTP server is attached, EMAILHOST specifies the SMTP server that supports e-mail access for your site, and EMAILAUTHPROTOCOL= provides the protocol.
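The way these options combine can be sketched with a single OPTIONS statement. This is a minimal illustration, not a site configuration: the server name, port, user name, password, and recipient address are all placeholders, and whether EMAILSYS= can be set in an OPTIONS statement depends on your operating environment.

```sas
/* Every value below is a placeholder (assumption); ask your   */
/* system administrator for the real host, port, and protocol. */
options emailsys=smtp                     /* enable the SMTP interface     */
        emailhost="smtp.example.com"      /* hypothetical SMTP server      */
        emailport=25                      /* port the server is attached to */
        emailauthprotocol=login           /* authentication protocol       */
        emailid="sasuser@example.com"     /* user name                     */
        emailpw="placeholder-password";   /* password                      */

/* Send a short test message through the EMAIL access method. */
filename outbox email to="recipient@example.com"
                      subject="SMTP test message";

data _null_;
   file outbox;
   put "This message was sent through the SMTP interface.";
run;

filename outbox clear;
```

Keeping the password in a program file exposes it to anyone who can read the file, so a site might prefer to set EMAILID= and EMAILPW= at invocation or let SAS prompt for them.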
See Also System Options: “EMAILAUTHPROTOCOL= System Option” on page 1839 “EMAILHOST= System Option” on page 1841 “EMAILID= System Option” on page 1842 “EMAILPORT System Option” on page 1843
ENGINE= System Option
Specifies the default access method for SAS libraries.
Valid in: configuration file, SAS invocation
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
See: ENGINE= System Option in the documentation for your operating environment.
Syntax ENGINE=engine-name
Syntax Description
engine-name
specifies an engine name.
Details The ENGINE= system option specifies which default engine name is associated with a SAS library. The default engine is used when a SAS library points to an empty directory or a new file. The default engine is also used on directory-based systems, which can store more than one SAS file type within a directory. For example, some operating environments can store SAS files from multiple versions in the same directory. Operating Environment Information: Valid engine names depend on your operating environment. For details, see the SAS documentation for your operating environment.
See Also “SAS I/O Engines” in SAS Language Reference: Concepts
ERRORABEND System Option
Specifies whether SAS responds to errors by terminating.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Alias: ERRABEND | NOERRABEND
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax ERRORABEND | NOERRORABEND
Syntax Description
ERRORABEND
specifies that SAS terminate for most errors (including syntax errors and file not found errors) that would normally cause it to issue an error message, set OBS=0, and go into syntax-check mode (if syntax checking is enabled). SAS also terminates if an error occurs in any global statement other than the LIBNAME and FILENAME statements.
Tip: Use the ERRORABEND system option with SAS production programs, which presumably should not encounter any errors. If errors are encountered and ERRORABEND is in effect, SAS brings the errors to your attention immediately by terminating. ERRORABEND does not affect how SAS handles notes such as invalid data messages.
NOERRORABEND
specifies that SAS handle errors normally, that is, issue an error message, set OBS=0, and go into syntax-check mode (if syntax checking is enabled).
See Also System options: “ERRORBYABEND System Option” on page 1847 “ERRORCHECK= System Option” on page 1848 “Global Statements” on page 1385
ERRORBYABEND System Option
Specifies whether SAS ends a program when an error occurs in BY-group processing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax ERRORBYABEND | NOERRORBYABEND
Syntax Description
ERRORBYABEND
specifies that SAS ends a program for BY-group error conditions that would normally cause it to issue an error message.
NOERRORBYABEND
specifies that SAS handle BY-group errors normally, that is, by issuing an error message and continuing processing.
Details If SAS encounters one or more BY-group errors while ERRORBYABEND is in effect, SAS brings the errors to your attention immediately by ending your program. ERRORBYABEND does not affect how SAS handles notes that are written to the SAS log. Note: Use the ERRORBYABEND system option with SAS production programs that should be error free.
See Also System Option: “ERRORABEND System Option” on page 1846
ERRORCHECK= System Option
Specifies whether SAS enters syntax-check mode when errors are found in the LIBNAME, FILENAME, %INCLUDE, and LOCK statements.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax ERRORCHECK=NORMAL | STRICT
Syntax Description
NORMAL
specifies not to place the SAS program into syntax-check mode when an error occurs in a LIBNAME or FILENAME statement, or in a LOCK statement in SAS/SHARE software. In addition, the program or session does not terminate when a %INCLUDE statement fails due to a non-existent file.
STRICT
specifies to place the SAS program into syntax-check mode when an error occurs in a LIBNAME or FILENAME statement, or in a LOCK statement in SAS/SHARE software. If the ERRORABEND system option is set and an error occurs in either a LIBNAME or FILENAME statement, SAS terminates. In addition, SAS terminates when a %INCLUDE statement fails due to a non-existent file.
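As a rough sketch of how ERRORCHECK= and ERRORABEND interact in a production job, the statements below turn on the strict behavior for one block of setup statements and then restore the normal behavior. The libref and the file path are hypothetical.

```sas
/* Treat errors in the setup statements as fatal. */
options errorcheck=strict errorabend;

/* Hypothetical library and include file. With the settings above,   */
/* an error in the LIBNAME statement or a missing %INCLUDE file      */
/* causes SAS to terminate instead of continuing.                    */
libname prod "hypothetical-production-directory";
%include "hypothetical-setup-file.sas";

/* Restore the normal behavior for the rest of the session. */
options errorcheck=normal noerrorabend;
```

Scoping the strict settings to the setup block keeps an interactive session from ending on an unrelated error later on.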
See Also System option: “ERRORABEND System Option” on page 1846
ERRORS= System Option
Specifies the maximum number of observations for which SAS issues complete error messages.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling; Log and procedure output control: SAS log
PROC OPTIONS GROUP= ERRORHANDLING, LOGCONTROL
Syntax ERRORS=n | nK | nM | nG | nT | MIN | MAX | hexX
Syntax Description
n | nK | nM | nG | nT
specifies the number of observations for which SAS issues error messages in terms of 1 (n); 1,024 (nK); 1,048,576 (nM); 1,073,741,824 (nG); or 1,099,511,627,776 (nT). For example, a value of 8 specifies eight observations, and a value of 3M specifies 3,145,728 observations.
MIN
sets the number of observations for which SAS issues error messages to 0.
MAX
sets the maximum number of observations for which SAS issues error messages to the largest signed, 4-byte integer representable in your operating environment.
hexX
specifies the maximum number of observations for which SAS issues error messages as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx sets the maximum number of observations for which SAS issues error messages to 45 observations.
Details If data errors are detected in more than n observations, processing continues, but SAS does not issue error messages for the additional errors. Note: If you set ERRORS=0 and an error occurs, or if the maximum number of errors has been reached, a warning message that states that the limit set by the ERRORS option has been reached is written to the log. Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
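The value forms described above can be written in OPTIONS statements as follows. This is only an illustration of the notation; each statement replaces the setting made by the one before it.

```sas
options errors=8;     /* complete error messages for 8 observations     */
options errors=3M;    /* 3 x 1,048,576 = 3,145,728 observations         */
options errors=2dx;   /* hexadecimal 2D = 45 observations               */
options errors=min;   /* 0 observations                                 */

/* Display the value that is currently in effect. */
proc options option=errors;
run;
```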
See Also “The SAS Log” in SAS Language Reference: Concepts
EXPLORER System Option
Specifies whether to invoke the SAS windowing environment and display only the Explorer window.
Valid in: configuration file, SAS invocation
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax EXPLORER | NOEXPLORER
Syntax Description
EXPLORER
specifies that the SAS session be invoked with only the Explorer window.
NOEXPLORER
specifies that the SAS session be invoked without the Explorer window.
Details The following SAS execution mode invocation options, in order, have precedence over this option:
1 OBJECTSERVER
2 DMR
3 SYSIN
If you specify EXPLORER with another execution mode invocation option of equal precedence, SAS uses only the last option listed. See “Order of Precedence” on page 1774 for more information on invocation option precedence.
See Also System Options: “DMS System Option” on page 1830 “DMSEXP System Option” on page 1831
FILESYNC= System Option
Specifies when operating system buffers that contain contents of permanent SAS files are written to disk.
Valid in: configuration file, SAS invocation
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
See: FILESYNC= System Option in the documentation for your operating environment.
Syntax FILESYNC= SAS | CLOSE | HOST | SAVE
Syntax Description
SAS
specifies that SAS requests the operating system to force buffered data to be written to disk when it is best for the integrity of the SAS file.
CLOSE
specifies that SAS requests the operating system to force buffered data to be written to disk when the SAS file is closed.
HOST
specifies that the operating system schedules when the buffered data for a SAS file is written to disk. This is the default.
SAVE
specifies that the buffers are written to disk when the SAS file is saved.
Details By using the FILESYNC= system option, SAS can tell the operating system when to force data that is temporarily stored in operating system buffers to be written to disk. Only SAS files in a permanent SAS library are affected; files in a temporary library are not affected. If you specify a value other than the default value of HOST, the following occurs:
• the length of time it takes to run a SAS job increases
• the small chance of losing data in the event of a system failure is further reduced
Consult with your system administrator before you change the value of the FILESYNC= system option to a value other than the default value. Operating Environment Information: Under z/OS, the FILESYNC= system option affects SAS files only in UNIX file system (UFS) libraries. For more information, see “FILESYNC= System Option” in SAS Companion for z/OS.
FIRSTOBS= System Option
Specifies the observation number or external file record that SAS processes first.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax FIRSTOBS= n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n | nK | nM | nG | nT
specifies the number of the first observation or external file record to process, with n being an integer. Using one of the letter notations results in multiplying the integer by a specific value. That is, specifying K (kilo) multiplies the integer by 1,024; M (mega) multiplies by 1,048,576; G (giga) multiplies by 1,073,741,824; or T (tera) multiplies by 1,099,511,627,776. For example, a value of 8 specifies the eighth observation or record, and a value of 3m specifies observation or record 3,145,728.
hexX
specifies the number of the first observation or the external file record to process as a hexadecimal value. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx specifies the 45th observation.
MIN
sets the number of the first observation or external file record to process to 1. This is the default.
MAX
sets the number of the first observation to process to the maximum number of observations in the data sets or records in the external file, up to the largest eight-byte, signed integer, which is 2^63-1, or approximately 9.2 quintillion observations.
Details The FIRSTOBS= system option is valid for all steps for the duration of your current SAS session or until you change the setting. To affect any single SAS data set, use the FIRSTOBS= data set option. You can apply FIRSTOBS= processing to WHERE processing. For details, see “Processing a Segment of Data That Is Conditionally Selected” in SAS Language Reference: Concepts. Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the documentation for your operating environment.
Comparisons
• You can override the FIRSTOBS= system option by using the FIRSTOBS= data set option and by using the FIRSTOBS= option as a part of the INFILE statement.
• While the FIRSTOBS= system option specifies a starting point for processing, the OBS= system option specifies an ending point. The two options are often used together to define a range of observations or records to be processed.
Examples If you specify FIRSTOBS=50, SAS processes the 50th observation of the data set first. This option applies to every input data set that is used in a program or a SAS process. In this example, SAS begins reading at the 11th observation in the data sets OLD, A, and B:
   options firstobs=11;
   data a;
      set old; /* 100 observations */
   run;
   data b;
      set a;
   run;
   data c;
      set b;
   run;
Data set OLD has 100 observations, data set A has 90, B has 80, and C has 70. To avoid decreasing the number of observations in successive data sets, use the FIRSTOBS= data set option in the SET statement. You can also reset FIRSTOBS=1 between a DATA step and a PROC step.
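The alternative mentioned above, using the FIRSTOBS= data set option instead of the system option, might look like the following sketch. The step that builds OLD is an assumption added so that the example is self-contained.

```sas
/* Build a hypothetical 100-observation input data set. */
data old;
   do i = 1 to 100;
      output;
   end;
run;

/* Leave the system option at its default so that later steps */
/* read every observation.                                    */
options firstobs=1;

data a;
   set old(firstobs=11);   /* skip the first 10 observations of OLD only */
run;

data b;
   set a;                  /* reads all 90 observations of A */
run;

data c;
   set b;                  /* reads all 90 observations of B */
run;
```

Because the data set option applies only to OLD, data sets A, B, and C each contain 90 observations.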
See Also Data Set Option: “FIRSTOBS= Data Set Option” on page 25 Statement: “INFILE Statement” on page 1541 System Option: “OBS= System Option” on page 1890
FMTERR System Option
When a variable format cannot be found, specifies whether SAS generates an error or continues processing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax FMTERR | NOFMTERR
Syntax Description
FMTERR
specifies that when SAS cannot find a specified variable format, it generates an error message and does not allow default substitution to occur.
NOFMTERR
replaces missing formats with the w. or $w. default format, issues a note, and continues processing.
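A common use of NOFMTERR is to read a data set whose user-defined formats are not available in the current session. The sketch below assumes a hypothetical library MYLIB and data set SURVEY.

```sas
/* Hypothetical library that holds a data set with user-defined formats. */
libname mylib "hypothetical-directory";

options nofmterr;          /* substitute w. or $w. for formats that are not found */

proc print data=mylib.survey(obs=5);
run;

options fmterr;            /* return to generating an error for missing formats */
```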
See Also System Option: “FMTSEARCH= System Option” on page 1854
FMTSEARCH= System Option
Specifies the order in which format catalogs are searched.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
See: FMTSEARCH= System Option under OpenVMS
Syntax FMTSEARCH=(catalog-specification-1... catalog-specification-n)
Syntax Description
catalog-specification
searches format catalogs in the order listed, until the desired member is found. The value of catalog-specification can be either libref or libref.catalog. If only the libref is given, SAS assumes that FORMATS is the catalog name.
Details The WORK.FORMATS catalog is always searched first, and the LIBRARY.FORMATS catalog is searched next, unless one of them appears in the FMTSEARCH= list. If a catalog appears in the FMTSEARCH= list, the catalog is searched in the order in which it appears in the list. If a catalog in the list does not exist, that particular item is ignored and searching continues. Operating Environment Information: Under the Windows, UNIX, and z/OS operating environments, you can use the APPEND or INSERT system options to add catalog specifications. For details, see the documentation for the APPEND and INSERT system options.
Examples If you specify FMTSEARCH=(ABC DEF.XYZ GHI), SAS searches for requested formats or informats in this order:
1 WORK.FORMATS
2 LIBRARY.FORMATS
3 ABC.FORMATS
4 DEF.XYZ
5 GHI.FORMATS
If you specify FMTSEARCH=(ABC WORK LIBRARY), SAS searches in this order:
1 ABC.FORMATS
2 WORK.FORMATS
3 LIBRARY.FORMATS
Because WORK appears in the FMTSEARCH list, WORK.FORMATS is not automatically searched first.
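In a program, the second search order could be set up roughly as follows. The directory for the ABC libref is a placeholder.

```sas
/* Hypothetical library that contains a FORMATS catalog. */
libname abc "hypothetical-directory";

/* Search ABC.FORMATS first, then WORK.FORMATS, then LIBRARY.FORMATS. */
options fmtsearch=(abc work library);

/* Display the search list that is currently in effect. */
proc options option=fmtsearch;
run;
```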
See Also System Option: “APPEND= System Option” on page 1789 “INSERT= System Option” on page 1871 “FMTERR System Option” on page 1853
FONTEMBEDDING System Option
Specifies whether font embedding is enabled in Universal Printer and SAS/GRAPH printing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax FONTEMBEDDING | NOFONTEMBEDDING
Syntax Description
FONTEMBEDDING
specifies to enable font embedding. This is the default.
NOFONTEMBEDDING
specifies to disable font embedding.
Details When FONTEMBEDDING is set, fonts can be embedded, or included, in the output files that are created by the Universal Printer and SAS/GRAPH. Output files with embedded fonts do not rely on fonts being installed on the computer that is used to view or print the output file. Embedding fonts increases the file size. When NOFONTEMBEDDING is set, the output files rely on the fonts being installed on the computer that is used to view or print the output file.
When you print or create PostScript files, if the specified font is recognized by SAS but is not available on the printer, SAS substitutes the most similar standard font in the output. For example, the Helvetica font would replace any occurrence of Albany AMT. This guarantees that the printer is capable of printing the text. To determine which fonts will be substituted for a given printer, use the Print Setup window, the Registry Editor, or the REGEDIT procedure to display the Printer Setup properties. Under Fonts, any individual fonts that are listed will be recognized by the printer. All other fonts, including those that are available via a link in the SAS Registry, will be substituted in the document when the document is created.
FONTRENDERING= System Option
Specifies whether SAS/GRAPH devices that are based on the SASGDGIF, SASGDTIF, and SASGDIMG modules render fonts by using the operating system or by using the FreeType engine.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax FONTRENDERING=HOST_PIXELS | FREETYPE_POINTS
Syntax Description
HOST_PIXELS
specifies that fonts are rendered by the operating system and that font size is requested in pixels.
Operating Environment Information: On z/OS, HOST_PIXELS is not supported. If HOST_PIXELS is specified, SAS uses FREETYPE_POINTS as the value for this option.
FREETYPE_POINTS
specifies that fonts are rendered by the FreeType engine and that font size is requested in points. This is the default.
Details Use the FONTRENDERING= system option to specify how SAS/GRAPH devices that are based on the SASGDGIF, SASGDTIF, and SASGDIMG modules render fonts. When the operating system renders fonts, the font size is requested in pixels. When the FreeType engine renders fonts, the font size is requested in points. Use the GDEVICE procedure to determine which module a SAS/GRAPH device uses:
   proc gdevice c=sashelp.devices browse nofs;
      list devicename;
   quit;
For example,
   proc gdevice c=sashelp.devices browse nofs;
      list gif;
   quit;
The following is partial output from the GDEVICE procedure:
   GDEVICE procedure
   Listing from SASHELP.DEVICES - Entry GIF
   Orig Driver: GIF                 Module:  SASGDGIF     Model: 6031
   Description: GIF File Format                           Type: EXPORT
   *** Institute-supplied ***
   Lrows:  43    Xmax:  8.333 IN    Hsize:    0.000 IN    Xpixels: 800
   Lcols:  88    Ymax:  6.250 IN    Vsize:    0.000 IN    Ypixels: 600
   Prows:   0                       Horigin:  0.000 IN
   Pcols:   0                       Vorigin:  0.000 IN
   Aspect:  0.000                   Rotate:
   Driver query: Y                  Queued messages: N
The Module entry names the module used by the device.
See Also “SAS/GRAPH Fonts” in SAS/GRAPH: Reference
FONTSLOC= System Option
Specifies the location of the fonts that are supplied by SAS; names the default font file location for registering fonts that use the FONTREG procedure.
Valid in: configuration file, SAS invocation
Category: Environment control: Display
PROC OPTIONS GROUP= ENVDISPLAY
See: FONTSLOC= System Option in the documentation for your operating environment
Syntax FONTSLOC= “location”
Syntax Description
“location”
specifies a fileref or the location of the SAS fonts that are used during the SAS session.
Note: If “location” is a fileref, you do not need to enclose the value in quotation marks.
FORMCHAR= System Option
Specifies the default output formatting characters.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LISTCONTROL
See: FORMCHAR= System Option in the documentation for your operating environment.
Syntax FORMCHAR= ’formatting-characters’
Syntax Description
’formatting-characters’
specifies any string or list of strings of characters up to 64 bytes long. If fewer than 64 bytes are specified, the string is padded with blanks on the right.
Tip: For consistent results when you move your document to different computers, issue the following OPTIONS statement before using ODS destinations other than the Listing destination:
   options formchar="|----|+|---+=|-/\<>*";
Details Formatting characters are used to construct tabular output outlines and dividers for various procedures, such as the FREQ, REPORT, and TABULATE procedures. If you omit formatting characters as an option in the procedure, the default specifications given in the FORMCHAR= system option are used. Note that you can also specify a hexadecimal character constant as a formatting character. When you use a hexadecimal constant with this option, SAS interprets the value of the hexadecimal constant as appropriate for your operating system. Note: To ensure that row and column separators and boxed tabular reports are printed legibly when using the standard forms characters, you must use these resources:
• either the SAS Monospace or the SAS Monospace Bold font
• a printer that supports TrueType fonts
See Also For further information about how Base SAS procedures use formatting characters, see the Base SAS Procedures Guide. For procedures in other products that use formatting characters, see the documentation for that product. “Printing with SAS” in SAS Language Reference: Concepts
FORMDLIM= System Option
Specifies a character to delimit page breaks in SAS output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LISTCONTROL
Syntax FORMDLIM='delimiting-character'
Syntax Description
'delimiting-character'
specifies in quotation marks a character written to delimit pages. Normally, the delimit character is null, as in this statement:
   options formdlim='';
Details When the delimit character is null, a new physical page starts whenever a new page occurs. However, you can conserve paper by allowing multiple pages of output to appear on the same page. For example, this statement writes a line of dashes (--) where normally a page break would occur:
   options formdlim='-';
When a new page is to begin, SAS skips a single line, writes a line consisting of the dashes that are repeated across the page, and skips another single line. There is no skip to the top of a new physical page. Resetting FORMDLIM= to null causes physical pages to be written normally again. Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
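To see the effect in listing output, the delimiter can be set around a pair of procedure steps and then reset. The sketch below uses the SASHELP.CLASS sample data set; the choice of procedure and observations is arbitrary.

```sas
/* Write a line of dashes in place of each page break. */
options formdlim='-';

proc print data=sashelp.class(obs=3);
run;

proc print data=sashelp.class(firstobs=4 obs=6);
run;

/* Reset to null so that physical page breaks are written again. */
options formdlim='';
```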
FORMS= System Option
If forms are used for printing, specifies the default form to use.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Display
PROC OPTIONS GROUP= ENVDISPLAY
Syntax FORMS=form-name
Syntax Description
form-name
specifies the name of the form.
Tip: To create a customized form, use the FSFORM command in a windowing environment.
Details The default form contains settings that control various aspects of interactive windowing output, including printer selection, text body, and margins. The FORMS= system option also customizes output from the PRINT command (when FORM= is omitted) or output from interactive windowing procedures. Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
GSTYLE System Option
Specifies whether ODS styles can be used in the generation of graphs that are stored as GRSEG catalog entries.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Graphics: Driver settings; Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= GRAPHICS, ODSPRINT
Syntax GSTYLE | NOGSTYLE
Syntax Description
GSTYLE
specifies that ODS styles can be used in the generation of graphs that are stored as GRSEG catalog entries. If no style is specified, the default style for the given output destination is used. This is the default.
NOGSTYLE
specifies to not use ODS styles in the generation of graphs that are stored as GRSEG catalog entries.
Tip: Use NOGSTYLE for compatibility of graphs generated before SAS 9.2.
Details The GSTYLE system option affects only graphic output that is generated using GRSEGs. The GSTYLE option does not affect the use of ODS styles in graphs that are generated by the following means:
• Java device driver
• ActiveX device driver
• SAS/GRAPH statistical graphic procedures
• SAS/GRAPH template language
• ODS GRAPHICS ON statement
GWINDOW System Option
Specifies whether SAS displays SAS/GRAPH output in the GRAPH window.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Graphics: Driver settings
PROC OPTIONS GROUP= GRAPHICS
Syntax GWINDOW | NOGWINDOW
Syntax Description
GWINDOW
displays SAS/GRAPH software output in the GRAPH window, if your site licenses SAS/GRAPH software and if your personal computer has graphics capability.
NOGWINDOW
displays graphics outside of the windowing environment.
HELPBROWSER= System Option
Specifies the browser to use for SAS Help and ODS output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Help
PROC OPTIONS GROUP= HELP
Syntax HELPBROWSER=REMOTE | SAS
Syntax Description
REMOTE
specifies to use the remote browser for the Help. The location of the remote browser is determined by the HELPHOST and the HELPPORT system options. This is the default value for the OpenVMS, UNIX, z/OS, and Windows 64-bit operating environments.
SAS
specifies to use the SAS browser for the Help. This is the default for the Windows 32-bit operating environment.
See Also System options:
“HELPHOST System Option” on page 1863
“HELPPORT= System Option” on page 1864
Viewing Output and Help in the SAS Remote Browser in SAS Companion for OpenVMS on HP Integrity Servers
Viewing Output and Help in the SAS Remote Browser in SAS Companion for UNIX Environments
Viewing Output and Help in the SAS Remote Browser in SAS Companion for Windows
Using the SAS Remote Browser in SAS Companion for z/OS
HELPENCMD System Option
Specifies whether SAS uses the English version or the translated version of the keyword list for the command-line Help.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= HELP
Syntax HELPENCMD | NOHELPENCMD
Syntax Description
HELPENCMD
specifies that SAS use the English version of the keyword list for the command-line help, although the index will still be displayed with translated keywords. This is the default.
NOHELPENCMD
specifies that SAS use the translated version of the keyword list for the command-line help, if a translated version exists.
Details Set NOHELPENCMD if you want the command-line help to locate keywords by using the localized terms. By default, all terms on the command line will be read as English.
See Also System Options:
HELPINDEX System Option in SAS Companion for Windows, SAS Companion for UNIX Environments, and SAS Companion for OpenVMS on HP Integrity Servers
HELPLOC System Option in SAS Companion for Windows, SAS Companion for UNIX Environments, SAS Companion for OpenVMS on HP Integrity Servers, and SAS Companion for z/OS
HELPTOC System Option in SAS Companion for Windows, SAS Companion for UNIX Environments, and SAS Companion for OpenVMS on HP Integrity Servers
HELPHOST System Option
Specifies the name of the computer where the remote browser is to send Help and ODS output.
Default: NULL
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Help
PROC OPTIONS GROUP= HELP
See: HELPHOST= System Option under OpenVMS, UNIX, Windows, z/OS
Syntax HELPHOST="host"
Syntax Description
"host"
specifies the name of the computer where the remote help is to be displayed. Quotation marks or parentheses are required. The maximum number of characters is 2,048.
Details Operating Environment Information: If you do not specify the HELPHOST option, the location where SAS displays the Help depends on your operating environment. See the HELPHOST system option in the documentation for your operating environment.
See Also
“HELPBROWSER= System Option” on page 1861
“HELPPORT= System Option” on page 1864
Viewing Output and Help in the SAS Remote Browser in the SAS Companion for OpenVMS on HP Integrity Servers
Viewing Output and Help in the SAS Remote Browser in the SAS Companion for UNIX Environments
Viewing Output and Help in the SAS Remote Browser in the SAS Companion for Windows
Using the SAS Remote Browser in the SAS Companion for z/OS
HELPPORT= System Option
Specifies the port number for the remote browser client.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Help
PROC OPTIONS GROUP= HELP
Syntax HELPPORT=port-number
port-number
specifies the port number for the SAS Remote Browser Server. Range: 0–65535 Default: 0
Details When HELPPORT is set to 0, SAS uses the default port number for the remote browser server.
See Also “HELPBROWSER= System Option” on page 1861 “HELPHOST System Option” on page 1863
Viewing Output and Help in the SAS Remote Browser in the SAS Companion for OpenVMS on HP Integrity Servers Viewing Output and Help in the SAS Remote Browser in the SAS Companion for UNIX Environments Viewing Output and Help in the SAS Remote Browser in the SAS Companion for Windows Using the SAS Remote Browser in the SAS Companion for z/OS
HTTPSERVERPORTMAX= System Option
Specifies the highest port number that can be used by the SAS HTTP server for remote browsing.
Valid in: configuration file, SAS invocation
Category: Communications: Networking and encryption
PROC OPTIONS GROUP= Communications
Syntax HTTPSERVERPORTMAX=max-port-number
Syntax Description max-port-number
specifies the highest port number that can be used by the SAS HTTP server for remote browsing. Range: 0–65535 Default: 0
Details Use the HTTPSERVERPORTMAX= and HTTPSERVERPORTMIN= system options to specify a range of port values that the remote browser HTTP server can use to dynamically assign a port number when a firewall is configured between SAS and the HTTP server.
See Also System options: “HTTPSERVERPORTMIN= System Option” on page 1865
HTTPSERVERPORTMIN= System Option
Specifies the lowest port number that can be used by the SAS HTTP server for remote browsing.
Valid in: configuration file, SAS invocation
Category: Communications: Networking and encryption
PROC OPTIONS GROUP= Communications
Syntax HTTPSERVERPORTMIN=min-port-number
Syntax Description min-port-number
specifies the lowest port number that can be used by the SAS HTTP server for remote browsing. Range: 0–65535 Default: 0
Details Use the HTTPSERVERPORTMIN and HTTPSERVERPORTMAX system options to specify a range of port values that the remote browser HTTP server can use to dynamically assign a port number when a firewall is configured between SAS and the HTTP server.
See Also System option: “HTTPSERVERPORTMAX= System Option” on page 1865
IBUFNO= System Option
Specifies an optional number of extra buffers to be allocated for navigating an index file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Default: 0
Syntax IBUFNO=n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n | nK | nM | nG | nT
specifies the number of extra index buffers to be allocated in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); 1,073,741,824 (gigabytes); or 1,099,511,627,776 (terabytes). For example, a value of 8 specifies eight buffers, and a value of 3k specifies 3,072 buffers.
Restriction: Maximum value is 10,000.
hexX
specifies the number of extra index buffers as a hexadecimal value. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx specifies 45 buffers.
MIN
sets the number of extra index buffers to 0. This is the default.
MAX
sets the maximum number of extra index buffers to 10,000.
Details An index is an optional SAS file that you can create for a SAS data file in order to provide direct access to specific observations. The index file consists of entries that are organized into hierarchical levels, such as a tree structure, and connected by pointers. When an index is used to process a request, such as for WHERE processing, SAS does a binary search on the index file and positions the index to the first entry that contains a qualified value. SAS uses the value’s identifier to directly access the observation that contains the value.
SAS requires memory for buffers when an index is actually used. The buffers are not required unless SAS uses the index, but they must be allocated in preparation for the index that is being used.
SAS automatically allocates a minimal number of buffers in order to navigate the index file. Typically, you do not need to specify extra buffers. However, using IBUFNO= to specify extra buffers could improve execution time by limiting the number of input/output operations that are required for a particular index file. The improvement in execution time comes at the expense of increased memory consumption.
Note: Whereas too few buffers allocated to the index file decrease performance, overallocation of index buffers creates performance problems as well. Experimentation is the best way to determine the optimal number of index buffers. For example, experiment with ibufno=3, then ibufno=4, and so on, until you find the smallest number of buffers that produces satisfactory performance results.
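The experimentation that the note describes might look like the following sketch. The data set WORK.SALES, its variables, and the WHERE value are hypothetical and exist only for illustration. MSGLEVEL=I and FULLSTIMER are used here on the assumption that you want the log to report index usage and resource statistics for each run.

```sas
/* Hypothetical test data with an indexed variable */
data work.sales;
   do id = 1 to 100000;
      region = mod(id, 50);
      output;
   end;
run;

proc datasets library=work nolist;
   modify sales;
   index create region;
quit;

/* Ask the log to report index usage and detailed resource statistics */
options msglevel=i fullstimer;

/* First trial: three extra index buffers */
options ibufno=3;
data _null_;
   set work.sales;
   where region = 7;
run;

/* Second trial: four extra index buffers; compare the log statistics */
options ibufno=4;
data _null_;
   set work.sales;
   where region = 7;
run;

/* Return to the default of no extra buffers */
options ibufno=0;
```

Comparing the I/O and memory figures that the log reports for each trial is one way to judge whether the extra buffers are worth their memory cost.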
See Also “Understanding SAS Indexes” in SAS Language Reference: Concepts. System Option: “IBUFSIZE= System Option” on page 1867
IBUFSIZE= System Option
Specifies the buffer page size for an index file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Restriction: Specify a page size before the index file is created. After it is created, you cannot change the page size.
Syntax IBUFSIZE=n | nK | nM | nG | nT | hexX | MAX
Syntax Description
n | nK | nM | nG | nT
specifies the page size to process in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); 1,073,741,824 (gigabytes); or 1,099,511,627,776 (terabytes). For example, a value of 8 specifies 8 bytes, and a value of 3k specifies 3,072 bytes. The default is 0, which causes SAS to use the minimum optimal page size for the operating environment.
hexX
specifies the page size as a hexadecimal value. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx sets the page size to 45 bytes.
MAX
sets the page size for an index file to the maximum possible number. For IBUFSIZE=, the value is 32,767 bytes.
Details An index is an optional SAS file that you can create for a SAS data file in order to provide direct access to specific observations. The index file consists of entries that are organized into hierarchical levels, such as a tree structure, and connected by pointers. When an index is used to process a request, such as for WHERE processing, SAS does a search on the index file in order to rapidly locate the requested records. Typically, you do not need to specify an index page size. However, the following situations could require a different page size:
- The page size affects the number of levels in the index. The more pages there are, the more levels in the index. The more levels, the longer the index search takes. Increasing the page size allows more index values to be stored on each page, thus reducing the number of pages (and the number of levels). The number of pages required for the index varies with the page size, the length of the index value, and the values themselves. The main resource that is saved when reducing levels in the index is I/O. If your application is experiencing a lot of I/O in the index file, increasing the page size might help. However, you must re-create the index file after increasing the page size.
- The index file structure requires a minimum of three index values to be stored on a page. If the length of an index value is very large, you might get an error message that the index could not be created because the page size is too small to hold three index values. Increasing the page size should eliminate the error.
Note: Experimentation is the best way to determine the optimal index page size.
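Because the page size must be in effect before the index file is created, a trial run might be arranged as in the following sketch. The data set WORK.ORDERS, its variables, and the 8K page size are hypothetical choices for illustration, not recommended values.

```sas
/* Set the index page size BEFORE the index is created */
options ibufsize=8k;

/* Hypothetical data to index */
data work.orders;
   do order_id = 1 to 50000;
      customer = mod(order_id, 500);
      output;
   end;
run;

/* The index that is created here uses the page size set above */
proc datasets library=work nolist;
   modify orders;
   index create customer;
quit;

/* Return to the default; 0 lets SAS choose the page size */
options ibufsize=0;
```

To try a different page size on an existing index, delete and re-create the index after changing IBUFSIZE=, because the page size of an existing index file cannot be changed.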
See Also “Understanding SAS Indexes” in SAS Language Reference: Concepts. “IBUFNO= System Option” on page 1866
INITCMD System Option
Specifies an application invocation command and optional SAS windowing environment or text editor commands that SAS executes before processing the AUTOEXEC file during SAS invocation.
Valid in: configuration file, SAS invocation
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax INITCMD "command-1 <windowing-command-n>"
Syntax Description
command-1
specifies any SAS command that invokes an application window. Some valid values are: AF, ANALYST, ASSIST, DESIGN, EIS, FORECAST, GRAPH, HELP, IMAGE, LAB, MINER, PHCLINICAL, PHKINETICS, PROJMAN, QUERY, RUNEIS, SQC, XADX.
Interaction: If you specify FORECAST for command-1, you cannot use windowing-command-n.
windowing-command-n
specifies a valid windowing command or text editor command. Separate multiple commands with semicolons. These commands are processed in sequence. If you use a windowing command that impacts flow, such as the BYE command, it might delay or prohibit processing.
Restriction: Do not use the windowing-command-n argument when you enter a command for an application that submits SAS statements or commands during initialization of the application, that is, during autoexec file initialization.
Details The INITCMD system option suppresses the Log, Output, Program Editor, and Explorer windows when SAS starts so that the application window is the first screen that you see. The suppressed windows do not appear, but you can activate them. You can use the ALTLOG option to direct log output for viewing. If windows are initiated by an autoexec file or the INITSTMT option, the window that is displayed by the INITCMD option is displayed last. When you exit an application that is invoked with the INITCMD option, your SAS session ends.
You can use the INITCMD option in a windowing environment only. Otherwise, the option is ignored and a warning message is issued. If command-1 is not a valid command, the option is ignored and a warning message is issued.
The following SAS execution mode invocation options, in order, have precedence over this option:
1 OBJECTSERVER
2 DMR
3 SYSIN
If you specify INITCMD with another execution mode invocation option of equal precedence, SAS uses only the last option listed. See “Order of Precedence” on page 1774 for more information on invocation option precedence.
Examples
INITCMD "AFA c=mylib.myapp.primary.frame dsname=a.b"
INITCMD "ASSIST; FSVIEW SASUSER.CLASS"
INITSTMT= System Option
Specifies a SAS statement to execute after any statements in the autoexec file and before any statements from the SYSIN= file.
Valid in: configuration file, SAS invocation
Alias: IS=
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
See: INITSTMT= System Option under Windows, OpenVMS
Syntax INITSTMT=’statement’
Syntax Description
’statement’
specifies any SAS statement or statements.
Requirements: statement must be able to run on a step boundary.
Operating Environment Information: On the command line or in a configuration file, the syntax is specific to your operating environment. The SYSIN= system option might not be supported by your operating environment. For details, see the SAS documentation for your operating environment.
Comparisons INITSTMT= specifies the SAS statements to be executed at SAS initialization, and the TERMSTMT= system option specifies the SAS statements to be executed at SAS termination.
Examples Here is an example of using this option on UNIX:
sas -initstmt '%put you have used the initstmt; data x; x=1; run;'
See Also System Option: “TERMSTMT= System Option” on page 1976
INSERT= System Option
Inserts the specified value as the first value of the specified system option.
Valid in: OPTIONS statement, SAS System Options window
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
See: INSERT= System Option in the documentation for your operating environment.
Syntax INSERT=(system-option-1=argument-1 system-option-n=argument-n)
Syntax Description
system-option
can be CMPLIB, FMTSEARCH, MAPS, SASAUTOS, or SASSCRIPT.
argument
specifies a new value that you want as the first value of system-option. argument can be any value that could be specified for system-option if system-option is set using the OPTIONS statement.
Details If you specify a new value for the CMPLIB=, FMTSEARCH=, MAPS=, SASAUTOS=, or SASSCRIPT= system options, the new value replaces the value of the option. Instead of replacing the value, you can use the INSERT= system option to add an additional value to the option as the first value of the option.
Comparison The INSERT= system option adds a new value to the beginning of the current value of the CMPLIB=, FMTSEARCH=, MAPS=, SASAUTOS=, or SASSCRIPT= system options. The APPEND= system option adds a new value to the end of one of these system options.
Examples The following table shows the results of adding a value to the beginning of the FMTSEARCH= option value:
Current FMTSEARCH= Value | Value of INSERT= System Option | New FMTSEARCH= Value
(WORK LIBRARY) | (fmtsearch=(abc def)) | (ABC DEF WORK LIBRARY)
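The table row above could be reproduced in a session with a sketch such as the following. The catalog names ABC and DEF come from the table; the GETOPTION call is added here only as one convenient way to display the resulting value in the log.

```sas
/* Start from the value in the first column of the table */
options fmtsearch=(work library);

/* Add ABC and DEF to the front of the search order
   without retyping the existing value */
options insert=(fmtsearch=(abc def));

/* Display the current FMTSEARCH= value in the SAS log */
%put FMTSEARCH is now %sysfunc(getoption(fmtsearch));
```

Using INSERT= rather than respecifying FMTSEARCH= avoids having to know the current value, which is useful when the value is set in a configuration file that you do not control.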
See Also System option: “APPEND= System Option” on page 1789
INTERVALDS= System Option
Specifies one or more interval name and value pairs, where the value is a SAS data set that contains user-supplied holidays. The interval can be used as an argument to the INTNX and INTCK functions.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Input control: Data processing
PROC OPTIONS GROUP= INPUTCONTROL
Requirement: The set of interval-value pairs must be enclosed in parentheses.
Syntax INTERVALDS=(interval-1=libref.dataset-name-1 <...interval-n=libref.dataset-name-n>)
Syntax Description
interval
specifies the name of an interval. The value of interval is the data set that is named in libref.dataset-name.
Requirement: When you specify multiple intervals, the interval name must not be the same as another interval.
libref.dataset-name
specifies the libref and the data set name of the file that contains user-supplied holidays.
Details The INTCK and INTNX functions specify interval as the interval name in the function argument list to reference a data set that names user-supplied intervals. The same libref.dataset-name can be assigned to different intervals. An error occurs when more than one interval of the same name is defined for the INTERVALDS system option.
Examples This example assigns a single data set to an interval on the SAS command line or in a configuration file.
-intervalds (mycompany=mycompany.holidays)
The next example assigns multiple intervals using the OPTIONS statement. The intervals subsid1 and subsid2 are assigned the same libref and data set name.
options intervalds=(mycompany=mycompany.holidays
                    subsid1=subsid.holidays
                    subsid2=subsid.holidays);
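The following sketch shows how a named interval might then be passed to the INTNX and INTCK functions. The interval name FISCALQ, the data set WORK.FISCALQ, and the dates are hypothetical. The sketch also assumes that the interval data set identifies the start of each period in a variable named BEGIN; this section does not describe the data set layout, so verify that requirement for your release.

```sas
/* Assumption: each observation marks the start of one period
   in a variable named BEGIN. Dates are illustrative only. */
data work.fiscalq;
   format begin date9.;
   begin = '01FEB2009'd; output;
   begin = '01MAY2009'd; output;
   begin = '01AUG2009'd; output;
   begin = '01NOV2009'd; output;
   begin = '01FEB2010'd; output;
run;

/* Associate the interval name with the data set */
options intervalds=(fiscalq=work.fiscalq);

data _null_;
   /* Start of the period that follows the one containing 15MAR2009 */
   next_start = intnx('fiscalq', '15MAR2009'd, 1);
   /* Number of period boundaries between the two dates */
   periods = intck('fiscalq', '15MAR2009'd, '15NOV2009'd);
   put next_start= date9. periods=;
run;
```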
See Also Functions: “INTCK Function” on page 806 “INTNX Function” on page 822 About Date and Time Intervals in SAS Language Reference: Concepts
INVALIDDATA= System Option
Specifies the value that SAS assigns to a variable when invalid numeric data is encountered.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Input control: Data Processing
PROC OPTIONS GROUP= INPUTCONTROL
Syntax INVALIDDATA=’character’
Syntax Description ’character’
specifies the value to be assigned, which can be a letter (A through Z, a through z), a period (.), or an underscore (_). The default value is a period.
Details The INVALIDDATA= system option specifies the value that SAS is to assign to a variable when invalid numeric data is read with an INPUT statement or the INPUT function.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
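As an illustration, the following sketch reads one deliberately invalid numeric field after setting INVALIDDATA= to a letter. The data set name, variable names, and data lines are hypothetical. With this setting, the invalid field should be stored as the special missing value that corresponds to the letter B rather than as an ordinary missing value (a period).

```sas
/* Mark invalid numeric input with the letter B instead of a period */
options invaliddata='B';

data work.scores;
   input id score;
   datalines;
1 85
2 abc
3 90
;
run;

/* The second observation holds the value assigned for invalid data */
proc print data=work.scores;
run;

/* Restore the default */
options invaliddata='.';
```

Using a letter makes it possible to distinguish values that were invalid on input from values that were genuinely missing in the source file.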
JPEGQUALITY= System Option
Specifies the JPEG quality factor that determines the ratio of image quality to the level of compression for JPEG files produced by the SAS/GRAPH JPEG device driver.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Requirement: The DEVICE graphic option must be set to the SAS/GRAPH JPEG device driver.
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax JPEGQUALITY= n | MIN | MAX
Syntax Description
n
specifies an integer that indicates the JPEG quality factor. The quality of the image increases with larger numbers and decreases with smaller numbers. JPEG files are compressed less for higher-quality images. Therefore, the JPEG file size is greater for higher-quality images. For example, n=100 is completely uncompressed and the image quality is highest. When n=0, the image is produced at the maximum compression level with the lowest quality.
Range: 0–100
Default: 75
MIN
sets the JPEG quality factor to 0, which has the lowest image quality and the greatest file compression.
MAX
sets the JPEG quality factor to 100, which has the highest image quality with no file compression.
Details The optimal quality value varies for each image. The default value of 75 is a good starting value that you can use to optimize the quality of an image within a compressed file. You can increase or decrease the value until you are satisfied with the image quality. Values between 50 and 95 produce the best quality images.
When the value is 24 or less, some viewers might not be able to display the JPEG file. When you create such a file, SAS writes the following caution to the SAS log:
Caution: quantization tables are too coarse for baseline JPEG.
See Also Graph options: DEVICE
LABEL System Option
Specifies whether SAS procedures can use labels with variables.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LISTCONTROL
Syntax LABEL | NOLABEL
Syntax Description
LABEL
specifies that SAS procedures can use labels with variables. The LABEL system option must be in effect before the LABEL option of any procedure can be used.
NOLABEL
specifies that SAS procedures cannot use labels with variables. If NOLABEL is specified, the LABEL option of a procedure is ignored.
Details A label is a string of up to 256 characters that can be written by certain procedures in place of the variable’s name.
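The interaction between the system option and a procedure's own LABEL option can be seen with a small sketch such as the following. The data set WORK.DEMO, the variable name, and the label text are hypothetical.

```sas
/* Hypothetical data set with one labeled variable */
data work.demo;
   height_cm = 170;
   label height_cm = 'Height in centimeters';
run;

/* With NOLABEL in effect, the LABEL option of PROC PRINT is ignored,
   so the column heading is the variable name */
options nolabel;
proc print data=work.demo label;
run;

/* With LABEL in effect, the same step can use the label as the heading */
options label;
proc print data=work.demo label;
run;
```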
See Also Data Set Option: “LABEL= Data Set Option” on page 36 Statements: “ODS PROCLABEL Statement” in SAS Output Delivery System: User’s Guide.
_LAST_= System Option
Specifies the most recently created data set.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax _LAST_=SAS-data-set
Syntax Description
SAS-data-set
specifies a SAS data set name.
Restriction: No data set options are allowed.
Restriction: Use libref.membername or membername syntax, not a string that is enclosed in quotation marks, to specify a SAS data set name.
Note: You can use quotation marks in the libref.membername or membername syntax if the libref or member name is associated with a SAS/ACCESS engine that supports member names with syntax that requires quoting or name literal (n-literal) specification. For more information, see the SAS/ACCESS documentation.
Details By default, SAS automatically keeps track of the most recently created SAS data set. Use the _LAST_= system option to override the default. _LAST_= is not allowed with data set options.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
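A minimal sketch of overriding the default follows. The data set names WORK.FIRST and WORK.SECOND are hypothetical. The PROC PRINT step omits the DATA= option so that it relies on the data set that _LAST_= names.

```sas
/* Two hypothetical data sets; WORK.SECOND is the most recently created */
data work.first;
   x = 1;
run;

data work.second;
   y = 2;
run;

/* Override the default so that procedures without DATA= use WORK.FIRST */
options _last_=work.first;

/* No DATA= option: the step processes the data set named by _LAST_= */
proc print;
run;
```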
LEFTMARGIN= System Option
Specifies the print margin for the left side of the page.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax LEFTMARGIN=margin-size<margin-unit>
Syntax Description
margin-size
specifies the size of the left print margin.
Restriction: The left margin should be small enough so that the left margin plus the right margin is less than the width of the paper.
Interactions: Changing the value of this option might result in changes to the value of the LINESIZE= system option.
margin-unit
specifies the units for margin-size. The margin-unit can be in for inches or cm for centimeters. The margin-unit is saved as part of the value of the LEFTMARGIN system option whether or not it is specified.
Default: inches
Details All margins have a minimum that is dependent on the printer and the paper size. The default value of the LEFTMARGIN system option is 0.00 in.
Operating Environment Information: Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings and option values for some SAS system options might vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
For additional information on declaring an ODS printer destination, see ODS statements in SAS Output Delivery System: User’s Guide. For additional information on the SAS universal print facility, see “Printing with SAS” in SAS Language Reference: Concepts.
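As a brief sketch, the margin can be given in either unit. The sizes shown are arbitrary illustrations, and the GETOPTION call is included only to display the stored value in the log.

```sas
/* One-inch left margin; the unit is written directly after the size */
options leftmargin=1in;
%put Left margin: %sysfunc(getoption(leftmargin));

/* The same option expressed in centimeters */
options leftmargin=2.5cm;
%put Left margin: %sysfunc(getoption(leftmargin));
```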
See Also System Options: “BOTTOMMARGIN= System Option” on page 1795 “RIGHTMARGIN= System Option” on page 1927 “TOPMARGIN= System Option” on page 1980
LINESIZE= System Option
Specifies the line size for the SAS log and for SAS procedure output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Alias: LS=
Category: Log and procedure output control: SAS log and procedure output
Log and procedure output control: SAS log
Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LOG_LISTCONTROL, LISTCONTROL, LOGCONTROL
See: LINESIZE= System Option in the documentation for your operating environment.
Syntax LINESIZE=n | MIN | MAX | hexX
Syntax Description
n
specifies the number of characters in a line.
MIN
sets the number of characters in a line to 64.
MAX
sets the number of characters in a line to 256.
hexX
specifies the number of characters in a line as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 0FAx sets the line size of the SAS procedure output to 250.
Details The LINESIZE= system option specifies the line size (printer line width) in characters for the SAS log and the SAS output that are used by the DATA step and procedures. The LINESIZE= system option affects the following output:
- the Output window for the ODS LISTING destination
- output produced for an ODS markup destination by a DATA step where the FILE statement destination is PRINT (the FILE PRINT ODS statement is not affected by the LINESIZE= system option)
- procedures that produce only characters that cannot be scaled, such as the PLOT procedure, the CALENDAR procedure, the TIMEPLOT procedure, the FORMS procedure, and the CHART procedure
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
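The forms of the value that are described above might be used as in the following sketch. The specific sizes are arbitrary, and the GETOPTION calls are added only to show the setting in the log after each change.

```sas
/* Decimal value */
options linesize=80;
%put LINESIZE is now %sysfunc(getoption(linesize));

/* Hexadecimal value through the LS= alias: FA is decimal 250.
   The leading 0 satisfies the rule that the value begins with a digit. */
options ls=0FAx;
%put LINESIZE is now %sysfunc(getoption(linesize));

/* Keyword value: the smallest line size */
options linesize=min;
%put LINESIZE is now %sysfunc(getoption(linesize));
```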
See Also “The SAS Log” in SAS Language Reference: Concepts
LOGPARM= System Option
Specifies when SAS log files are opened, closed, and, in conjunction with the LOG= system option, how they are named.
Valid in: configuration file, SAS invocation
Restriction: LOGPARM= is valid only in line mode and in batch mode
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
See: LOGPARM= System Option in the documentation for your operating environment.
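Syntax LOGPARM= “<OPEN=APPEND | REPLACE | REPLACEOLD> <ROLLOVER=AUTO | NONE | SESSION | n | nK | nM | nG> <WRITE=BUFFERED | IMMEDIATE>”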
Syntax Description
OPEN=APPEND | REPLACE | REPLACEOLD
when a log file already exists, specifies how the contents of the existing file are treated.
APPEND
appends the log when opening an existing file. If the file does not already exist, a new file is created.
REPLACE
overwrites the current contents when opening an existing file. If the file does not already exist, a new file is created.
REPLACEOLD
replaces files that are more than one day old. If the file does not already exist, a new file is created.
Operating Environment Information: For z/OS, see the SAS documentation for your operating environment for limitations on the use of OPEN=REPLACEOLD.
Default: REPLACE
ROLLOVER=AUTO | NONE | SESSION | n | nK | nM | nG
specifies when or if the SAS log “rolls over,” that is, when the current log is closed and a new one is opened.
AUTO
causes an automatic “rollover” of the log when the directives in the value of the LOG= option change, that is, the current log is closed and a new log file is opened.
Interaction: The name of the new log file is determined by the value of the LOG= system option. If LOG= does not contain a directive, however, the name would never change, so the log would never roll over, even when ROLLOVER=AUTO.
NONE
specifies that rollover does not occur, even when a change occurs in the name that is specified with the LOG= option.
Interaction: If the LOG= value contains any directives, they do not resolve. For example, if Log="#b.log" is specified, the directive “#” does not resolve, and the name of the log file remains "#b.log".
SESSION
at the beginning of each SAS session, opens the log file, resolves directives that are specified in the LOG= system option, and uses its resolved value to name the new log file. During the course of the session, no rollover is performed.
n | nK | nM | nG
causes the log to roll over when the log reaches a specific size, stated in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); or 1,073,741,824 (gigabytes). When the log reaches the specified size, it is closed and renamed by appending “old” to the log filename, and if it exists, the lock file for a server log. For example, a filename of 2008Dec01.log would be renamed 2008Dec01old.log. A new log file is opened using the name specified in the LOG= option.
CAUTION: Old log files can be overwritten. SAS maintains only one old log file with the same name as the open log file. If rollover occurs more than once, the old log file is overwritten.
Restriction: The minimum log file size is 10K.
See also: “Log Filenames” in SAS Language Reference: Concepts
Default: NONE
Interaction: Rollover is triggered by a change in the value of the LOG= option.
Restriction: Rollover will not occur more often than once a minute.
See Also: LOG= system option under Windows, UNIX, z/OS
WRITE=BUFFERED | IMMEDIATE
specifies when content is written to the SAS log.
BUFFERED
writes content to the SAS log only when a buffer is full in order to increase efficiency.
IMMEDIATE
writes to the SAS log each time that statements are submitted that produce content for the SAS log. SAS does no buffering of log messages.
Default: BUFFERED
Tip: Under Windows, the buffered log contents are written periodically, using an interval that is specified by SAS.
Details The LOGPARM= system option controls the opening and closing of SAS log files when SAS is operating in batch mode or in line mode. This option also controls the naming of new log files, in conjunction with the LOG= system option and the use of directives in the value of LOG=. Using directives in the value of the LOG= system option enables you to control when logs are opened and closed and how they are named, based on actual time events, such as time, month, and day of week.
Operating Environment Information: Under the Windows and UNIX operating environments, you can begin directives with either the % symbol or the # symbol, and use both symbols in the same directive. For example, -log=mylog%b#C.log. Under z/OS, begin directives only with the # symbol. For example, -log=mylog#b#c.log. Under OpenVMS, begin directives only with the % symbol. For example, -log=mylog%b%c.log.
The following table contains a list of directives that are valid in LOG= values:
Table 7.4 Directives for Controlling the Name of SAS Log Files
Directive | Description | Range
%a or #a | Locale’s abbreviated day of week | Sun–Sat
%A or #A | Locale’s full day of week | Sunday–Saturday
%b or #b | Locale’s abbreviated month | Jan–Dec
%B or #B | Locale’s full month | January–December
%C or #C | Century number | 00–99
%d or #d | Day of the month | 01–31
%H or #H | Hour | 00–23
%j or #j | Julian day | 001–366
%l or #l * | User name | alphanumeric string that is the name of the user that started SAS
%M or #M | Minutes | 00–59
%m or #m | Month number | 01–12
%n or #n | Current system nodename (without domain name) | none
%p or #p * | Process ID | alphanumeric string that is the SAS session process ID
%s or #s | Seconds | 00–59
%u or #u | Day of week | 1=Monday–7=Sunday
%v or #v * | Unique identifier | alphanumeric string that creates a log filename that does not currently exist
%w or #w | Day of week | 0=Sunday–6=Saturday
%W or #W | Week number (Monday as first day; all days in new year preceding first Monday are in week 00) | 00–53
%y or #y | Year without century | 00–99
%Y or #Y | Full year | 1970–9999
%% | Percent escape writes a single percent sign in the log filename. | %
## | Pound escape writes a single pound sign in the log filename. | #
* Because %v, %l, and %p are not time-based directives, the log filename will never change after it has been generated; therefore, the log will never roll over. In these situations, specifying ROLLOVER=AUTO is equivalent to specifying ROLLOVER=SESSION.
Operating Environment Information: See the SAS companion for z/OS for limitations on the length of the log filename under z/OS.
Note: Directives that you specify in the LOG= system option are not the same as the conversion characters that you specify to format logging facility logs. Directives specify a format for a log name. Conversion characters specify a format for log messages. Directives and conversion characters that use the same characters might function differently.
Note: If you start SAS in batch mode or server mode and the LOGCONFIGLOC= option is specified, logging is done by the SAS logging facility. The traditional SAS log option LOGPARM= is ignored. The traditional SAS log option LOG= is honored only when the %S{App.Log} conversion character is specified in the logging configuration file. For more information, see SAS Logging Facility in SAS Logging: Configuration and Programming Reference.
Examples
Operating Environment Information: The LOGPARM= system option is executed when SAS is invoked. When you invoke SAS at your site, the form of the syntax is specific to your operating environment. See the SAS documentation for your operating environment for details.
- Rolling over the log at a certain time and using directives to name the log according to the time: If this command is submitted at 9:43 AM, this example creates a log file called test0943.log, and the log rolls over each time the log filename changes. In this example, at 9:44 AM, the test0943.log file will be closed, and the test0944.log file will be opened.
  sas -log "test%H%M.log" -logparm "rollover=auto"
- Preventing log rollover but using directives to name the log: For a SAS session that begins at 9:34 AM, this example creates a log file named test0934.log, and prevents the log file from rolling over:
  sas -log "test%H%M.log" -logparm "rollover=session"
- Preventing log rollover and preventing the resolution of directives: This example creates a log file named test%H%M.log, ignores the directives, and prevents the log file from rolling over during the session:
  sas -log "test%H%M.log" -logparm "rollover=none"
- Creating log files with unique identifiers: This example uses a unique identifier to create a log file with a unique name:
  sas -log "test%v.log" -logparm "rollover=session"
  SAS replaces the directive %v with process_IDvn, where process_ID is a numeric process identifier that is determined by the operating system and n is an integer number, starting with 1. The letter v that is between process_ID and n is always a lowercase letter. For this example, process_ID is 3755. If the file does not already exist, SAS creates a log file with the name test3755v1.log. If test3755v1.log does exist, SAS attempts to create a log file by incrementing n by 1, and this process continues until SAS can generate a log file. For example, if the file test3755v1.log exists, SAS attempts to create the file test3755v2.log.
- Naming a log file by the user that started SAS: This example creates a log filename that contains the user name that started the SAS session:
  sas -log "%l.log" -logparm "rollover=session"
See Also “The SAS Log” in SAS Language Reference: Concepts
LRECL= System Option
Specifies the default logical record length to use for reading and writing external files.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: External files
PROC OPTIONS GROUP= EXTFILES
Syntax
LRECL=n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n
  specifies the logical record length in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); 1,073,741,824 (gigabytes); or 1,099,511,627,776 (terabytes). For example, a value of 32 specifies 32 bytes, and a value of 32k specifies 32,767 bytes.
  Default: 256
  Range: 1–32767
hexX
  specifies the logical record length as a hexadecimal value. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx sets the logical record length to 45 characters.
MIN
  specifies a logical record length of 1.
MAX
  specifies a logical record length of 32,767.
Details
The logical record length for reading or writing external files is first determined by the LRECL= option in the access method statement, function, or command that is used to read or write an individual file, or by the DDName value in the z/OS operating environment. If the logical record length is not specified by any of these means, SAS uses the value that is specified by the LRECL= system option.
Do not set the LRECL= system option to an arbitrarily large value. Large values for this option can result in excessive use of memory, which can degrade performance.
Operating Environment Information: Under z/OS, the LRECL= system option is recognized only for reading and writing HFS files.
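The precedence described above can be illustrated with a minimal sketch. The file name example.txt is a hypothetical placeholder, and the record lengths are arbitrary values chosen for illustration; the system option sets a session default, and the LRECL= option in the INFILE statement is expected to apply to that one file only.

```sas
/* Session-wide default logical record length for external files */
options lrecl=512;

/* The LRECL= option in the INFILE statement applies to this file only */
data _null_;
   infile 'example.txt' lrecl=1024;   /* hypothetical file name */
   input;
run;

/* Display the current value of the system option in the SAS log */
proc options option=lrecl;
run;
```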
MAPS= System Option
Specifies the location of the SAS library that contains SAS/GRAPH map data sets.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Graphics: Driver settings
PROC OPTIONS GROUP= GRAPHICS
See: MAPS= System Option in the documentation for your operating environment
Syntax
MAPS=location-of-maps
Syntax Description
location-of-maps
  specifies either a physical path, an environment variable, or a libref to locate the SAS/GRAPH map data sets.
  Restriction: If you specify a libref, you must specify the MAPS option in the configuration file.
Operating Environment Information: The syntax shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For more information, see the SAS documentation for your operating environment.
Operating Environment Information: Under the Windows, UNIX, and z/OS operating environments, you can use the APPEND or INSERT system options to add additional location-of-maps. For more information, see the documentation for the APPEND and INSERT system options.
See Also System Options: “APPEND= System Option” on page 1789 “INSERT= System Option” on page 1871
MERGENOBY System Option
Specifies the type of message that is issued when MERGE processing occurs without an associated BY statement.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax
MERGENOBY= NOWARN | WARN | ERROR
Syntax Description
NOWARN
  specifies that no warning message is issued. This is the default.
WARN
  specifies that a warning message is issued.
ERROR
  specifies that an error message is issued.
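As a minimal sketch of how the option might be used, the following program merges two small data sets without a BY statement. The data set names a, b, and both are hypothetical; with MERGENOBY=WARN in effect, a warning is expected in the SAS log for the MERGE step.

```sas
options mergenoby=warn;

/* Two one-observation data sets, for illustration only */
data a; x = 1; run;
data b; y = 2; run;

/* MERGE without a BY statement: a warning is expected in the log */
data both;
   merge a b;
run;

/* Restore the default */
options mergenoby=nowarn;
```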
MISSING= System Option
Specifies the character to print for missing numeric values.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category:
  Log and procedure output control: SAS log and procedure output
  Log and procedure output control: SAS log
  Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LOG_LISTCONTROL LISTCONTROL LOGCONTROL
Syntax
MISSING=character
Syntax Description
character
  specifies the value to be printed. The value can be any character. Single or double quotation marks are optional. The period is the default.
Operating Environment Information: The syntax that is shown above applies to the OPTIONS statement. However, when you specify the MISSING= system option on the command line or in a configuration file, the syntax is specific to your operating environment and might include additional or alternate punctuation. For details, see the SAS documentation for your operating environment.
Details
The MISSING= system option does not apply to special missing values such as .A and .Z.
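The following sketch illustrates the distinction. The data set name t and the hyphen are arbitrary choices for illustration; the ordinary missing value in x is expected to print with the MISSING= character, while the special missing value in y is not affected by the option.

```sas
options missing='-';

data t;
   x = .;    /* ordinary numeric missing value */
   y = .A;   /* special missing value          */
run;

proc print data=t;
run;

/* Restore the default character */
options missing='.';
```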
See Also “The SAS Log” in SAS Language Reference: Concepts
MSGLEVEL= System Option
Specifies the level of detail in messages that are written to the SAS log.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
Syntax
MSGLEVEL= N | I
Syntax Description
N
  specifies to print notes, warnings, CEDA messages, and error messages only. N is the default.
I
  specifies to print additional notes pertaining to index usage, merge processing, and sort utilities, along with standard notes, warnings, CEDA messages, and error messages.
Details
Some of the conditions under which the MSGLEVEL= system option applies are as follows:
• If MSGLEVEL=I, SAS writes informative messages to the SAS log about index processing. In general, when a WHERE expression is executed for a data set with indexes, the following information appears in the SAS log:
    • if an index is used, a message displays that specifies the name of the index
    • if an index is not used but one exists that could optimize at least one condition in the WHERE expression, messages provide suggestions that describe what you can do to influence SAS to use the index. For example, a message might suggest sorting the data set into index order or specifying more buffers.
    • a message displays the IDXWHERE= or IDXNAME= data set option value if the setting can affect index processing.
• If MSGLEVEL=I, SAS writes a warning to the SAS log when a MERGE statement would cause variables to be overwritten.
• If MSGLEVEL=I, SAS writes a message that indicates which sorting product was used.
• For informative messages about queries by an application to a SAS/SHARE server, MSGLEVEL=I must be set for the SAS session where the SAS/SHARE server is running. The messages are written to the SAS log for the SAS session that runs the SAS/SHARE server.
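The index-processing case can be tried with a short sketch such as the following. The data set work.emp and its variables are hypothetical; with MSGLEVEL=I in effect, an informative message about index usage is expected in the SAS log when the WHERE expression is processed, although whether SAS actually selects the index depends on the data.

```sas
options msglevel=i;

/* Create a small data set with a simple index on ID */
data work.emp (index=(id));
   do id = 1 to 1000;
      salary = id * 10;
      output;
   end;
run;

/* WHERE processing on an indexed variable */
proc print data=work.emp;
   where id = 500;
run;

/* Restore the default message level */
options msglevel=n;
```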
See Also “The SAS Log” in SAS Language Reference: Concepts
MULTENVAPPL System Option
Specifies whether a SAS application font selector window lists only the SAS fonts that are available in all operating environments.
Valid in: configuration file, SAS invocation, OPTIONS statement
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax
MULTENVAPPL | NOMULTENVAPPL
Syntax Description
MULTENVAPPL
  specifies that an application font selector window list only the SAS fonts.
NOMULTENVAPPL
  specifies that an application font selector window list only the operating environment fonts.
Details
The MULTENVAPPL system option enables applications that support a font selection window, such as SAS/AF, SAS/FSP, SAS/EIS, or SAS/GIS, to choose a SAS font that is supported in all operating environments. Choosing a SAS font ensures portability of applications across all operating environments.
When NOMULTENVAPPL is in effect, the application font selector window has available only the fonts that are specific to your operating environment. SAS might need to resize operating environment fonts, which could result in text that is difficult to read. If the application is ported to another environment and the font is not available, a font is selected by the operating environment.
NEWS= System Option
Specifies an external file that contains messages to be written to the SAS log, immediately after the header.
Valid in: configuration file, SAS invocation
Category:
  Environment control: Files
  Log and procedure output control: SAS log
PROC OPTIONS GROUP= ENVFILES LOGCONTROL
See: NEWS= System Option in the documentation for your operating environment.
Syntax
NEWS=external-file
Syntax Description
external-file
  specifies an external file.
Operating Environment Information: A valid file specification and its syntax are specific to your operating environment. Although the syntax is generally consistent with the command line syntax of your operating environment, it might include additional or alternate punctuation. For details, see the SAS documentation for your operating environment.
Details
The NEWS file can contain information for users, including news items about SAS. The contents of the NEWS file are written to the SAS log immediately after the SAS header.
See Also “The SAS Log” in SAS Language Reference: Concepts
NOTES System Option
Specifies whether notes are written to the SAS log.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
Syntax
NOTES | NONOTES
Syntax Description
NOTES
  specifies that SAS write notes to the SAS log.
NONOTES
  specifies that SAS does not write notes to the SAS log. NONOTES does not suppress error and warning messages.
Details
You must specify NOTES for SAS programs that you send to SAS for problem determination and resolution.
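A minimal sketch of toggling the option within a session follows. The DATA step is a trivial placeholder; while NONOTES is in effect, the notes that normally follow the step are not expected in the log, but errors and warnings would still appear.

```sas
/* Suppress notes for a noisy section of a program */
options nonotes;

data _null_;
   x = 1;
run;

/* Turn notes back on, for example before sending a log to SAS for problem determination */
options notes;
```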
See Also “The SAS Log” in SAS Language Reference: Concepts
NUMBER System Option
Specifies whether to print the page number in the title line of each page of SAS output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category:
  Log and procedure output control: SAS log and procedure output
  Log and procedure output control: SAS log
  Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LOG_LISTCONTROL LISTCONTROL LOGCONTROL
Syntax
NUMBER | NONUMBER
Syntax Description
NUMBER
  specifies that SAS print the page number on the first title line of each page of SAS output.
NONUMBER
  specifies that SAS not print the page number on the first title line of each page of SAS output.
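As an illustrative sketch, the following statements turn page numbering off for one report and then restore it. The example assumes that the SASHELP.CLASS sample data set is available at your site; any data set could be substituted, and the title text is arbitrary.

```sas
options nonumber;

title 'Report Without Page Numbers';
proc print data=sashelp.class;   /* assumes the SASHELP sample library is installed */
run;

/* Restore page numbering and clear the title */
options number;
title;
```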
See Also “The SAS Log” in SAS Language Reference: Concepts
OBS= System Option
Specifies the observation that is used to determine the last observation to process, or specifies the last record to process.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
See: OBS= System Option in the documentation for your operating environment
Syntax
OBS= n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n | nK | nM | nG | nT
  specifies a number to indicate when to stop processing, with n being an integer. Using one of the letter notations results in multiplying the integer by a specific value. That is, specifying K (kilo) multiplies the integer by 1,024; M (mega) multiplies by 1,048,576; G (giga) multiplies by 1,073,741,824; or T (tera) multiplies by 1,099,511,627,776. For example, a value of 20 specifies 20 observations or records, while a value of 3m specifies 3,145,728 observations or records.
hexX
  specifies a number to indicate when to stop processing as a hexadecimal value. You must specify the value beginning with a number (0–9), followed by an X. For example, the hexadecimal value F8 must be specified as 0F8x in order to specify the decimal equivalent of 248. The value 2dx specifies the decimal equivalent of 45.
MIN
  sets the number to 0 to indicate when to stop processing.
  Interaction: If OBS=0 and the NOREPLACE option is in effect, then SAS can still take certain actions because it actually executes each DATA and PROC step in the program, using no observations. For example, SAS executes procedures, such as CONTENTS and DATASETS, that process libraries or SAS data sets. External files are also opened and closed. Therefore, even if you specify OBS=0, when your program writes to an external file with a PUT statement, an end-of-file mark is written, and any existing data in the file is deleted.
MAX
  sets the number to indicate when to stop processing to the maximum number of observations or records, up to the largest eight-byte, signed integer, which is 2^63 - 1, or approximately 9.2 quintillion. This is the default.
Details
OBS= tells SAS when to stop processing observations or records. To determine when to stop processing, SAS uses the value for OBS= in a formula that includes the value for OBS= and the value for FIRSTOBS=. The formula is
    (obs - firstobs) + 1 = results
For example, if OBS=10 and FIRSTOBS=1 (which is the default for FIRSTOBS=), the result is 10 observations or records, that is, (10 - 1) + 1 = 10. If OBS=10 and FIRSTOBS=2, the result is nine observations or records, that is, (10 - 2) + 1 = 9. OBS= is valid for all steps during your current SAS session or until you change the setting.
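The arithmetic in the formula can be checked with a short DATA step. This is only an illustrative sketch: obs, firstobs, and results are ordinary DATA step variables named to mirror the formula, not the system options themselves.

```sas
data _null_;
   obs      = 10;
   firstobs = 2;
   results  = (obs - firstobs) + 1;
   put results=;   /* writes results=9 to the SAS log */
run;
```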
You can also use OBS= to control analysis of SAS data sets in PROC steps. If SAS is processing a raw data file, OBS= specifies the last line of data to read. SAS counts a line of input data as one observation, even if the raw data for several SAS data set observations is on a single line.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
Comparisons
• An OBS= specification from either a data set option or an INFILE statement option takes precedence over the OBS= system option.
• While the OBS= system option specifies an ending point for processing, the FIRSTOBS= system option specifies a starting point. The two options are often used together to define a range of observations to be processed.
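The precedence rule in the first comparison can be sketched as follows. The data set work.nums is hypothetical and is created before the system option is changed so that all 20 observations exist; the values 5 and 8 are arbitrary.

```sas
/* Build a 20-observation data set while OBS= is at its default */
data work.nums;
   do i = 1 to 20;
      output;
   end;
run;

options obs=5;

/* The system option applies: processing is expected to stop after observation 5 */
proc print data=work.nums;
run;

/* The OBS= data set option takes precedence over the system option for this step */
proc print data=work.nums(obs=8);
run;

/* Restore the default so that later steps process all observations */
options obs=max;
```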
Examples
Example 1: Using OBS= to Specify When to Stop Processing Observations
This example illustrates the result of using OBS= to tell SAS when to stop processing observations. This example creates a SAS data set, executes the OPTIONS statement by specifying FIRSTOBS=2 and OBS=12, and executes the PRINT procedure. The result is 11 observations, that is, (12 - 2) + 1 = 11. The result of OBS= in this situation appears to be the observation number that SAS processes last, because the output starts with observation 2, and ends with observation 12, but this result is only a coincidence.
    data Ages;
       input Name $ Age;
       datalines;
    Miguel 53
    Brad 27
    Willie 69
    Marc 50
    Sylvia 40
    Arun 25
    Gary 40
    Becky 51
    Alma 39
    Tom 62
    Kris 66
    Paul 60
    Randy 43
    Barbara 52
    Virginia 72
    ;
    options firstobs=2 obs=12;
    proc print data=Ages;
    run;
Output 7.4 PROC PRINT Output Using OBS= and FIRSTOBS=
    The SAS System                    1
    Obs    Name        Age
      2    Brad         27
      3    Willie       69
      4    Marc         50
      5    Sylvia       40
      6    Arun         25
      7    Gary         40
      8    Becky        51
      9    Alma         39
     10    Tom          62
     11    Kris         66
     12    Paul         60
Example 2: Using OBS= with WHERE Processing
This example illustrates the result of using OBS= along with WHERE processing. The example uses the data set that was created in Example 1, which contains 15 observations, and the example assumes a new SAS session with the defaults FIRSTOBS=1 and OBS=MAX. First, here is the PRINT procedure with a WHERE statement. The subset of the data results in 12 observations:
    proc print data=Ages;
       where Age LT 65;
    run;
Output 7.5 PROC PRINT Output Using a WHERE Statement
    The SAS System                    1
    Obs    Name        Age
      1    Miguel       53
      2    Brad         27
      4    Marc         50
      5    Sylvia       40
      6    Arun         25
      7    Gary         40
      8    Becky        51
      9    Alma         39
     10    Tom          62
     12    Paul         60
     13    Randy        43
     14    Barbara      52
Executing the OPTIONS statement with OBS=10 and the PRINT procedure with the WHERE statement results in 10 observations, that is, (10 - 1) + 1 = 10. Note that with WHERE processing, SAS first subsets the data and then SAS applies OBS= to the subset.
    options obs=10;
    proc print data=Ages;
       where Age LT 65;
    run;
Output 7.6 PROC PRINT Output Using a WHERE Statement and OBS=
    The SAS System                    2
    Obs    Name        Age
      1    Miguel       53
      2    Brad         27
      4    Marc         50
      5    Sylvia       40
      6    Arun         25
      7    Gary         40
      8    Becky        51
      9    Alma         39
     10    Tom          62
     12    Paul         60
The result of OBS= appears to be how many observations to process, because the output consists of 10 observations, ending with the observation number 12. However, the result is only a coincidence. If you apply FIRSTOBS=2 and OBS=10 to the subset, the result is nine observations, that is, (10 - 2) + 1 = 9. OBS= in this situation is neither the observation number to end with nor how many observations to process; the value is used in the formula to determine when to stop processing.
    options firstobs=2 obs=10;
    proc print data=Ages;
       where Age LT 65;
    run;
Output 7.7 PROC PRINT Output Using WHERE Statement, OBS=, and FIRSTOBS=
    The SAS System                    3
    Obs    Name        Age
      2    Brad         27
      4    Marc         50
      5    Sylvia       40
      6    Arun         25
      7    Gary         40
      8    Becky        51
      9    Alma         39
     10    Tom          62
     12    Paul         60
Example 3: Using OBS= When Observations Are Deleted
This example illustrates the result of using OBS= for a data set that has deleted observations. The example uses the data set that was created in Example 1, with observation 6 deleted. The example also assumes a new SAS session with the defaults FIRSTOBS=1 and OBS=MAX. First, here is PROC PRINT output of the modified file:
    proc print data=Ages;
    run;
Output 7.8 PROC PRINT Output Showing Observation 6 Deleted
    The SAS System                    1
    Obs    Name        Age
      1    Miguel       53
      2    Brad         27
      3    Willie       69
      4    Marc         50
      5    Sylvia       40
      7    Gary         40
      8    Becky        51
      9    Alma         39
     10    Tom          62
     11    Kris         66
     12    Paul         60
     13    Randy        43
     14    Barbara      52
     15    Virginia     72
Executing the OPTIONS statement with OBS=12, then the PRINT procedure, results in 12 observations, that is, (12 - 1) + 1 = 12:
    options obs=12;
    proc print data=Ages;
    run;
Output 7.9 PROC PRINT Output Using OBS=
    The SAS System                    2
    Obs    Name        Age
      1    Miguel       53
      2    Brad         27
      3    Willie       69
      4    Marc         50
      5    Sylvia       40
      7    Gary         40
      8    Becky        51
      9    Alma         39
     10    Tom          62
     11    Kris         66
     12    Paul         60
     13    Randy        43
The result of OBS= appears to be how many observations to process, because the output consists of 12 observations, ending with the observation number 13. However, if you apply FIRSTOBS=2 and OBS=12, the result is 11 observations, that is, (12 - 2) + 1 = 11. OBS= in this situation is neither the observation number to end with nor how many observations to process; the value is used in the formula to determine when to stop processing.
    options firstobs=2 obs=12;
    proc print data=Ages;
    run;
Output 7.10 PROC PRINT Output Using OBS= and FIRSTOBS=
    The SAS System                    3
    Obs    Name        Age
      2    Brad         27
      3    Willie       69
      4    Marc         50
      5    Sylvia       40
      7    Gary         40
      8    Becky        51
      9    Alma         39
     10    Tom          62
     11    Kris         66
     12    Paul         60
     13    Randy        43
See Also Data Set Options: “FIRSTOBS= Data Set Option” on page 25 “OBS= Data Set Option” on page 38 “REPLACE= Data Set Option” on page 54
System Option: “FIRSTOBS= System Option” on page 1851
ORIENTATION= System Option
Specifies the paper orientation to use when printing to a printer.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax
ORIENTATION=PORTRAIT | LANDSCAPE | REVERSEPORTRAIT | REVERSELANDSCAPE
Syntax Description
PORTRAIT
  specifies the paper orientation as portrait. This is the default.
LANDSCAPE
  specifies the paper orientation as landscape.
REVERSEPORTRAIT
  specifies the paper orientation as reverse portrait to enable printing on paper with prepunched holes. The reverse side of the page is printed upside down.
REVERSELANDSCAPE
  specifies the paper orientation as reverse landscape to enable printing on paper with prepunched holes. The reverse side of the page is printed upside down.
Details
Changing the value of this option might result in changes to the values of the portable LINESIZE= and PAGESIZE= system options.
Operating Environment Information: Most SAS system options are initialized with default settings when you invoke SAS. However, the default settings for some SAS system options vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
For additional information on declaring an ODS printer destination, see ODS statements in SAS Output Delivery System: User’s Guide. For additional information on the SAS universal print facility, see “Printing with SAS” in SAS Language Reference: Concepts.
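One way to observe the interaction with LINESIZE= and PAGESIZE= is to write their values to the log before and after changing the orientation. This is a sketch only; whether the values actually change depends on your printer setup and operating environment.

```sas
/* Values before the change */
%put Before: LINESIZE=%sysfunc(getoption(linesize)) PAGESIZE=%sysfunc(getoption(pagesize));

options orientation=landscape;

/* Values after the change; they might or might not differ at your site */
%put After: LINESIZE=%sysfunc(getoption(linesize)) PAGESIZE=%sysfunc(getoption(pagesize));

/* Restore the default orientation */
options orientation=portrait;
```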
OVP System Option
Specifies whether error messages are overprinted to make them bold.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
Syntax
OVP | NOOVP
Syntax Description
OVP
  specifies that overprinting of error messages is enabled.
NOOVP
  specifies that overprinting of error messages is disabled. This is the default.
Details
When OVP is specified, error messages are emphasized when SAS overprints the error message two additional times with overprint characters.
When output is displayed to a monitor, OVP is overridden and is changed to NOOVP.
See Also “The SAS Log” in SAS Language Reference: Concepts
PAGEBREAKINITIAL System Option
Specifies whether to begin the SAS log and procedure output files on a new page.
Valid in: configuration file, SAS invocation
Category:
  Log and procedure output control: SAS log and procedure output
  Log and procedure output control: SAS log
  Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LOG_LISTCONTROL LISTCONTROL LOGCONTROL
See: PAGEBREAKINITIAL System Option in the documentation for your operating environment.
Syntax
PAGEBREAKINITIAL | NOPAGEBREAKINITIAL
Syntax Description
PAGEBREAKINITIAL
  specifies to begin the SAS log and procedure output files on a new page.
NOPAGEBREAKINITIAL
  specifies not to begin the SAS log and procedure output files on a new page. NOPAGEBREAKINITIAL is the default.
Details
The PAGEBREAKINITIAL option inserts a page break at the start of the SAS log and procedure output files.
See Also “The SAS Log” in SAS Language Reference: Concepts
PAGENO= System Option
Resets the SAS output page number.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LISTCONTROL
See: PAGENO= System Option in the documentation for your operating environment.
Syntax
PAGENO=n | nK | hexX | MIN | MAX
Syntax Description
n | nK
  specifies the page number in multiples of 1 (n); 1,024 (nK). For example, a value of 8 sets the page number to 8 and a value of 3k sets the page number to 3,072.
hexX
  specifies the page number as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx sets the page number to 45.
MIN
  sets the page number to the minimum number, 1.
MAX
  specifies the maximum page number as the largest signed, four-byte integer that is representable in your operating environment.
Details
The PAGENO= system option specifies a beginning page number for the next page of output that SAS produces. Use PAGENO= to reset page numbering during a SAS session.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
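As a brief sketch of resetting page numbering during a session, the following statements restart numbering before one report and then set it with the hexadecimal notation described above. The example assumes that the SASHELP.CLASS sample data set is available; any data set could be used instead.

```sas
/* Restart page numbering at 1 for the next page of output */
options pageno=1;
proc print data=sashelp.class;   /* assumes the SASHELP sample library is installed */
run;

/* Hexadecimal notation: 2dx is the decimal value 45 */
options pageno=2dx;
proc means data=sashelp.class;
run;
```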
PAGESIZE= System Option
Specifies the number of lines that compose a page of SAS output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Alias: PS=
Category:
  Log and procedure output control: SAS log and procedure output
  Log and procedure output control: SAS log
  Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LOG_LISTCONTROL LISTCONTROL LOGCONTROL
See: PAGESIZE= System Option in the documentation for your operating environment.
Syntax
PAGESIZE=n | nK | hexX | MIN | MAX
Syntax Description
n | nK
  specifies the number of lines that compose a page in terms of lines (n) or units of 1,024 lines (nK).
hexX
  specifies the number of lines that compose a page as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx sets the number of lines that compose a page to 45 lines.
MIN
  sets the number of lines that compose a page to the minimum setting, 15.
MAX
  sets the number of lines that compose a page to the maximum setting, 32,767.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, valid values and range vary with your operating environment. For details, see the SAS documentation for your operating environment.
Details
The PAGESIZE= system option affects the following output:
• the Output window for the ODS LISTING destination
• the ODS markup destinations when the PRINT option is used in the FILE statement in a DATA step (the FILE PRINT ODS statement is not affected by the PAGESIZE= system option)
• procedures that produce characters that cannot be scaled, such as the PLOT procedure, the CALENDAR procedure, the TIMEPLOT procedure, the FORMS procedure, and the CHART procedure
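A minimal sketch of setting the option in an OPTIONS statement follows. The value 40 is an arbitrary choice within the MIN and MAX settings described above; the %PUT statement simply writes the current setting to the SAS log.

```sas
/* Set 40 lines per page; PS= is the documented alias for PAGESIZE= */
options pagesize=40;

/* Write the current setting to the SAS log */
%put PAGESIZE=%sysfunc(getoption(pagesize));
```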
See Also “The SAS Log” in SAS Language Reference: Concepts
PAPERDEST= System Option
Specifies the name of the output bin to receive printed output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Restriction: This option is ignored if the printer does not have multiple output bins.
Syntax
PAPERDEST=printer-bin-name
Syntax Description
printer-bin-name
  specifies the bin to receive printed output.
  Restriction: Maximum length is 200 characters.
Operating Environment Information: Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings and option values for some SAS system options might vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
For additional information on declaring an ODS printer destination, see ODS statements in SAS Output Delivery System: User’s Guide. For additional information on the SAS universal print facility, see “Printing with SAS” in SAS Language Reference: Concepts.
See Also System Options: “PAPERSIZE= System Option” on page 1901 “PAPERSOURCE= System Option” on page 1903 “PAPERTYPE= System Option” on page 1904
PAPERSIZE= System Option
Specifies the paper size to use for printing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category:
  Environment control: Language control
  Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= LANGUAGECONTROL ODSPRINT
Syntax
PAPERSIZE=paper_size_name | (“width_value” “height_value”) | (’width_value’ ’height_value’) | (width_value height_value)
Syntax Description
paper_size_name
  specifies a predefined paper size. The default is either LETTER or A4, depending on the locale.
  Default: Letter
  Valid Values: Refer to the Registry Editor, or use PROC REGISTRY to obtain a listing of supported paper sizes. Additional values can be added.
  Requirement: When the name of a predefined paper size contains spaces, enclose the name in single or double quotation marks.
  Restriction: The maximum length is 200 characters.
(“width_value”, “height_value”)
  specifies paper width and height as positive floating-point values.
  Default: inches
  Range: in or cm for width_value, height_value
Details
If you specify a predefined paper size or a custom size that is not supported by your printer, the printer default paper size is used. The printer default paper size is locale dependent and can be changed using the Page Setup dialog box.
Fields that specify values for paper sizes can be separated by either blanks or commas.
Note: Changing the value of this option can result in changes to the values of the portable LINESIZE= and PAGESIZE= system options.
Operating Environment Information: Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings and option values for some SAS system options can vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
For additional information on declaring an ODS printer destination, see ODS statements in the SAS Output Delivery System: User’s Guide. For additional information on the SAS universal print facility, see “Printing with SAS” in SAS Language Reference: Concepts.
Examples
The first OPTIONS statement sets a paper size value that is a paper size name from the SAS Registry. The second OPTIONS statement sets a specific width and height for a paper size.
    options papersize="460x640 Pixels";
    options papersize=("4.5" "7");
In the first example, quotation marks are required because there is a space in the name. In the second example, quotation marks are not required. When no measurement units are specified, SAS writes the following warning to the SAS log:
    WARNING: Units were not specified on the PAPERSIZE option. Inches will be used.
    WARNING: Units were not specified on the PAPERSIZE option. Inches will be used.
You can avoid the warning message by adding the unit type, in or cm, to the value with no space separating the value and the unit type:
    options papersize=(4.5in 7in);
See Also System Options: “PAPERDEST= System Option” on page 1901 “PAPERSOURCE= System Option” on page 1903 “PAPERTYPE= System Option” on page 1904
PAPERSOURCE= System Option
Specifies the name of the paper bin to use for printing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Restriction: This option is ignored if the printer does not have multiple input bins.
Syntax
PAPERSOURCE=printer-bin-name
Syntax Description
printer-bin-name
  specifies the bin that sends paper to the printer.
Operating Environment Information: For instructions on how to specify a printer bin, see the SAS documentation for your operating environment.
See Also System Options: “PAPERDEST= System Option” on page 1901 “PAPERSIZE= System Option” on page 1901 “PAPERTYPE= System Option” on page 1904 For information about declaring an ODS printer destination, see the ODS PRINTER statement in SAS Output Delivery System: User’s Guide. For information about SAS Universal Printing, see Printing with SAS in SAS Language Reference: Concepts.
PAPERTYPE= System Option
Specifies the type of paper to use for printing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax
PAPERTYPE=paper-type-string
Syntax Description
paper-type-string
   specifies the type of paper. Maximum length is 200.
   Range: Values vary by printer, site, and operating environment.
   Default: Values vary by site and operating environment.
Operating Environment Information: For instructions on how to specify the type of paper, see the SAS documentation for your operating environment. There is a very large number of possible values for this option.
See Also System Options: “PAPERDEST= System Option” on page 1901 “PAPERSIZE= System Option” on page 1901 “PAPERSOURCE= System Option” on page 1903 For information about declaring an ODS printer destination, see the ODS PRINTER statement in SAS Output Delivery System: User’s Guide. For information about SAS Universal Printing, see Printing with SAS in SAS Language Reference: Concepts.
PARM= System Option
Specifies a parameter string that is passed to an external program.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
Syntax
PARM=string
Syntax Description
string
   specifies a character string that contains a parameter.
Examples
This statement passes the parameter X=2 to an external program:
   options parm='x=2';
Operating Environment Information: Other methods of passing parameters to external programs depend on your operating environment and on whether you are running in interactive line mode or batch mode. For details, see the SAS documentation for your operating environment.
PARMCARDS= System Option
Specifies the file reference to open when SAS encounters the PARMCARDS statement in a procedure.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
See: PARMCARDS= System Option in the documentation for your operating environment.
Syntax
PARMCARDS=file-ref
Syntax Description
file-ref
   specifies the file reference to open.
Details
The PARMCARDS= system option specifies the file reference of a file that SAS opens when it encounters a PARMCARDS (or PARMCARDS4) statement in a procedure. SAS writes all data lines after the PARMCARDS (or PARMCARDS4) statement to the file until it encounters a delimiter line of either one or four semicolons. The file is then closed and made available to the procedure to read. There is no parsing or macro expansion of the data lines.
Operating Environment Information: The syntax shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
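As a minimal sketch of the setup described above, the following statements point the PARMCARDS= option at a temporary file. The fileref name parmfile is a hypothetical choice made for illustration, and the procedure that would later read the data lines is not shown.

```sas
/* Hypothetical fileref name; the TEMP device allocates a temporary file. */
filename parmfile temp;

/* Data lines that follow a PARMCARDS statement are written to this fileref. */
options parmcards=parmfile;

/* Write the current PARMCARDS= setting to the SAS log. */
proc options option=parmcards;
run;
```

Because the data lines are neither parsed nor macro-expanded, whatever follows the PARMCARDS statement reaches the file unchanged.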
PDFACCESS System Option
Specifies whether text and graphics from PDF documents can be read by screen readers for the visually impaired.
Requirement: Adobe Acrobat Reader or Professional 5.0 and later versions
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFACCESS | NOPDFACCESS
Syntax Description
PDFACCESS
   specifies that text and graphics from an ODS PDF document can be read by screen readers for the visually impaired.
NOPDFACCESS
   specifies that text and graphics from an ODS PDF document cannot be read by screen readers for the visually impaired.
Details
When the PDFSECURITY system option is set to HIGH, SAS sets the PDFACCESS option. If the PDFSECURITY option is set to LOW or NONE, this option is not functional. When the PDFSECURITY option is set to NONE, screen readers can read PDF text and graphics. The following document properties are set for this option:

Value of PDFACCESS    Value of PDFSECURITY    Document Properties
NOPDFACCESS           HIGH                    Content Extraction for Accessibility is set to Not Allowed.
See Also System Options: “PDFSECURITY= System Option” on page 1917 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFASSEMBLY System Option
Specifies whether PDF documents can be assembled.
Requirement: Adobe Acrobat Reader or Professional 5.0 and later versions
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFASSEMBLY | NOPDFASSEMBLY
Syntax Description
PDFASSEMBLY
   specifies that PDF documents can be assembled.
NOPDFASSEMBLY
   specifies that PDF documents cannot be assembled. This is the default.
Details When a PDF document is assembled, pages can be rotated, inserted, and deleted, and bookmarks and thumbnail images can be added. When the PDFSECURITY system option is set to HIGH, SAS sets the PDFASSEMBLY option. If the PDFSECURITY option is set to LOW or NONE, this option is not functional. When the PDFSECURITY option is set to NONE, PDF documents can be assembled.
See Also System Options: “PDFSECURITY= System Option” on page 1917 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFCOMMENT System Option
Specifies whether PDF document comments can be modified.
Requirement: Adobe Acrobat Reader or Professional 5.0 and later versions
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFCOMMENT | NOPDFCOMMENT
Syntax Description
PDFCOMMENT
   specifies that PDF document comments can be modified.
NOPDFCOMMENT
   specifies that PDF document comments cannot be modified. This is the default.
Details
When the PDFSECURITY system option is set to either LOW or HIGH, SAS sets the PDFCOMMENT option. If the PDFSECURITY option is set to NONE, the PDFCOMMENT option is not functional and PDF document comments can be modified. The following document properties are set for this option:

Value of PDFCOMMENT    Value of PDFSECURITY    Document Properties
NOPDFCOMMENT           LOW                     Commenting is set to Not Allowed.
                                               Filling in of fields is set to Not Allowed.
When PDFSECURITY=LOW, the settings for the PDFCOMMENT and PDFFILLIN options are dependent on each other. A change in either of these options changes the other option to the similar setting. For example, if PDFSECURITY=LOW, and PDFCOMMENT and PDFFILLIN are set, and if the PDFCOMMENT setting is modified to NOPDFCOMMENT, then SAS sets NOPDFFILLIN. When PDFSECURITY=HIGH, PDFCOMMENT and PDFFILLIN can be set independently.
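The interaction described above can be checked with a short sketch such as the following. The password value is a placeholder, and the statements only display the resulting settings; they make no claim about what a particular SAS release prints.

```sas
/* Placeholder password: at least one password must be set when
   PDFSECURITY is LOW or HIGH. */
options pdfpassword=(owner="ownerpw");
options pdfsecurity=low;

/* Disallow commenting; under PDFSECURITY=LOW the text above says
   that SAS then also sets NOPDFFILLIN. */
options nopdfcomment;

/* Write both settings to the SAS log so that they can be compared. */
proc options option=pdfcomment;
run;
proc options option=pdffillin;
run;
```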
See Also System Options: “PDFFILLIN System Option” on page 1912 “PDFSECURITY= System Option” on page 1917 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFCONTENT System Option
Specifies whether the contents of a PDF document can be changed.
Requirement: Adobe Acrobat Reader or Professional 3.0 and later versions
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFCONTENT | NOPDFCONTENT
Syntax Description
PDFCONTENT
   specifies that the contents of a PDF document can be changed.
NOPDFCONTENT
   specifies that the contents of a PDF document cannot be changed.
Details
When the PDFSECURITY option is set to either LOW or HIGH, SAS sets the PDFCONTENT option. If the PDFSECURITY option is set to NONE, this option is not functional and the PDF document can be changed. The following document properties are set for this option:

Value of PDFCONTENT    Value of PDFSECURITY    Document Properties
PDFCONTENT             HIGH                    Page Extraction and Commenting are set to Not Allowed.
NOPDFCONTENT           Not applicable          Changing the Document and Document Assembly are both set to Not Allowed.
See Also System Options: “PDFSECURITY= System Option” on page 1917 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFCOPY System Option
Specifies whether text and graphics from a PDF document can be copied.
Requirement: Adobe Acrobat Reader or Professional 3.0 and later versions
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFCOPY | NOPDFCOPY
Syntax Description
PDFCOPY
   specifies that text and graphics from a PDF document can be copied. This is the default.
NOPDFCOPY
   specifies that text and graphics from a PDF document cannot be copied.
Details
When the PDFSECURITY system option is set to either LOW or HIGH, SAS sets the PDFCOPY option. If the PDFSECURITY option is set to NONE, this option is not functional and PDF documents can be copied. The following document properties are set for this option:

Value of PDFCOPY    Value of PDFSECURITY    Document Properties
NOPDFCOPY           LOW                     Printing, Content Copying, and Content Copying for Accessibility are set to Allowed. All other properties are set to Not Allowed.
NOPDFCOPY           HIGH                    Changing the Document, Document Assembly, Content Copying, Page Extraction, and Commenting are set to Not Allowed.
See Also System Options: “PDFSECURITY= System Option” on page 1917 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFFILLIN System Option
Specifies whether PDF forms can be filled in.
Requirement: Adobe Acrobat Reader or Professional 5.0 and later versions
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFFILLIN | NOPDFFILLIN
Syntax Description
PDFFILLIN
   specifies that PDF forms can be filled in.
NOPDFFILLIN
   specifies that PDF forms cannot be filled in.
Details
When the PDFSECURITY option is set to HIGH, SAS sets the PDFFILLIN option. If the PDFSECURITY option is set to LOW or NONE, this option is not functional. When the PDFSECURITY option is set to NONE, PDF forms can be filled in. The following document properties are set for this option:

Value of PDFFILLIN    Value of PDFSECURITY    Document Properties
NOPDFFILLIN           LOW                     Changing the Document, Document Assembly, Page Extraction, Commenting, Filling of form fields, Signing, and Creation of Template Pages are set to Not Allowed.
When PDFSECURITY=LOW, the settings for the PDFCOMMENT and PDFFILLIN options are dependent on each other. A change in either of these options changes the other option to the similar setting. For example, if PDFSECURITY=LOW, and PDFCOMMENT and PDFFILLIN are set, and if the PDFCOMMENT setting is modified to NOPDFCOMMENT, then SAS sets NOPDFFILLIN. When PDFSECURITY=HIGH, PDFCOMMENT and PDFFILLIN can be set independently.
See Also System Options: “PDFCOMMENT System Option” on page 1908 “PDFSECURITY= System Option” on page 1917 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFPAGELAYOUT= System Option
Specifies the page layout for PDF documents.
Requirement: Adobe Acrobat Reader or Professional 5.0 and later versions
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFPAGELAYOUT= DEFAULT | SINGLEPAGE | CONTINUOUS | FACING | CONTINUOUSFACING
Syntax Description
DEFAULT
   specifies to use the current page layout for Acrobat Reader. This is the default.
SINGLEPAGE
   specifies to display one page at a time in the viewing area.
CONTINUOUS
   specifies to display all document pages in the viewing area in a single column.
FACING
   specifies to display only two pages in the viewing area, with the even pages on the left and the odd pages on the right.
   Requirement: Acrobat Reader 5.0 or later version is required.
CONTINUOUSFACING
   specifies to display all pages in the viewing area, two pages side by side. The even pages display on the left, and the odd pages display on the right.
See Also System option: “PDFPAGEVIEW= System Option” on page 1914 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFPAGEVIEW= System Option
Specifies the page viewing mode for PDF documents.
Requirement: Adobe Acrobat Reader or Professional 5.0 and later versions
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFPAGEVIEW= DEFAULT | ACTUAL | FITPAGE | FITWIDTH | FULLSCREEN
Syntax Description
DEFAULT
   specifies to use the current page view setting for Acrobat Reader. This is the default.
ACTUAL
   specifies to set the page view setting to 100%.
FITPAGE
   specifies to view a page using the full extent of the viewing window, maintaining the height and width aspect ratio.
FITWIDTH
   specifies to view a page using the full width of the viewing window. The height of the document is not scaled to fit the page.
FULLSCREEN
   specifies to view a page using the full screen. This option disables the table of contents, bookmarks, and all other document access aids, such as accessing a specific page.
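A minimal sketch of how the page layout and page view options might be combined before ODS PDF output is produced. The output file name report.pdf is a placeholder, and SASHELP.CLASS is used only as convenient sample data.

```sas
/* Ask the viewer to open the document as one continuous column,
   scaled to the width of the viewing window. */
options pdfpagelayout=continuous pdfpageview=fitwidth;

/* Placeholder output file name. */
ods pdf file="report.pdf";

proc print data=sashelp.class;
run;

ods pdf close;
```

The values are set before the ODS PDF destination is opened here so that they are in effect for the whole document.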
See Also System option: “PDFPAGELAYOUT= System Option” on page 1913 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFPASSWORD= System Option
Specifies the password to use to open a PDF document and the password used by a PDF document owner.
Requirement: Adobe Acrobat Reader or Professional 3.0 and later versions
Alias: PDFPW
Valid in: configuration file, SAS invocation, OPTIONS statement
Category: Log and procedure output control: PDF
          System administration: Security
PROC OPTIONS GROUP= PDF
                    SECURITY
Syntax
PDFPASSWORD=(OPEN=password | OPEN="" <OWNER=password | OWNER="">)
PDFPASSWORD=(OWNER=password | OWNER="" <OPEN=password | OPEN="">)
PDFPASSWORD=(OPEN=password | OPEN="")
PDFPASSWORD=(OWNER=password | OWNER="")
Syntax Description
OPEN="password"
   specifies the password to open a PDF document. The quotation marks are optional.
   password specifies a set of characters, up to 32 characters, that are used to validate that a user has permission to open a PDF document.
   Restriction: The OPEN password must be different from the OWNER password.
OPEN=""
   specifies to reset the password to open a PDF document to null. When the password is set to null, no password is necessary to open a PDF document. This is the default.
   Restriction: Null values are not valid.
OWNER="password"
   specifies the password for the PDF document owner. The quotation marks are optional.
   password specifies a set of characters, up to 32 characters, that are used to validate the owner of a PDF document.
   Restriction: The OWNER password must be different from the OPEN password.
   Restriction: Null values are not valid.
OWNER=""
   specifies to reset the password used by a PDF document owner to null. When the password is set to null, the owner does not need a password for the PDF document. This is the default.
   Restriction: Null values are not valid.
Details You can set the PDFPASSWORD option at any time, but it is ignored until the PDFSECURITY system option is set to either LOW or HIGH. When the PDFSECURITY option is set to NONE, passwords for a PDF document are not needed.
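The ordering described above can be sketched as follows. Both password values are placeholders chosen for illustration; they differ from each other because the OPEN and OWNER passwords must not be the same.

```sas
/* Placeholder passwords. This setting is ignored while
   PDFSECURITY=NONE is in effect. */
options pdfpassword=(open="readerpw" owner="ownerpw");

/* The passwords take effect once encryption is turned on. */
options pdfsecurity=high;
```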
See Also System option: “ PDFPAGEVIEW= System Option” on page 1914 “PDFSECURITY= System Option” on page 1917 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFPRINT= System Option
Specifies the resolution to print PDF documents.
Requirement: Adobe Acrobat Reader or Professional 3.0 and later versions, depending on PDFPRINT setting
Valid in: configuration files, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: PDF
PROC OPTIONS GROUP= PDF
Syntax
PDFPRINT= HRES | LRES | NONE
Syntax Description
HRES
   specifies to print PDF documents at the highest resolution available on the printer. This is the default for Acrobat Reader or Professional 5.0 and later versions.
   Requirement: Acrobat Reader or Professional 5.0 and later versions.
LRES
   specifies to print PDF documents at a lower resolution for draft-quality documents.
   Requirement: Acrobat Reader or Professional 3.0 and later versions.
   Restriction: PDFPRINT=LRES can be set only when the PDFSECURITY option is set to HIGH.
NONE
   specifies that PDF documents have no print resolution.
   Requirement: Any version of Acrobat Reader or Professional.
   Restriction: PDFPRINT=NONE can be set only when the PDFSECURITY option is set to HIGH or LOW.
Details
When the PDFSECURITY option is set to NONE, PDF documents can be printed. The following table shows the option settings for allowing high and low resolution printing:

Value of PDFPRINT    Value of PDFSECURITY    Printing Resolution Allowed
LRES                 LOW                     High resolution printing
LRES                 HIGH                    Low resolution (150 dpi) printing
See Also System option: “ PDFPAGEVIEW= System Option” on page 1914 Securing ODS Generated PDF Files in SAS Language Reference: Concepts
PDFSECURITY= System Option
Specifies the printing permissions for PDF documents.
Requirements: Adobe Acrobat Reader or Professional 3.0 and later versions, unless otherwise noted.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Restriction: The PDFSECURITY option is valid for UNIX, Windows, and z/OS operating systems, but only in countries where importing encryption software is legal.
Category: Log and procedure output control: PDF
          System administration: Security
PROC OPTIONS GROUP= PDF
                    SECURITY
Syntax
PDFSECURITY= HIGH | LOW | NONE
Syntax Description
HIGH
   specifies that SAS encrypts PDF documents using a 128-bit encryption algorithm.
   Requirements: When PDFSECURITY=HIGH, you must use Acrobat 5.0 or later version.
   Interaction: At least one password must be set using the PDFPASSWORD= system option when PDFSECURITY=HIGH or LOW.
LOW
   specifies that SAS encrypts PDF documents using a 40-bit encryption algorithm.
   Interaction: At least one password must be set using the PDFPASSWORD= system option when PDFSECURITY=HIGH or LOW.
NONE
   specifies that no encryption is performed on PDF documents. This is the default.
Details
The following table shows the PDF options that SAS sets when the PDFSECURITY option is set to HIGH, LOW, or NONE. When the PDFSECURITY option is set to NONE, there are no restrictions on PDF documents, and the PDF options are not functional.

Table 7.5 How SAS Sets PDF Options Values for the PDFSECURITY Settings

               PDFSECURITY Settings
Option         HIGH             LOW               NONE
PDFACCESS      PDFACCESS        Not functional    Not functional
PDFASSEMBLY    PDFASSEMBLY      Not functional    Not functional
PDFCOMMENT     PDFCOMMENT       PDFCOMMENT        Not functional
PDFCONTENT     PDFCONTENT       PDFCONTENT        Not functional
PDFCOPY        PDFCOPY          PDFCOPY           Not functional
PDFFILLIN      PDFFILLIN        Not functional    Not functional
PDFPRINT       PDFPRINT=HRES    PDFPRINT=HRES     Not functional
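The following is a sketch of one way to produce an encrypted PDF with some permissions removed, based on the behavior described in the table above. The password, the output file name secure.pdf, and the choice of permissions are illustrative assumptions. The permission options are set in a separate OPTIONS statement after PDFSECURITY= because SAS sets them to the values in Table 7.5 when the security level is set.

```sas
/* Placeholder owner password; at least one password is required
   when PDFSECURITY is HIGH or LOW. */
options pdfpassword=(owner="ownerpw");

/* Turn on 128-bit encryption; SAS sets the PDF options as in Table 7.5. */
options pdfsecurity=high;

/* Then restrict copying and limit printing to low resolution. */
options nopdfcopy pdfprint=lres;

/* Placeholder output file name; SASHELP.CLASS is sample data. */
ods pdf file="secure.pdf";
proc print data=sashelp.class;
run;
ods pdf close;

/* Return to unencrypted output for later PDF documents. */
options pdfsecurity=none;
```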
See Also System option: “PDFACCESS System Option” on page 1906 “PDFASSEMBLY System Option” on page 1907 “PDFCOMMENT System Option” on page 1908 “PDFCONTENT System Option” on page 1909 “PDFCOPY System Option” on page 1910 “PDFFILLIN System Option” on page 1912 “PDFPASSWORD= System Option” on page 1915 “PDFPRINT= System Option” on page 1916
“Securing ODS Generated PDF Files” in SAS Output Delivery System: User’s Guide
PRIMARYPROVIDERDOMAIN= System Option
Specifies the domain name of the primary authentication provider.
Valid in: configuration file, SAS invocation
Alias: PRIMPD=
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax
PRIMARYPROVIDERDOMAIN=domain-name
Syntax Description
domain-name
   specifies the name of the domain that authenticates user names.
   Requirement: If the domain name contains one or more spaces, the domain name must be enclosed in quotation marks.
Details
By default, users who log on to the SAS Metadata Server are authenticated by the operating system that hosts the SAS Metadata Server. You can specify an alternate authentication provider by using the AUTHPROVIDERDOMAIN= system option. User IDs that are verified by an alternate authentication provider must be in the format user-ID@domain-name (for example, [email protected]). By specifying an authentication provider and a domain name with the AUTHPROVIDERDOMAIN= and PRIMARYPROVIDERDOMAIN= system options, respectively, you enable users to log on to the SAS Metadata Server by using their usual user ID without using a domain-name suffix on the user ID. For example, by specifying the following system options, users who log on as user-ID or [email protected] can be verified by the authentication provider that is specified by the AUTHPROVIDERDOMAIN= system option:
   -authproviderdomain ldap:mycompany
   -primaryproviderdomain mycompany.com
If you specify the PRIMARYPROVIDERDOMAIN system option without specifying the AUTHPROVIDERDOMAIN system option, authentication is performed by the host provider.
Comparison You use the AUTHPROVIDERDOMAIN system option to register and name your Active Directory provider or other LDAP provider. You use the PRIMARYPROVIDERDOMAIN system option to designate the primary authentication provider.
Examples
The following examples show the system options that you might use in a configuration file to define a primary authentication provider domain-name:

Active Directory
   /* Environment variables that describe your Active Directory server */
   -set AD_HOST myhost

   /* Define authentication provider */
   -authpd ADIR:mycompany.com
   -primpd mycompany.com

LDAP
   /* Environment variables that describe your LDAP server */
   -set LDAP_HOST myhost
   -set LDAP_BASE "ou=emp, o=us"

   /* Define authentication provider */
   -authpd LDAP:mycompany.com
   -primpd mycompany.com
See Also System option: “AUTHPROVIDERDOMAIN System Option” on page 1791 AUTHSERVER System Option in the SAS Companion for Windows Direct LDAP Authentication in the SAS Intelligence Platform: Security Administration Guide
PRINTERPATH= System Option
Specifies the name of a registered printer to use for Universal Printing.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Restriction: The PRINTERPATH= system option is ignored when the DEVICE= system option is set to the ActiveX or Java devices.
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax
PRINTERPATH=('printer-name' <fileref>)
Syntax Description
'printer-name'
   must be one of the printers defined in the Registry Editor under Core > Printing > Printers.
   Requirement: When the printer name contains blanks, you must enclose it in quotation marks.
fileref
   is an optional fileref. If a fileref is specified, it must be defined with a FILENAME statement or an external allocation. If a fileref is not specified, the default output destination can specify a printer in the Printer Setup dialog box, which you open by selecting File > Printer Setup. Parentheses are required only when a fileref is specified.
Details If the PRINTERPATH= option is not a null string, then Universal Printing will be used. If the PRINTERPATH= option does not specify a valid Universal Printing printer, then the default Universal Printer is used.
Comparisons A related system option SYSPRINT specifies which operating system printer will be used for printing. PRINTERPATH= specifies which Universal Printing printer will be used for printing. The operating system printer specified by the SYSPRINT option is used when PRINTERPATH="" (two double quotation marks with no space between them sets a null string).
Examples
The following example specifies an output destination that is different from the default:
   options PRINTERPATH=(corelab out);
   filename out 'your_file';
Operating Environment Information: In some operating environments, setting the PRINTERPATH= option might not change the setting of the PMENU print button, which might continue to use operating environment printing. See the SAS documentation for your operating environment for more information.
For additional information on declaring an ODS printer destination, see ODS statements in SAS Output Delivery System: User’s Guide. For additional information on the SAS universal print facility, see “Printing with SAS” in SAS Language Reference: Concepts.
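As the Comparisons section notes, a null string returns printing to the operating system printer named by SYSPRINT. A minimal sketch of switching between the two follows; corelab is the same illustrative printer name used in the example above and is assumed to be a registered Universal Printing printer.

```sas
/* Use a Universal Printing printer (illustrative name). */
options printerpath=corelab;

/* Later, set a null string so that the operating system printer
   specified by SYSPRINT is used instead. */
options printerpath="";
```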
PRINTINIT System Option
Specifies whether to initialize the SAS procedure output file.
Valid in: configuration file, SAS invocation
Category: Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LISTCONTROL
See: PRINTINIT System Option in the documentation for your operating environment.
Syntax
PRINTINIT | NOPRINTINIT
Syntax Description
PRINTINIT
   specifies to initialize the SAS procedure output file and reset the file attributes.
   Tip: Specifying PRINTINIT causes the SAS procedure output file to be cleared even when output is not generated.
NOPRINTINIT
   specifies to preserve the existing procedure output file if no new output is generated. This is the default.
   Tip: Specifying NOPRINTINIT causes the SAS procedure output file to be overwritten only when new output is generated.
Details
Operating Environment Information: The behavior of the PRINTINIT system option depends on your operating environment. For additional information, see the SAS documentation for your operating environment.
PRINTMSGLIST System Option
Specifies whether to print all messages to the SAS log or to print only top-level messages to the SAS log.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
Syntax
PRINTMSGLIST | NOPRINTMSGLIST
Syntax Description
PRINTMSGLIST
   specifies to print the entire list of messages to the SAS log. PRINTMSGLIST is the default.
NOPRINTMSGLIST
   specifies to print only the top-level message to the SAS log.
Details For Version 7 and later versions, the return code subsystem allows for lists of return codes. All of the messages in a list are related, in general, to a single error condition, but give different levels of information. This option enables you to see the entire list of messages or just the top-level message.
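A short sketch of toggling the option around a step whose log output you want to keep brief. The placement of the statements is illustrative only.

```sas
/* Show only the top-level message for each error condition. */
options noprintmsglist;

/* ... steps whose log output should stay short ... */

/* Restore the default so that the full message list is printed again. */
options printmsglist;
```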
See Also “The SAS Log” in SAS Language Reference: Concepts
QUOTELENMAX System Option
If a quoted string exceeds the maximum length allowed, specifies whether SAS writes a warning message to the SAS log.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax
QUOTELENMAX | NOQUOTELENMAX
Syntax Description
QUOTELENMAX
   specifies that SAS write a warning message to the SAS log about the maximum length for strings in quotation marks.
NOQUOTELENMAX
   specifies that SAS does not write a warning message to the SAS log about the maximum length for strings in quotation marks.
Details
If a string in quotation marks is too long, SAS writes the following warning to the SAS log:
   WARNING 32-169: The quoted string currently being processed has become more than 262 characters long. You may have unbalanced quotation marks.
If you are running a program that has long strings in quotation marks, and you do not want to see this warning, use the NOQUOTELENMAX system option to turn off the warning.
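One possible pattern, shown here as a sketch, is to suppress the warning only around the code that is known to contain long quoted strings and then restore it, so that genuinely unbalanced quotation marks elsewhere are still reported.

```sas
/* Suppress the long-quoted-string warning for the code that follows. */
options noquotelenmax;

/* ... statements that intentionally use quoted strings longer than
   262 characters ... */

/* Restore the warning so that unbalanced quotation marks are flagged. */
options quotelenmax;
```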
REPLACE System Option
Specifies whether permanently stored SAS data sets can be replaced.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax
REPLACE | NOREPLACE
Syntax Description
REPLACE
   specifies that a permanently stored SAS data set can be replaced with another SAS data set of the same name.
NOREPLACE
   specifies that a permanently stored SAS data set cannot be replaced with another SAS data set of the same name, which prevents the accidental replacement of existing SAS data sets.
Details This option has no effect on data sets in the WORK library, even if you use the WORKTERM= system option to store the WORK library files permanently.
Comparisons The REPLACE= data set option overrides the REPLACE system option.
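A sketch of protecting permanent data sets by default while allowing one deliberate replacement through the data set option. The libref mylib, its path, and the data set name sales are hypothetical; SASHELP.CLASS is used only as sample input.

```sas
/* Hypothetical permanent library; substitute a real path. */
libname mylib "path-to-your-library";

/* Prevent accidental replacement of permanent data sets. */
options noreplace;

/* The REPLACE= data set option overrides the system option,
   so this one data set can still be replaced. */
data mylib.sales(replace=yes);
   set sashelp.class;
run;
```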
See Also System Option: “WORKTERM System Option” on page 1997 Data Set Option: “REPLACE= Data Set Option” on page 54
REUSE= System Option
Specifies whether SAS reuses space when observations are added to a compressed SAS data set.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax
REUSE=YES | NO
Syntax Description
YES
   specifies to track free space and reuse it whenever observations are added to an existing compressed data set.
NO
   specifies not to track free space. This is the default.
Details
If space is reused, observations that are added to the SAS data set are inserted wherever enough free space exists, instead of at the end of the SAS data set. Specifying REUSE=NO results in less efficient usage of space if you delete or update many observations in a SAS data set. However, the APPEND procedure, the FSEDIT procedure, and other procedures that add observations to the SAS data set continue to add observations to the end of the data set, as they do for uncompressed SAS data sets. You cannot change the REUSE= attribute of a compressed SAS data set after it is created. Space is tracked and reused in the compressed SAS data set according to the REUSE= value that was specified when the SAS data set was created, not when you add and delete observations. Even with REUSE=YES, the APPEND procedure will add observations at the end.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
Comparisons The REUSE= data set option overrides the REUSE= system option. PERFORMANCE NOTE: When using COMPRESS=YES and REUSE=YES system options settings, observations cannot be addressed by observation number. Note that REUSE=YES takes precedence over the POINTOBS=YES data set option setting.
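Because the REUSE= attribute is fixed when a compressed data set is created, the setting has to be in effect at creation time. The following sketch shows the system options and the overriding data set option; the data set names are hypothetical and SASHELP.CLASS is sample input.

```sas
/* Compress new data sets and track free space for reuse. */
options compress=yes reuse=yes;

/* Created with the system option settings in effect. */
data work.orders;
   set sashelp.class;
run;

/* The REUSE= data set option overrides the system option
   for this data set only. */
data work.orders_noreuse(reuse=no);
   set sashelp.class;
run;

/* Inspect the stored data set attributes. */
proc contents data=work.orders;
run;
```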
See Also System Option: “COMPRESS= System Option” on page 1816 Data Set Options: “COMPRESS= Data Set Option” on page 19 “REUSE= Data Set Option” on page 55
RIGHTMARGIN= System Option
Specifies the print margin for the right side of the page for output directed to an ODS printer destination.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax
RIGHTMARGIN=margin-size<margin-unit>
Syntax Description
margin-size
   specifies the size of the margin.
   Restriction: The right margin should be small enough so that the left margin plus the right margin is less than the width of the paper.
   Interactions: Changing the value of this option might result in changes to the value of the LINESIZE= system option.
<margin-unit>
   specifies the units for margin-size. The margin-unit can be in for inches or cm for centimeters. <margin-unit> is saved as part of the value of the RIGHTMARGIN system option.
   Default: inches
Details All margins have a minimum that is dependent on the printer and the paper size. The default value of the RIGHTMARGIN system option is 0.00 in. Operating Environment Information: Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings and option values for some SAS system options might vary both by operating environment and by site. For details, see the SAS documentation for your operating environment. 4 For additional information on declaring an ODS printer destination, see the ODS statements in SAS Output Delivery System: User’s Guide
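As a brief sketch of the syntax above, the following statements set the right margin in each supported unit and then display the stored value. The margin sizes are arbitrary illustrations, and the effective minimum still depends on the printer and paper size.

```sas
/* Set left and right margins in inches */
options leftmargin=1in rightmargin=1in;

/* Set the right margin in centimeters instead */
options rightmargin=2.5cm;

/* Write the current value, including its unit, to the SAS log */
proc options option=rightmargin;
run;
```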
See Also System Options: “BOTTOMMARGIN= System Option” on page 1795 “LEFTMARGIN= System Option” on page 1877 “TOPMARGIN= System Option” on page 1980
RSASUSER System Option
Specifies whether to open the SASUSER library for read access or read-write access.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
See: RSASUSER System Option in the documentation for your operating environment.
Syntax
RSASUSER | NORSASUSER
Syntax Description
RSASUSER
opens the SASUSER library in read-only mode.
NORSASUSER
opens the SASUSER library in read-write mode.
Details
The RSASUSER system option is useful for sites that use a single SASUSER library for all users and want to prevent users from modifying it. However, it is not useful when users use SAS/ASSIST software, because SAS/ASSIST requires writing to the SASUSER library.
Operating Environment Information: For network considerations about using the RSASUSER system option, see the SAS documentation for your operating environment.
S= System Option
Specifies the length of statements on each line of a source statement and the length of data on lines that follow a DATALINES statement.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Input control: Data Processing
PROC OPTIONS GROUP= INPUTCONTROL
Syntax
S=n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n | nK | nM | nG | nT
specifies the length of statements and data in terms of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); 1,073,741,824 (gigabytes); or 1,099,511,627,776 (terabytes). For example, a value of 8 specifies 8 bytes, and a value of 3m specifies 3,145,728 bytes.
hexX
specifies the length of statements and data as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx sets the length of statements and data to 45.
MIN
sets the length of statements and data to 0.
MAX
sets the length of statements and data to 2,147,483,647.
Details
Input can be from either fixed-length or variable-length records. Both fixed-length and variable-length records can be sequenced or unsequenced. The location of the sequence numbers is determined by whether the file record format is fixed-length or variable-length. SAS uses the value of S to determine whether to look for sequence numbers in the input, and to determine how to read the input:

Record Type: Fixed-length
Value of S: S>0 or S=MAX
SAS Looks for Sequence Numbers: No
How SAS Reads the Input: SAS uses the value of S as the length of the source or data to be scanned and ignores everything beyond that length on each line.

Record Type: Fixed-length
Value of S: S=0 or S=MIN
SAS Looks for Sequence Numbers: Yes, at the end of the line of input.
How SAS Reads the Input: SAS inspects the last n columns (where n is the value of the SEQ= system option) of the first sequence field. If those columns contain numbers, they are assumed to be sequence numbers and SAS ignores the last eight columns of each line. If the n columns contain non-digit characters, SAS reads the last eight columns as data columns.

Record Type: Variable-length
Value of S: S>0 or S=MAX
SAS Looks for Sequence Numbers: No
How SAS Reads the Input: SAS uses the value of S as the starting column of the source or data to be scanned and ignores everything before that column on each line.

Record Type: Variable-length
Value of S: S=0 or S=MIN
SAS Looks for Sequence Numbers: Yes, at the beginning of each line of input.
How SAS Reads the Input: SAS inspects the last n columns (where n is the value of the SEQ= system option) of the first sequence field. If those columns contain numbers, they are assumed to be sequence numbers and SAS ignores the first eight columns of each line. If the n columns contain non-digit characters, SAS reads the first eight columns as data columns.

Comparisons
The S= system option operates exactly like the S2= system option except that S2= controls input only from a %INCLUDE statement, an autoexec file, or an autocall macro file.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
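The value forms described for S= can be sketched with a few OPTIONS statements. The example assumes that source and data lines arrive as fixed-length records; the value 72 is only an illustration and is not prescribed anywhere in this entry.

```sas
/* Scan only the first 72 bytes of each fixed-length source or data line */
options s=72;

/* The same length written as a hexadecimal value (48 hex = 72 decimal) */
options s=48x;

/* Return to sequence-number detection (equivalent to S=MIN) */
options s=0;

/* Write the current setting to the SAS log */
proc options option=s;
run;
```

With variable-length records the same nonzero value would instead be interpreted as a starting column, so the record format should be confirmed before choosing a value.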
See Also System Options: “S2= System Option” on page 1931 “S2V= System Option” on page 1933 “SEQ= System Option” on page 1937
S2= System Option
Specifies the length of statements on each line of a source statement from a %INCLUDE statement, an autoexec file, or an autocall macro file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Input control: Data Processing
PROC OPTIONS GROUP= INPUTCONTROL
Syntax
S2=S | n | nK | nM | nG | nT | MIN | MAX | hexX
Syntax Description
S
uses the current value of the S= system option to compute the record length of text that comes from a %INCLUDE statement, an autoexec file, or an autocall macro file.
n | nK | nM | nG | nT
specifies the length of the statements in a file that is specified in a %INCLUDE statement, an autoexec file, or an autocall macro file, in terms of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); 1,073,741,824 (gigabytes); or 1,099,511,627,776 (terabytes). For example, a value of 8 specifies 8 bytes, and a value of 3m specifies 3,145,728 bytes.
hexX
specifies the length of statements as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 2dx sets the length of statements to 45.
MIN
sets the length of statements and data to 0.
MAX
sets the length of statements and data to 2,147,483,647.
Details
Input can be from either fixed-length or variable-length records. Both fixed-length and variable-length records can be sequenced or unsequenced. The location of the sequence numbers is determined by whether the file record format is fixed-length or variable-length. SAS uses the value of S2 to determine whether to look for sequence numbers in the input, and to determine how to read the input:

Record Type: Fixed-length
Value of S2: S2>0 or S2=MAX
SAS Looks for Sequence Numbers: No
How SAS Reads the Input: SAS uses the value of S2 as the length of the source or data to be scanned and ignores everything beyond that length on each line.

Record Type: Fixed-length
Value of S2: S2=0 or S2=MIN
SAS Looks for Sequence Numbers: Yes, at the end of the line of input.
How SAS Reads the Input: SAS inspects the last n columns (where n is the value of the SEQ= system option) of the first sequence field. If those columns contain numbers, they are assumed to be sequence numbers and SAS ignores the last eight columns of each line. If the n columns contain non-digit characters, SAS reads the last eight columns as data columns.

Record Type: Variable-length
Value of S2: S2>0 or S2=MAX
SAS Looks for Sequence Numbers: No
How SAS Reads the Input: SAS uses the value of S2 as the starting column of the source or data to be scanned and ignores everything before that column on each line.

Record Type: Variable-length
Value of S2: S2=0 or S2=MIN
SAS Looks for Sequence Numbers: Yes, at the beginning of each line of input.
How SAS Reads the Input: SAS inspects the last n columns (where n is the value of the SEQ= system option) of the first sequence field. If those columns contain numbers, they are assumed to be sequence numbers and SAS ignores the first eight columns of each line. If the n columns contain non-digit characters, SAS reads the first eight columns as data columns.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
Comparisons
The S2= system option operates exactly like the S= system option except that the S2= option controls input only from a %INCLUDE statement, an autoexec file, or an autocall macro file.
The S2= system option reads both fixed-length and variable-length record formats from a file specified in a %INCLUDE statement, an autoexec file, or an autocall macro file. The S2V= system option reads only a variable-length record format from a file specified in a %INCLUDE statement, an autoexec file, or an autocall macro file.
See Also System Options: “S= System Option” on page 1928 “S2V= System Option” on page 1933 “SEQ= System Option” on page 1937
S2V= System Option
Specifies the starting position to begin reading a file that is specified in a %INCLUDE statement, an autoexec file, or an autocall macro file with a variable-length record format.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Input control: Data Processing
PROC OPTIONS GROUP= INPUTCONTROL
Syntax
S2V=S2 | S | n | nK | nM | nG | nT | MIN | MAX | hexX
Syntax Description
S2
uses the current value of the S2= system option to compute the starting position of the variable-length record to read from a %INCLUDE statement, an autoexec file, or an autocall macro file. This is the default.
S
uses the current value of the S= system option to compute the starting position of the variable-length record to read from a %INCLUDE statement, an autoexec file, or an autocall macro file.
n | nK | nM | nG | nT
specifies the starting position of the variable-length record to read that comes from a %INCLUDE statement, an autoexec file, or an autocall macro file, in terms of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); 1,073,741,824 (gigabytes); or 1,099,511,627,776 (terabytes). For example, a value of 8 specifies 8 bytes, and a value of 3m specifies 3,145,728 bytes.
MIN
sets the starting position of the variable-length record to read that comes from a %INCLUDE statement, an autoexec file, or an autocall macro file, to 0.
MAX
sets the starting position of the variable-length record to read that comes from a %INCLUDE statement, an autoexec file, or an autocall macro file, to 2,147,483,647.
hexX
specifies the starting position of the variable-length record to read that comes from a %INCLUDE statement, an autoexec file, or an autocall macro file, as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X.
Details
Both the S2V= system option and the S2= system option specify the starting position for reading variable-length record input from a %INCLUDE statement, an autoexec file, or an autocall macro file. When values for both options are specified, the value of the S2V= system option takes precedence over the value specified for the S2= system option.
Operating Environment Information: The syntax shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
Comparisons
The S2= system option specifies the starting position for reading both fixed-length and variable-length record formats for input from a %INCLUDE statement, an autoexec file, or an autocall macro file. The S2V= system option specifies the starting position for reading only variable-length record formats for input from a %INCLUDE statement, an autoexec file, or an autocall macro file.
See Also System Options: “S= System Option” on page 1928 “S2= System Option” on page 1931 “SEQ= System Option” on page 1937
SASHELP= System Option
Specifies the location of the SASHELP library.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
See: SASHELP= System Option in the documentation for your operating environment.
Syntax
SASHELP=library-specification
Syntax Description
library-specification
identifies an external library.
Details
The SASHELP= system option is set during the installation process and normally is not changed after installation.
Operating Environment Information: A valid external library specification is specific to your operating environment. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
Operating Environment Information: Under the Windows, UNIX, and z/OS operating environments, you can use the APPEND= or INSERT= system options to add library specifications. For more information, see the documentation for the APPEND= and INSERT= system options.
See Also System Options: “APPEND= System Option” on page 1789 “INSERT= System Option” on page 1871
SASUSER= System Option
Specifies the SAS library to use as the SASUSER library.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
See: SASUSER= System Option in the documentation for your operating environment.
Syntax
SASUSER=library-specification
Syntax Description
library-specification
specifies the libref or the physical name that contains a user’s Profile catalog.
Details
The library and catalog are created automatically by SAS; you do not have to create them explicitly.
Operating Environment Information: A valid library specification and its syntax are specific to your operating environment. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
SEQ= System Option
Specifies the length of the numeric portion of the sequence field in input source lines or data lines.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Input control: Data Processing
PROC OPTIONS GROUP= INPUTCONTROL
Syntax
SEQ=n | MIN | MAX | hexX
Syntax Description
n
specifies the length in terms of bytes.
MIN
sets the minimum length to 1.
MAX
sets the maximum length to 8.
Tip: When SEQ=8, all eight characters in the sequence field are assumed to be numeric.
hexX
specifies the length as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X.
Details
Unless the S= or S2= system option specifies otherwise, SAS assumes an eight-character sequence field; however, some editors place some alphabetic information (for example, the filename) in the first several characters. The SEQ= value specifies the number of digits that are right-justified in the eight-character field. For example, if you specify SEQ=5 for the sequence field AAA00010, SAS looks at only the last five characters of the eight-character sequence field and, if the characters are numeric, treats the entire eight-character field as a sequence field.
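The AAA00010 case above can be set up with a single OPTIONS statement. This is a minimal sketch that assumes sequence-number detection is wanted for the session (S=0); whether the sequence field sits at the start or the end of each line still depends on the record format.

```sas
/* S=0 asks SAS to look for sequence numbers; SEQ=5 means that only  */
/* the last five characters of the eight-character field are checked */
options s=0 seq=5;

/* Write the current SEQ= setting to the SAS log */
proc options option=seq;
run;
```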
See Also System Options: “S= System Option” on page 1928 “S2= System Option” on page 1931
SETINIT System Option
Specifies whether site license information can be altered.
Valid in: configuration file, SAS invocation
Category: System administration: Installation
PROC OPTIONS GROUP= INSTALL
Syntax
SETINIT | NOSETINIT
Syntax Description
SETINIT
in a non-windowing environment, specifies that you can change license information by running the SETINIT procedure.
NOSETINIT
specifies that you cannot alter site license information after installation.
Details
SETINIT is set in the installation process and is not normally changed after installation. The SETINIT option is valid only in a non-windowing SAS session.
SKIP= System Option
Specifies the number of lines to skip at the top of each page of SAS output.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LISTCONTROL
Syntax
SKIP=n | hexX | MIN | MAX
Syntax Description
n
specifies the number of lines to skip, from 0 to 20.
MIN
sets the number of lines to skip to 0, so no lines are skipped.
MAX
sets the number of lines to skip to 20.
hexX
specifies the number of lines to skip as a hexadecimal number. You must specify the value beginning with a number (0–9), followed by an X. For example, the value 0ax specifies to skip 10 lines.
Details
The location of the first line is relative to the position established by carriage control or by the forms control buffer on the printer. Most sites define this position so that the first line of a new page begins three or four lines down the form. If this spacing is sufficient, specify SKIP=0 so that additional lines are not skipped. The SKIP= value does not affect the maximum number of lines printed on each page, which is controlled by the PAGESIZE= system option.
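A short sketch of the accepted value forms follows. The specific numbers are illustrative only; PAGESIZE=60 is an assumed page length, not a value taken from this entry.

```sas
/* Rely on the printer's own top-of-form position: skip no extra lines */
options skip=0;

/* Skip three lines at the top of each page; PAGESIZE= still controls */
/* the maximum number of lines printed on the page                    */
options skip=3 pagesize=60;

/* Hexadecimal form: 0a hex = 10 lines */
options skip=0ax;
```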
SOLUTIONS System Option
Specifies whether the SOLUTIONS menu is included in SAS windows.
Valid in: configuration file, SAS invocation
Category: Environment control: Display
PROC OPTIONS GROUP= ENVDISPLAY
Syntax
SOLUTIONS | NOSOLUTIONS
Syntax Description
SOLUTIONS
specifies that the SOLUTIONS menu is included in SAS windows.
NOSOLUTIONS
specifies that the SOLUTIONS menu is not included in SAS windows.
SORTDUP= System Option
Specifies whether the SORT procedure removes duplicate observations based on all variables in a data set or on the variables that remain after the DROP= or KEEP= data set options have been applied.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Sort: Procedure options
PROC OPTIONS GROUP= SORT
Syntax
SORTDUP=PHYSICAL | LOGICAL
Syntax Description
PHYSICAL
removes duplicates based on all the variables that are present in the data set. This is the default.
LOGICAL
removes duplicates based on only the variables remaining after the DROP= and KEEP= data set options are processed.
Details
The SORTDUP= option specifies which variables are compared to identify duplicate observations when the SORT procedure NODUPRECS option is specified.
When SORTDUP= is set to LOGICAL and NODUPRECS is specified in the SORT procedure, duplicate observations are removed based on the variables that remain after a DROP or KEEP operation on the input data set. Setting SORTDUP=LOGICAL can increase the number of duplicate observations that are removed because it eliminates variables before observations are compared. Setting SORTDUP=LOGICAL might also improve performance.
When SORTDUP= is set to PHYSICAL and NODUPRECS is specified in the SORT procedure, duplicate observations are removed based on all of the variables in the input data set.
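The difference between the two settings can be sketched with a small data set. The data set and variable names (work.visits, patient_id, visit_no) are hypothetical and chosen only for illustration.

```sas
/* Hypothetical data: the first two observations differ only in visit_no */
data work.visits;
   input patient_id visit_no;
   datalines;
1 1
1 2
2 1
;
run;

/* Compare observations on the variables that remain after DROP= */
options sortdup=logical;

proc sort data=work.visits(drop=visit_no) out=work.patients noduprecs;
   by patient_id;
run;
```

Under SORTDUP=LOGICAL the two patient_id=1 observations become identical once visit_no is dropped, so they are candidates for removal. Under SORTDUP=PHYSICAL the comparison would also take visit_no into account, so they would not be treated as duplicates.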
See Also The SORT Procedure in Base SAS Procedures Guide
SORTEQUALS System Option
Specifies whether observations in the output data set with identical BY variable values have a particular order.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Sort: Procedure options
PROC OPTIONS GROUP= SORT
Syntax
SORTEQUALS | NOSORTEQUALS
Syntax Description
SORTEQUALS
specifies that observations with identical BY variable values are to retain the same relative positions in the output data set as in the input data set.
NOSORTEQUALS
specifies that no resources be used to control the order of observations with identical BY variable values in the output data set.
Interaction: To achieve the best sorting performance when using the THREADS system option, specify THREADS and NOSORTEQUALS.
Tip: To save resources, use NOSORTEQUALS when you do not need to maintain a specific order of observations with identical BY variable values.
Comparisons
The SORTEQUALS and NOSORTEQUALS system options set the sorting behavior of PROC SORT for your SAS session. The EQUALS or NOEQUALS option in the PROC SORT statement overrides the setting of the system option for an individual PROC step and specifies the sorting behavior for that PROC step only.
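The session-level setting and the per-step override can be sketched as follows. The data set names work.trades and work.trades_sorted and the BY variable symbol are hypothetical.

```sas
/* Session default: do not spend resources preserving the input order */
/* of observations that have identical BY values                      */
options nosortequals;

/* For this one step, EQUALS overrides the system option so that */
/* tied observations keep their original relative order          */
proc sort data=work.trades out=work.trades_sorted equals;
   by symbol;
run;
```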
See Also Statement Options: EQUALS option for the PROC SORT statement in Base SAS Procedures Guide. System Options: “THREADS System Option” on page 1978
SORTSIZE= System Option
Specifies the amount of memory that is available to the SORT procedure.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Sort: Procedure options; System administration: Memory
PROC OPTIONS GROUP= MEMORY, SORT
See: SORTSIZE= System Option in the documentation for your operating environment.
Syntax
SORTSIZE=n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n | nK | nM | nG | nT
specifies the amount of memory in terms of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); 1,073,741,824 (gigabytes); or 1,099,511,627,776 (terabytes). For example, a value of 4000 specifies 4,000 bytes and a value of 2m specifies 2,097,152 bytes. If n=0, the sort utility uses its default. Valid values for SORTSIZE range from 0 to 9,223,372,036,854,775,807.
hexX
specifies the amount of memory as a hexadecimal number. This number must begin with a number (0–9), followed by an X. For example, 0fffx specifies 4095 bytes of memory.
MIN
specifies the minimum amount of memory available.
MAX
specifies the maximum amount of memory available.
Operating Environment Information: Values for MIN and MAX will vary, depending on your operating environment. For details, see the SAS documentation for your operating environment.
Details Generally, the value of the SORTSIZE= system option should be less than the physical memory available to your process. If the SORT procedure needs more memory than you specify, the system creates a temporary utility file. PERFORMANCE NOTE: Proper specification of SORTSIZE= can improve sort performance by restricting the swapping of memory that is controlled by the operating environment.
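As a sketch of the guidance above, the following statements cap the memory that PROC SORT may use and then display the setting. The 256M figure is an arbitrary illustration and assumes a process with more physical memory than that available; a suitable value depends on your site.

```sas
/* Allow the SORT procedure up to 256 megabytes of memory */
options sortsize=256M;

/* Write the current setting to the SAS log */
proc options option=sortsize;
run;
```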
See Also System Option: “SUMSIZE= System Option” on page 1958 “The SORT procedure” in the SAS documentation for your operating environment
SORTVALIDATE System Option
Specifies whether the SORT procedure verifies if a data set is sorted according to the variables in the BY statement when a user-specified sort order is denoted in the sort indicator.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Sort: Procedure options
PROC OPTIONS GROUP= SORT
Syntax
SORTVALIDATE | NOSORTVALIDATE
Syntax Description
SORTVALIDATE
specifies that the SORT procedure verifies if the observations in the data set are sorted by the variables specified in the BY statement.
NOSORTVALIDATE
specifies that the SORT procedure is not to verify if the observations in the data set are sorted. This is the default.
Details
You can use the SORTVALIDATE system option to specify whether the SORT procedure validates that a data set is sorted correctly when the data set sort indicator shows a user-specified sort order. The user can specify a sort order by using the SORTEDBY= data set option in a DATA statement or by using the SORTEDBY= option in the DATASETS procedure MODIFY statement. When the sort indicator is set by a user, SAS cannot be absolutely certain that a data set is sorted according to the variables in the BY statement.
If the SORTVALIDATE system option is set and the data set sort indicator was set by a user, the SORT procedure performs a sequence check on each observation to ensure that the data set is sorted according to the variables in the BY statement. If the data set is not sorted correctly, SAS sorts the data set. At the end of a successful sequence check or at the end of a sort, the SORT procedure sets the Validated sort information to Yes. If a sort is performed, the SORT procedure updates the Sortedby sort information to the variables that are specified in the BY statement. If an output data set is specified, the Validated sort information in the output data set is set to Yes. If no sort is necessary, the data set is copied to the output data set.
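The sequence described above can be sketched as follows. The data set names work.claims_raw and work.claims and the variable claim_id are hypothetical; the sketch assumes that work.claims_raw already exists.

```sas
/* Ask PROC SORT to verify user-asserted sort orders */
options sortvalidate;

/* The user asserts a sort order; SAS has not verified it */
data work.claims (sortedby=claim_id);
   set work.claims_raw;   /* hypothetical input data set */
run;

/* PROC SORT performs a sequence check and sorts only if the check fails */
proc sort data=work.claims;
   by claim_id;
run;

/* The sort information in the output includes the Validated setting */
proc contents data=work.claims;
run;
```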
See Also Data Set Option: “SORTEDBY= Data Set Option” on page 56 Procedures: “The DATASETS Procedure” in the Base SAS Procedures Guide “The SORT Procedure” in the Base SAS Procedures Guide “Sorted Data Sets” in SAS Language Reference: Concepts
SOURCE System Option
Specifies whether SAS writes source statements to the SAS log.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
Syntax
SOURCE | NOSOURCE
Syntax Description
SOURCE
specifies to write SAS source statements to the SAS log.
NOSOURCE
specifies not to write SAS source statements to the SAS log.
Details
The SOURCE system option does not affect whether statements from a file read with %INCLUDE or from an autocall macro are printed in the SAS log.
Note: SOURCE must be in effect when you execute SAS programs that you want to send to SAS for problem determination and resolution.
See Also “The SAS Log” in SAS Language Reference: Concepts
SOURCE2 System Option
Specifies whether SAS writes secondary source statements from included files to the SAS log.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SAS log
PROC OPTIONS GROUP= LOGCONTROL
Syntax
SOURCE2 | NOSOURCE2
Syntax Description
SOURCE2
specifies to write to the SAS log secondary source statements from files that have been included by %INCLUDE statements.
NOSOURCE2
specifies not to write secondary source statements to the SAS log.
Details
Note: SOURCE2 must be in effect when you execute SAS programs that you want to send to SAS for problem determination and resolution.
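When preparing a log for problem determination, both options might be turned on together, as in this minimal sketch. The file name setup.sas is hypothetical.

```sas
/* Echo primary source statements and statements from included files */
options source source2;

/* Statements read from this file are written to the SAS log */
%include 'setup.sas';   /* hypothetical file */

/* Stop echoing included statements for the rest of the session */
options nosource2;
```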
See Also “The SAS Log” in SAS Language Reference: Concepts
SPOOL System Option
Specifies whether SAS statements are written to a utility data set in the WORK data library.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Input control: Data Processing
PROC OPTIONS GROUP= INPUTCONTROL
Syntax
SPOOL | NOSPOOL
Syntax Description
SPOOL
specifies that SAS write statements to a utility data set in the WORK data library for later use by a %INCLUDE or %LIST statement, or by the RECALL command, within a windowing environment.
NOSPOOL
specifies that SAS does not write statements to a utility data set. Specifying NOSPOOL accelerates execution time, but you cannot use the %INCLUDE and %LIST statements to resubmit SAS statements that were executed earlier in the session.
Examples
Specifying SPOOL is especially helpful in interactive line mode because you can resubmit a line or lines of code by referring to the line numbers. Here is an example of code including line numbers:

   00001 data test;
   00002    input w x y z;
   00003    datalines;
   00004 411.365 101.945 323.782 512.398
   00005 ;

If SPOOL is in effect, you can resubmit line number 1 by submitting this statement:

   %inc 1;

You can also resubmit a range of lines by placing a colon (:) or dash (-) between the line numbers. For example, these statements resubmit lines 1 through 3 and 4 through 5 of the above example:

   %inc 1:3;
   %inc 4-5;
SQLCONSTDATETIME System Option
Specifies whether the SQL procedure replaces references to the DATE, TIME, DATETIME, and TODAY functions in a query with their equivalent constant values before the query executes.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files; System administration: SQL
PROC OPTIONS GROUP= SASFILES, SQL
Syntax
SQLCONSTDATETIME | NOSQLCONSTDATETIME
Syntax Description
SQLCONSTDATETIME
specifies that the SQL procedure is to replace references to the DATE, TIME, DATETIME, and TODAY functions with their equivalent numeric constant values.
NOSQLCONSTDATETIME
specifies that the SQL procedure is not to replace references to the DATE, TIME, DATETIME, and TODAY functions with their equivalent numeric constant values.
Details
When the SQLCONSTDATETIME system option is set, the SQL procedure evaluates the DATE, TIME, DATETIME, and TODAY functions in a query once, and uses those values throughout the query. Computing these values once ensures consistency of results when the functions are used multiple times in a query or when the query executes the functions close to a date or time boundary.
When the NOSQLCONSTDATETIME system option is set, the SQL procedure evaluates these functions in a query each time it processes an observation.
If both the SQLREDUCEPUT= system option and the SQLCONSTDATETIME system option are specified, the SQL procedure replaces the DATE, TIME, DATETIME, and TODAY functions with their respective values in order to determine the PUT function value before the query executes:

   select x from &lib..c
      where (put(bday, date9.) = put(today(), date9.));

Note: The value that is specified in the SQLCONSTDATETIME system option is in effect for all SQL procedure statements, unless the CONSTDATETIME option on the PROC SQL statement is set. The value of the CONSTDATETIME option takes precedence over the SQLCONSTDATETIME system option. However, changing the value of the CONSTDATETIME option does not change the value of the SQLCONSTDATETIME system option.
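The precedence rule in the note can be sketched as follows. The query is only an illustration; it reads one row from the SASHELP.CLASS sample table so that TODAY() is evaluated.

```sas
/* Session default: replace date and time functions with constants */
options sqlconstdatetime;

/* For this PROC SQL step only, NOCONSTDATETIME takes precedence */
proc sql noconstdatetime;
   select today() as run_date format=date9.
      from sashelp.class(obs=1);
quit;

/* The system option itself is unchanged by the PROC SQL option */
proc options option=sqlconstdatetime;
run;
```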
See Also System option: “SQLREDUCEPUT= System Option” on page 1947 PROC SQL statement CONSTDATETIME option in Base SAS Procedures Guide Improving Query Performance in SAS SQL Procedure User’s Guide
SQLREDUCEPUT= System Option
For the SQL procedure, specifies the engine type that a query uses for which optimization is performed by replacing a PUT function in a query with a logically equivalent expression.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files; System administration: SQL; System administration: Performance
PROC OPTIONS GROUP= SASFILES, SQL, PERFORMANCE
Syntax
SQLREDUCEPUT= ALL | NONE | DBMS | BASE
Syntax Description ALL
specifies that optimization is performed on all PUT functions regardless of any engine that is used by the query to access the data. NONE
specifies that no optimization is to be performed. DBMS
specifies that optimization is performed on all PUT functions whose query is performed by a SAS/ACCESS engine. This is the default. Requirement: The first argument to the PUT function must be a variable obtained by a table that is accessed using a SAS/ACCESS engine. BASE
specifies that optimization is performed on all PUT functions whose query is performed by a SAS/ACCESS engine or a Base SAS engine.
Details
By using the SQLREDUCEPUT= system option, you can specify that SAS reduce the PUT function as much as possible before the query is processed. If the query also contains a WHERE clause, the evaluation of the WHERE clause is simplified. The following SELECT statements are examples of queries that are reduced if this option is set to any value other than NONE:
   select x, y from &lib..b where (PUT(x, abc.) in ('yes', 'no'));
   select x from &lib..a where (PUT(x, udfmt.) = trim(left('small')));
If both the SQLREDUCEPUT= system option and the SQLCONSTDATETIME system option are specified, the SQL procedure replaces the DATE, TIME, DATETIME, and TODAY functions with their respective values in order to determine the PUT function value before the query executes. The following two SELECT statements show the original query and the optimized query:
   select x from &lib..c where (put(bday, date9.) = put(today(), date9.));
is reduced to
   select x from &lib..c where (put(bday, date9.) = "01Jun2008");
If a query does not contain the PUT function, optimization is not performed.
Note: The value that is specified in the SQLREDUCEPUT= system option is in effect for all SQL procedure statements, unless the REDUCEPUT option in the PROC SQL statement is set. The value of the REDUCEPUT option takes precedence over the SQLREDUCEPUT= system option. However, changing the value of the REDUCEPUT option does not change the value of the SQLREDUCEPUT= system option.
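The precedence rule in the note can be sketched as follows. This is a minimal illustration, not a statement about how any particular query is optimized: the format SIZEFMT. and the table WORK.ITEMS are hypothetical names that were made up for this example.

```sas
/* Hypothetical user-defined format and table, for illustration only */
proc format;
   value sizefmt low-<10 = 'small'
                 10-high = 'large';
run;

data work.items;
   input x @@;
   datalines;
3 12 7 25
;
run;

/* Session-wide setting: consider PUT functions for Base SAS and SAS/ACCESS engines */
options sqlreduceput=base;

proc sql;
   select x from work.items
      where put(x, sizefmt.) = 'small';
quit;

/* The REDUCEPUT= option in the PROC SQL statement takes precedence for this step only */
proc sql reduceput=none;
   select x from work.items
      where put(x, sizefmt.) = 'small';
quit;

/* The system option itself is not changed by the PROC SQL statement option */
proc options option=sqlreduceput;
run;
```

Both queries should return the same rows. The setting influences only how PROC SQL evaluates the PUT function, not the result.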
See Also
System options:
“SQLCONSTDATETIME System Option” on page 1946
“SQLREDUCEPUTOBS= System Option” on page 1949
“PROC SQL Statement REDUCEPUT option” in the Base SAS Procedures Guide
“Improving Query Performance” in the SAS SQL Procedure User’s Guide
SQLREDUCEPUTOBS= System Option
For the SQL procedure when the SQLREDUCEPUT= system option is set to NONE, specifies the minimum number of observations that must be in a table in order for PROC SQL to consider optimizing the PUT function in a query.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files; System administration: SQL; System administration: Performance
PROC OPTIONS GROUP= SASFILES, SQL, PERFORMANCE
Interaction: If the SQLREDUCEPUT= system option is set to NONE, conditions for both the SQLREDUCEPUTOBS= and SQLREDUCEPUTVALUES= system options must be met in order for the SQL procedure to consider optimizing the PUT function.
Syntax
SQLREDUCEPUTOBS= n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n | nK | nM | nG | nT
specifies the number of observations that must be in a table before the SQL procedure considers optimizing the PUT function, where n is an integer that can be specified in multiples of 1; 1,024 (K); 1,048,576 (M); 1,073,741,824 (G); or 1,099,511,627,776 (T). For example, a value of 8 specifies eight observations, and a value of 3k specifies 3,072 observations.
Default: 0, which indicates that there is no minimum number of observations in a table required for the SQL procedure to optimize the PUT function.
Range: 0 to 2^63–1, or approximately 9.2 quintillion
hexX
specifies, as a hexadecimal value, the number of observations that must be in a table before the SQL procedure considers optimizing the PUT function. You must specify the value beginning with a number (0–9) and ending with an X. For example, the value 2dx specifies 45 observations.
MIN
sets the number of observations that must be in a table before the SQL procedure considers optimizing the PUT function to 0. A value of 0 indicates that there is no minimum number of observations required. This is the default.
MAX
sets the number of observations that must be in a table before the SQL procedure considers optimizing the PUT function to 2^63–1, or approximately 9.2 quintillion.
Details For databases that allow implicit pass-through when the row count for a table is not known, the SQL procedure allows the optimization in order for the query to be executed by the database. When the SQLREDUCEPUT= system option is set to NONE, the SQL procedure considers the value of both the SQLREDUCEPUTVALUES= and SQLREDUCEPUTOBS= system options and determines whether to optimize the PUT function. For databases that do not allow implicit pass-through, the SQL procedure does not perform the optimization, and more of the query is performed by SAS.
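The value forms that are listed in the syntax description can be compared side by side. The following sketch assumes only the forms that are documented above. The threshold of 3,072 observations is an arbitrary value chosen for illustration.

```sas
/* Three equivalent ways to request a threshold of 3,072 observations */
options sqlreduceputobs=3072;    /* plain integer                           */
options sqlreduceputobs=3k;      /* 3 * 1,024 = 3,072                       */
options sqlreduceputobs=0c00x;   /* hexadecimal C00 = 3,072; leading digit  */

/* MIN restores the default of 0 (no minimum number of observations) */
options sqlreduceputobs=min;

/* Display the current setting in the SAS log */
proc options option=sqlreduceputobs;
run;
```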
See Also
System options:
“SQLREDUCEPUT= System Option” on page 1947
“Improving Query Performance” in the SAS SQL Procedure User’s Guide
SQLREDUCEPUTVALUES= System Option
For the SQL procedure when the SQLREDUCEPUT= system option is set to NONE, specifies the maximum number of SAS format values that can exist in a PUT function expression in order for PROC SQL to consider optimizing the PUT function in a query.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files; System administration: SQL; System administration: Performance
PROC OPTIONS GROUP= SASFILES, SQL, PERFORMANCE
Interaction: If the SQLREDUCEPUT= system option is set to NONE, conditions for both the SQLREDUCEPUTVALUES= and SQLREDUCEPUTOBS= system options must be met in order for the SQL procedure to consider optimizing the PUT function.
Syntax
SQLREDUCEPUTVALUES= n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n | nK | nM | nG | nT
specifies the number of SAS format values that can exist in a PUT function expression, where n is an integer that can be specified in multiples of 1; 1,024 (K); 1,048,576 (M); 1,073,741,824 (G); or 1,099,511,627,776 (T). For example, a value of 8 specifies eight format values, and a value of 3k specifies 3,072 format values.
Default: 0, which indicates that there is no minimum number of SAS format values that can exist in a PUT function expression.
Range: 0–5,000
Interaction: If the number of format values in a PUT function expression is greater than this value, the SQL procedure does not optimize the PUT function.
hexX
specifies, as a hexadecimal value, the number of SAS format values that can exist in a PUT function expression. You must specify the value beginning with a number (0–9) and ending with an X. For example, the value 2dx specifies 45 format values.
MIN
sets the number of SAS format values that can exist in a PUT function expression to 0. A value of 0 indicates that there is no minimum number of SAS format values that are required. This is the default.
MAX
sets the maximum number of SAS format values that can exist in a PUT function expression to 5,000.
Details Some formats, especially user-defined formats, can contain many format values. Depending on the number of matches for a given PUT function expression, the resulting expression can list many format values. If the number of format values becomes too large, the query performance can degrade. When the SQLREDUCEPUT= system option is set to NONE, the SQL procedure considers the value of both the SQLREDUCEPUTVALUES= and SQLREDUCEPUTOBS= system options and determines whether to optimize the PUT function.
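As a minimal sketch of the interaction that is described above, the following OPTIONS statement sets both thresholds together. The numbers are hypothetical and were chosen only for illustration. Appropriate values depend on your formats and tables.

```sas
/* Hypothetical thresholds; both conditions must be met (see the Interaction above) */
options sqlreduceput=none
        sqlreduceputobs=1k          /* 1 * 1,024 = 1,024 observations        */
        sqlreduceputvalues=50;      /* no more than 50 format values         */

/* List the current values of all system options in the SQL group */
proc options group=sql;
run;
```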
See Also
System options:
“SQLREDUCEPUT= System Option” on page 1947
“SQLREDUCEPUTOBS= System Option” on page 1949
“Improving Query Performance” in the SAS SQL Procedure User’s Guide
SQLREMERGE System Option
Specifies whether the SQL procedure can process queries that use remerging of data.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files; System administration: SQL
PROC OPTIONS GROUP= SASFILES, SQL
Syntax
SQLREMERGE | NOSQLREMERGE
Syntax Description
SQLREMERGE
specifies that the SQL procedure can process queries that use remerging of data.
NOSQLREMERGE
specifies that the SQL procedure cannot process queries that use remerging of data.
Details The remerge feature of the SQL procedure makes two passes through a table, using data in the second pass that was created in the first pass, in order to complete a query. When the NOSQLREMERGE system option is set, the SQL procedure cannot process remerging of data. If remerging is attempted when the NOSQLREMERGE option is set, an error is written to the SAS log.
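The following sketch shows one kind of query that relies on remerging: a summary function is combined with ungrouped columns, so the summary value must be merged back with each row. The example assumes that the SASHELP.CLASS sample table is available in your installation.

```sas
/* With NOSQLREMERGE in effect, this query is expected to write an error to the log */
options nosqlremerge;

proc sql;
   /* MEAN(HEIGHT) is computed in a first pass and remerged with every row */
   select name, height, height - mean(height) as diff
      from sashelp.class;
quit;

/* Allow remerging again; the same query can now be processed */
options sqlremerge;

proc sql;
   select name, height, height - mean(height) as diff
      from sashelp.class;
quit;
```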
See Also
PROC SQL statement REMERGE option in Base SAS Procedures Guide
Remerging Data in the summary-function component of the SQL procedure in Base SAS Procedures Guide
Improving Query Performance
SQLUNDOPOLICY= System Option
Specifies whether the SQL procedure keeps or discards updated data if errors occur while the data is being updated.
Valid in: configuration file, SAS invocation, OPTIONS statement
Category: Files: SAS Files; System administration: SQL
PROC OPTIONS GROUP= SASFILES, SQL
Syntax
SQLUNDOPOLICY=NONE | OPTIONAL | REQUIRED
Syntax Description
NONE
specifies to keep changes that are made by the INSERT and UPDATE statements.
OPTIONAL
specifies to reverse changes that are made by the INSERT and UPDATE statements as long as reversing the changes is reliable.
REQUIRED
specifies to undo all changes that are made by the INSERT and UPDATE statements, up to the point of the error. This is the default.
CAUTION: Some UNDO operations cannot reliably reverse changes. In some situations, reversing the effects of the INSERT and UPDATE statements cannot be done reliably. When operations cannot be reversed, the SQL procedure issues an error message and does not execute the statement. For example, when a program uses a SAS/ACCESS view, or when a SAS data set is accessed through a SAS/SHARE server and is opened with the data set option CNTLLEV=RECORD, changes cannot be reliably reversed.
CAUTION: Some UNDO operations might not reverse changes. In situations where multiple transactions are made to the same record, PROC SQL might not reverse a change; it will issue an error message instead. For example, if an error occurs during an insert, PROC SQL can delete a record that another user updated. In that case, the UNDO statement is not executed, and an error message is issued.
Details The value that is specified in the SQLUNDOPOLICY= system option is in effect for all SQL procedure statements, unless the UNDO_POLICY option in the PROC SQL statement is set. The value of the UNDO_POLICY option takes precedence over the SQLUNDOPOLICY= system option. The RESET statement can also be used to set or reset the UNDO_POLICY option. However, changing the value of the UNDO_POLICY option does not change the value of the SQLUNDOPOLICY= system option. Once the procedure completes, the undo policy reverts to the value of the SQLUNDOPOLICY= system option. If you are updating a data set using the SPD Engine, you can significantly improve processing performance by setting SQLUNDOPOLICY=NONE. However, ensure that NONE is an appropriate setting for your application.
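The precedence that is described above can be sketched in a short program. The table WORK.ACCOUNTS and its values are hypothetical and are used only to give the INSERT and UPDATE statements something to act on.

```sas
/* Session-wide undo policy */
options sqlundopolicy=required;

/* Hypothetical table, for illustration only */
data work.accounts;
   input id balance;
   datalines;
1 100
2 250
;
run;

/* UNDO_POLICY= in the PROC SQL statement takes precedence for this step */
proc sql undo_policy=none;
   update work.accounts set balance = balance * 1.05;

   /* The RESET statement can change the policy within the same step */
   reset undo_policy=optional;
   insert into work.accounts values (3, 75);
quit;

/* Per the text above, the system option itself is unchanged by the step */
proc options option=sqlundopolicy;
run;
```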
See Also PROC SQL Statement UNDO_POLICY option in the Base SAS Procedures Guide
STARTLIB System Option
Specifies whether SAS assigns user-defined permanent librefs when SAS starts.
Valid in: configuration file, SAS invocation
Category: Files: External files
PROC OPTIONS GROUP= EXTFILES
Syntax
STARTLIB | NOSTARTLIB
Syntax Description
STARTLIB
specifies that when SAS starts, SAS assigns user-defined permanent librefs. STARTLIB is the default for the windowing environment.
NOSTARTLIB
specifies that SAS is not to assign user-defined permanent librefs when SAS starts. NOSTARTLIB is the default for batch mode, interactive line mode, and noninteractive mode.
Details
You can assign a permanent libref only in the windowing environment, by using the New Library window and selecting the Enable at startup check box. SAS stores the permanent libref in the SAS registry. To open the New Library window, right-click Libraries in the Explorer window and select New. Alternatively, type DMLIBASSIGN in the command box.
In the windowing environment, SAS automatically assigns permanent librefs when SAS starts because STARTLIB is the default. In all other execution modes (batch, interactive line, and noninteractive), SAS assigns permanent librefs only when you start SAS with the STARTLIB option specified either on the command line or in the configuration file.
STEPCHKPT System Option
Specifies whether checkpoint-restart data is to be recorded for a batch program.
Valid in: configuration file, SAS invocation
Category: Environment control: Error handling
Requirement: can be used only in batch mode
PROC OPTIONS GROUP= ERRORHANDLING
Syntax
STEPCHKPT | NOSTEPCHKPT
Syntax Description
STEPCHKPT
enables checkpoint mode, which specifies to record checkpoint-restart data.
NOSTEPCHKPT
disables checkpoint mode, which specifies not to record checkpoint-restart data. This is the default.
Details Using the STEPCHKPT system option puts SAS in checkpoint mode for SAS programs that run in batch. Each time a DATA step or PROC step executes, SAS records data in a checkpoint-restart library. If a program terminates without completing, the program can be resubmitted, beginning with the step that was executing when the program terminated. To ensure that the checkpoint-restart data is accurate, when you specify the STEPCHKPT option, also specify the ERRORCHECK STRICT option and set the ERRORABEND option so that SAS terminates for most errors. Checkpoint mode is not valid for batch programs that contain the DM statement, which submits commands to SAS. If checkpoint mode is enabled and SAS encounters a DM statement, checkpoint mode is disabled and the checkpoint catalog entry is deleted.
See Also
System Options:
“STEPCHKPTLIB= System Option” on page 1956
“STEPRESTART System Option” on page 1957
“ERRORABEND System Option” on page 1846
“ERRORCHECK= System Option” on page 1848
Statement:
“CHECKPOINT EXECUTE_ALWAYS Statement” on page 1413
“Restarting Batch Programs” in SAS Language Reference: Concepts
STEPCHKPTLIB= System Option
Specifies the libref of the library where checkpoint-restart data is saved.
Valid in: configuration file, SAS invocation
Category: Environment control: Error handling
Requirement: can be used only in batch mode
PROC OPTIONS GROUP= ERRORHANDLING
Syntax
STEPCHKPTLIB=libref
Syntax Description
libref
specifies the libref that identifies the library where the checkpoint-restart data is saved.
Default: Work
Requirement: The LIBNAME statement that identifies the checkpoint-restart library must use the BASE engine and be the first statement in the batch program.
Details When the STEPCHKPT system option is specified, checkpoint-restart data for batch programs is saved in the libref that is specified in the STEPCHKPTLIB= system option. If no libref is specified, SAS uses the Work library to save checkpoint data. The LIBNAME statement that defines the libref must be the first statement in the batch program. If the Work library is used to save checkpoint data, the NOWORKTERM and NOWORKINIT system options must be specified so that the checkpoint-restart data is available when the batch program is resubmitted. These two options ensure that the Work library is saved when SAS ends and is restored when SAS starts. If the NOWORKTERM option is not specified, the Work library is deleted at the end of the SAS session and the checkpoint-restart data is lost. If the NOWORKINIT option is not specified, a new Work library is created when SAS starts, and again the checkpoint-restart data is lost. The STEPCHKPTLIB= option must be specified for any SAS session that accesses checkpoint-restart data that is not saved to the Work library.
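A batch program that saves checkpoint-restart data to a library other than Work might be organized as in the following sketch. The libref CHKPT, the directory path, the program name, and the invocation that is shown in the comment are all assumptions made for illustration. The exact command-line syntax depends on your operating environment.

```sas
/* Hypothetical batch program myprog.sas.                                    */
/* One possible invocation (names and paths are assumptions):                */
/*   sas -sysin myprog.sas -stepchkpt -stepchkptlib chkpt                    */
/*       -errorcheck strict -errorabend                                      */

/* The LIBNAME statement for the checkpoint-restart library must be the      */
/* first statement in the program and must use the BASE engine.              */
libname chkpt base 'c:\checkpoint';

/* Each DATA step or PROC step that executes is recorded in the library */
data work.step1;
   do i = 1 to 5;
      output;
   end;
run;

proc means data=work.step1;
   var i;
run;
```

If the Work library were used instead, the NOWORKTERM and NOWORKINIT options would also have to be specified, as described above.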
See Also
System Options:
“STEPCHKPT System Option” on page 1955
“STEPRESTART System Option” on page 1957
“WORKINIT System Option” on page 1996
“WORKTERM System Option” on page 1997
Statement:
“CHECKPOINT EXECUTE_ALWAYS Statement” on page 1413
“Restarting Batch Programs” in SAS Language Reference: Concepts
STEPRESTART System Option
Specifies whether to execute a batch program by using checkpoint-restart data.
Valid in: configuration file, SAS invocation
Category: Environment control: Error handling
Requirement: can be used only in batch mode
PROC OPTIONS GROUP= ERRORHANDLING
Syntax
STEPRESTART | NOSTEPRESTART
Syntax Description
STEPRESTART
enables restart mode, which specifies to execute the batch program by using the checkpoint-restart data.
NOSTEPRESTART
disables restart mode, which specifies not to execute the batch program using checkpoint-restart data.
Details You specify the STEPRESTART option when you want to resubmit a batch program that ran in checkpoint mode and terminated before it completed. When you resubmit the batch program, SAS determines from the checkpoint data which DATA step or PROC step was executing when the program terminated, and resumes executing the batch program by using that DATA or PROC step.
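Resubmitting a program in restart mode might look like the following sketch. The invocation in the comment, the libref CHKPT, the path, and the data set names are assumptions made for illustration. The CHECKPOINT EXECUTE_ALWAYS statement is described under Statements, and its use here, to mark a step that should run even when the program is restarted, is an illustrative choice.

```sas
/* Hypothetical batch program myprog.sas, resubmitted in restart mode.       */
/* One possible invocation (names and paths are assumptions):                */
/*   sas -sysin myprog.sas -stepchkpt -steprestart -stepchkptlib chkpt       */
/*       -errorcheck strict -errorabend                                      */

libname chkpt base 'c:\checkpoint';

/* The step that immediately follows is executed regardless of the           */
/* checkpoint-restart data.                                                  */
checkpoint execute_always;
data work.lookup;
   code = 1;
   label = 'one';
run;

/* On restart, SAS resumes with the step that was executing when the         */
/* program terminated.                                                       */
data work.result;
   set work.lookup;
   doubled = code * 2;
run;
```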
See Also
System Options:
“STEPCHKPT System Option” on page 1955
“STEPCHKPTLIB= System Option” on page 1956
Statement:
“CHECKPOINT EXECUTE_ALWAYS Statement” on page 1413
“Restarting Batch Programs” in SAS Language Reference: Concepts
SUMSIZE= System Option
Specifies a limit on the amount of memory that is available for data summarization procedures when class variables are active.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: System administration: Memory
PROC OPTIONS GROUP= MEMORY
Syntax
SUMSIZE=n | nK | nM | nG | nT | hexX | MIN | MAX
Syntax Description
n | nK | nM | nG | nT
specifies the amount of memory in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); 1,073,741,824 (gigabytes); or 1,099,511,627,776 (terabytes). When n=0, the default value, the amount of memory is determined by the values of the MEMSIZE option and the REALMEMSIZE option. Valid values for SUMSIZE= range from 0 to 2^(n–1), where n is the data width in bits (32 or 64) of the operating system.
hexX
specifies the amount of memory as a hexadecimal number. You must specify the value beginning with a number (0–9) and ending with an X. For example, a value of 0fffx specifies 4,095 bytes of memory.
MIN
specifies the minimum amount of memory available.
MAX
specifies the maximum amount of memory available.
Details The SUMSIZE= system option affects the MEANS, OLAP, REPORT, SUMMARY, SURVEYFREQ, SURVEYLOGISTIC, SURVEYMEANS, and TABULATE procedures. Proper specification of SUMSIZE= can improve procedure performance by restricting the swapping of memory that is controlled by the operating environment. Generally, the value of the SUMSIZE= system option should be less than the physical memory available to your process. If the procedure you are using needs more memory than you specify, the system creates a temporary utility file. If the value of SUMSIZE is greater than the values of the MEMSIZE option and the REALMEMSIZE option, SAS uses the values of the MEMSIZE option and REALMEMSIZE option.
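As a minimal sketch, the following program limits the memory that is available to a summarization procedure that uses a class variable. The limit of 256M is a hypothetical value, and the example assumes that the SASHELP.CLASS sample table is available.

```sas
/* Hypothetical limit: 256 * 1,048,576 bytes */
options sumsize=256m;

/* SUMSIZE= applies because a class variable is active */
proc means data=sashelp.class mean max;
   class sex;
   var height weight;
run;

/* Return to the default, where MEMSIZE and REALMEMSIZE determine the limit */
options sumsize=0;
```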
See Also
System Options:
“SORTSIZE= System Option” on page 1942
“MEMSIZE System Option” in the documentation for your operating environment
“REALMEMSIZE System Option” in the documentation for your operating environment
SVGCONTROLBUTTONS
Specifies whether to display the paging control buttons and an index in a multipage SVG document.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SVG
PROC OPTIONS GROUP= SVG
Syntax
SVGCONTROLBUTTONS | NOSVGCONTROLBUTTONS
Syntax Description
SVGCONTROLBUTTONS
specifies to display the paging control buttons in the SVG document.
NOSVGCONTROLBUTTONS
specifies not to display the paging control buttons in the SVG document. This is the default.
Details When SVGCONTROLBUTTONS is specified, the size of the SVG is increased to accommodate the script that controls paging in the SVG document. The SVGView printer sets the option to SVGCONTROLBUTTONS.
SVGHEIGHT= System Option
Specifies the height of the viewport unless the SVG output is embedded in another SVG output; specifies the value of the height attribute of the outermost <svg> element in the SVG file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SVG
PROC OPTIONS GROUP= SVG
Restriction: The SVGHEIGHT= option sets the height attribute only on the outermost <svg> element.
Syntax
SVGHEIGHT= number-of-units<unit-of-measure> | "" | ''
Syntax Description
number-of-units
specifies the height as a number of unit-of-measure.
Requirement: number-of-units must be a positive integer value.
Interaction: If number-of-units is a negative number, the SVG document is not rendered by the browser.
unit-of-measure
specifies the unit of measurement, which can be one of the following:
   %    percentage
   cm   centimeters
   em   the height of the element’s font
   ex   the height of the letter x
   in   inches
   mm   millimeters
   pc   picas
   pt   points
   px   pixels
Default: px
"" | ''
specifies to reset the height to the default value of 600 pixels.
Requirement: Use two double quotation marks or two single quotation marks with no space between them.
Details
For embedded <svg> elements, the SVGHEIGHT= option specifies the height of the rectangular region into which the <svg> element is placed. The SVG output is scaled to fit the viewBox if SVGHEIGHT="100%". If the SVGHEIGHT= option is not specified, the height attribute on the <svg> element is not set, which effectively provides full scalability by using a height of 100%. The value for the SVGHEIGHT= option can be specified without delimiters, enclosed in single or double quotation marks, or enclosed in parentheses.
Examples
The following OPTIONS statement specifies to size the SVG output to portrait letter-sized and to scale the output to 100% of the viewport:
   options printerpath=svg orientation=portrait svgheight="100%" svgwidth="100%" papersize=letter;
By using these option values, SAS creates the following <svg> element:
   <svg xmlns="http://www.w3.org/2000/svg"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xml:space="preserve" onload='Init(evt)'
        version="1.1" width="100%" height="100%"
        viewBox="0 0 850 1100">
The value of "100%" in the SVGHEIGHT= option specifies to scale the SVG output height to 100% of the viewport, which is based on the value of the PAPERSIZE= option. The paper size is letter in the portrait orientation, which has a height of 11" at 100 dpi.
See Also
System options:
“SVGCONTROLBUTTONS” on page 1959
“SVGPRESERVEASPECTRATIO= System Option” on page 1961
“SVGTITLE= System Option” on page 1964
“SVGVIEWBOX= System Option” on page 1965
“SVGWIDTH= System Option” on page 1967
“SVGX= System Option” on page 1969
“SVGY= System Option” on page 1970
“Using SAS System Options” on page 1768
“The SAS Registry” in SAS Language Reference: Concepts
“Creating Scalable Vector Graphics Using Universal Printing” in SAS Language Reference: Concepts
SVGPRESERVEASPECTRATIO= System Option
Specifies whether to force uniform scaling of SVG output; specifies the preserveAspectRatio attribute on the outermost <svg> element.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SVG
PROC OPTIONS GROUP= SVG
Restriction: The SVGPRESERVEASPECTRATIO= option sets the preserveAspectRatio attribute only on the outermost <svg> element.
Syntax
SVGPRESERVEASPECTRATIO=align | meetOrSlice | NONE | ""
SVGPRESERVEASPECTRATIO="align meetOrSlice"
Syntax Description
align
specifies to force uniform scaling by specifying the alignment method to use. Each of the following values for align forces uniform scaling and aligns the element’s viewBox with the viewport as described:
xMinYMin
Align the <min-x> of the element’s viewBox with the smallest X value of the viewport. Align the <min-y> of the element’s viewBox with the smallest Y value of the viewport.
xMidYMin
Align the midpoint X value of the element’s viewBox with the midpoint X value of the viewport. Align the <min-y> of the element’s viewBox with the smallest Y value of the viewport.
xMaxYMin
Align the <min-x>+<width> of the element’s viewBox with the maximum X value of the viewport. Align the <min-y> of the element’s viewBox with the smallest Y value of the viewport.
xMinYMid
Align the <min-x> of the element’s viewBox with the smallest X value of the viewport. Align the midpoint Y value of the element’s viewBox with the midpoint Y value of the viewport.
xMidYMid
Align the midpoint X value of the element’s viewBox with the midpoint X value of the viewport. Align the midpoint Y value of the element’s viewBox with the midpoint Y value of the viewport. This is the default.
xMaxYMid
Align the <min-x>+<width> of the element’s viewBox with the maximum X value of the viewport. Align the midpoint Y value of the element’s viewBox with the midpoint Y value of the viewport.
xMinYMax
Align the <min-x> of the element’s viewBox with the smallest X value of the viewport. Align the <min-y>+<height> of the element’s viewBox with the maximum Y value of the viewport.
xMidYMax
Align the midpoint X value of the element’s viewBox with the midpoint X value of the viewport. Align the <min-y>+<height> of the element’s viewBox with the maximum Y value of the viewport.
xMaxYMax
Align the <min-x>+<width> of the element’s viewBox with the maximum X value of the viewport. Align the <min-y>+<height> of the element’s viewBox with the maximum Y value of the viewport.
meetOrSlice
specifies to preserve the aspect ratio and how the viewBox displays. The following values are valid for meetOrSlice:
meet
specifies to scale the SVG graphic as follows:
- preserve the aspect ratio
- make the entire viewBox visible within the viewport
- scale up the viewBox as much as possible while meeting other criteria
If the aspect ratio of the graphic does not match the viewport, some of the viewport will extend beyond the bounds of the viewBox.
slice
specifies to scale the SVG graphic as follows:
- preserve the aspect ratio
- cover the entire viewport with the viewBox
- scale down the viewBox as much as possible while meeting other criteria
If the aspect ratio of the viewBox does not match the viewport, some of the viewBox will extend beyond the bounds of the viewport.
NONE
specifies not to force uniform scaling and to scale the SVG output nonuniformly so that the element’s bounding box exactly matches the viewport rectangle.
""
specifies to reset the preserveAspectRatio attribute of the <svg> element to the default value of xMidYMid meet.
Requirement: Use two double quotation marks with no space between them.
Details When the value of the SVGPRESERVEASPECTRATIO= option includes both align and meetOrSlice, you can delimit the value by using single or double quotation marks or parentheses. The preserveAspectRatio attribute applies only when a value is provided for the viewBox on the same element. If the viewBox attribute is not provided, the preserveAspectRatio attribute is ignored.
Examples
The following OPTIONS statements are examples of using the SVGPRESERVEASPECTRATIO= system option:
   options svgpreserveaspectratio=xMinYMax;
   options svgpreserveaspectratio="xMinYMin meet";
   options svgpreserveaspectratio=(xMinYMin meet);
   options svgpreserveaspectratio="";
See Also
System options:
“SVGCONTROLBUTTONS” on page 1959
“SVGHEIGHT= System Option” on page 1959
“SVGTITLE= System Option” on page 1964
“SVGWIDTH= System Option” on page 1967
“SVGVIEWBOX= System Option” on page 1965
“SVGX= System Option” on page 1969
“SVGY= System Option” on page 1970
“Creating Scalable Vector Graphics Using Universal Printing” in SAS Language Reference: Concepts
SVGTITLE= System Option
Specifies the title in the title bar of the SVG output; specifies the value of the <title> element in the SVG file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SVG
PROC OPTIONS GROUP= SVG
Syntax
SVGTITLE="title" | "" | ''
Syntax Description
"title"
specifies the title of the SVG.
"" | ''
specifies to reset the title to empty.
Requirement: Use two double quotation marks or two single quotation marks with no space between them.
Details
If the SVGTITLE= option is not specified, the title bar of the SVG output displays the filename of the SVG output. The value for the SVGTITLE= option must be enclosed in single or double quotation marks, or enclosed in parentheses.
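The following sketch sets a title before SVG output is created and then resets it. The title text and the output filename are hypothetical, and the use of the ODS PRINTER destination with the SVG printer is an assumption based on Universal Printing. The example also assumes that the SASHELP.CLASS sample table is available.

```sas
/* Hypothetical title for the SVG document */
options printerpath=svg svgtitle="Class Listing";

/* Create SVG output through Universal Printing (assumed destination and filename) */
ods listing close;
ods printer file="class.svg";
proc print data=sashelp.class;
run;
ods printer close;
ods listing;

/* Reset the title to empty; the title bar then shows the filename */
options svgtitle="";
```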
See Also
System options:
“SVGCONTROLBUTTONS” on page 1959
“SVGHEIGHT= System Option” on page 1959
“SVGPRESERVEASPECTRATIO= System Option” on page 1961
“SVGWIDTH= System Option” on page 1967
“SVGVIEWBOX= System Option” on page 1965
“SVGX= System Option” on page 1969
“SVGY= System Option” on page 1970
“Creating Scalable Vector Graphics Using Universal Printing” in SAS Language Reference: Concepts
SVGVIEWBOX= System Option
Specifies the coordinates, width, and height that are used to set the viewBox attribute on the outermost <svg> element, which enables SVG output to scale to the viewport.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SVG
PROC OPTIONS GROUP= SVG
Restriction: The SVGVIEWBOX= option sets the viewBox attribute only on the outermost <svg> element.
Syntax
SVGVIEWBOX="min-x min-y width height" | none | "" | ''
Syntax Description
min-x
specifies the beginning x coordinate of the viewBox, in user units.
Requirement: min-x can be 0, or a positive or negative integer value.
min-y
specifies the beginning y coordinate of the viewBox, in user units.
Requirement: min-y can be 0, or a positive or negative integer value.
width
specifies the width of the viewBox, in user units.
Requirement: width must be a positive integer value.
height
specifies the height of the viewBox, in user units.
Requirement: height must be a positive integer value.
none
specifies that no viewBox attribute is to be specified on the outermost <svg> element, which will effectively create a static SVG document.
"" | ''
specifies to reset the width and height of the viewBox to the width and height of the paper size for the SVG printer.
Requirement: Use two double quotation marks or two single quotation marks with no space between them.
Details
When the viewBox attribute is specified, the SVG output is scaled to be rendered in the viewport, and the current coordinate system is updated to the dimensions that are specified by the viewBox attribute. If the SVGVIEWBOX= option is not specified, SAS sets the width and height arguments of the viewBox attribute on the outermost <svg> element to the paper width and paper height as defined by the PAPERSIZE= system option.
The coordinates, width, and height of the viewBox attribute should be mapped to the coordinates, width, and height of the viewport, taking into account the values of the preserveAspectRatio attribute. The value for the SVGVIEWBOX= option must be enclosed in single or double quotation marks, or enclosed in parentheses.
You can use a negative value for min-x and min-y to place the SVG document in the output. A negative value of min-x shifts the output to the right. A negative value of min-y shifts the placement of the output downward.
Examples
The following OPTIONS statement specifies to scale the output to a width of 100 user units and a height of 200 user units:
   options printerpath=svg svgviewbox="0 0 100 200" dev=sasprtc;
By using these option values, SAS creates the following <svg> element:
   <svg xmlns="http://www.w3.org/2000/svg"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xml:space="preserve" onload='Init(evt)'
        version="1.1" viewBox="0 0 100 200">
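The effect of negative coordinates that is described in the Details section can be tried with a statement such as the following. The offsets of -100 and -50 user units are hypothetical, and the width and height of 850 by 1100 simply reuse the letter-size dimensions that appear elsewhere in this chapter.

```sas
/* Negative min-x shifts the output to the right; negative min-y shifts it downward */
options printerpath=svg svgviewbox="-100 -50 850 1100";

/* Omit the viewBox attribute altogether, which creates a static SVG document */
options svgviewbox=none;

/* Reset the viewBox to the width and height of the paper size */
options svgviewbox="";
```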
See Also
System options:
“SVGCONTROLBUTTONS” on page 1959
“SVGHEIGHT= System Option” on page 1959
“SVGPRESERVEASPECTRATIO= System Option” on page 1961
“SVGTITLE= System Option” on page 1964
“SVGWIDTH= System Option” on page 1967
“SVGX= System Option” on page 1969
“SVGY= System Option” on page 1970
Creating Scalable Vector Graphics Using Universal Printing in SAS Language Reference: Concepts
SVGWIDTH= System Option
Specifies the width of the viewport unless the SVG output is embedded in another SVG output; specifies the value of the width attribute in the outermost <svg> element in the SVG file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SVG
PROC OPTIONS GROUP= SVG
Restriction: The SVGWIDTH= option sets the width attribute only on the outermost <svg> element.
Syntax
SVGWIDTH= number-of-units<unit-of-measure> | "" | ''
Syntax Description
number-of-units
specifies the width as a number of unit-of-measure.
Requirement: number-of-units must be a positive integer value.
Interaction: If number-of-units is a negative number, the SVG document is not rendered by the browser.
unit-of-measure
specifies the unit of measurement, which can be one of the following:
%     percentage
cm    centimeters
em    the height of the element's font
ex    the height of the letter x
in    inches
mm    millimeters
pc    picas
pt    points
px    pixels
Default: px
"" | ''
specifies to reset the width to the default value of 800 pixels.
Requirement: Use two double quotation marks or two single quotation marks with no space between them.
Details
For embedded <svg> elements, the SVGWIDTH= option specifies the width of the rectangular region into which the <svg> element is placed.
The SVG output is scaled to fit the viewBox if SVGWIDTH="100%". If the SVGWIDTH= option is not specified, the width attribute on the <svg> element is not set, which effectively provides full scalability by using a width of 100%.
The value for the SVGWIDTH= option can be specified without delimiters, enclosed in single or double quotation marks, or enclosed in parentheses.
Examples
The following OPTIONS statement specifies to size the SVG output to portrait letter-sized and to scale the output to 100% of the viewport:
options printerpath=svg orientation=portrait svgheight="100%" svgwidth="100%" papersize=letter;
By using these option values, SAS creates the following <svg> element:
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xml:space="preserve" onload='Init(evt)' version="1.1" width="100%" height="100%" viewBox="0 0 850 1100">
The value of "100%" in the SVGWIDTH= option specifies to scale the SVG output width to 100% of the viewport, which is based on the value of the PAPERSIZE= option. The paper size is letter in the portrait orientation, which has a width of 8.5" at 96 dpi.
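The delimiter and unit rules for SVGWIDTH= can be sketched as follows. The numeric values and the choice of inches are assumptions made for illustration only.

```sas
/* Width in the default unit of measure (px), without delimiters. */
options printerpath=svg svgwidth=400;

/* Width with an explicit unit of measure, in quotation marks. */
options svgwidth="8in";

/* Reset the width to its default value. */
options svgwidth="";
```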
See Also
System options:
“SVGCONTROLBUTTONS” on page 1959
“SVGHEIGHT= System Option” on page 1959
“SVGPRESERVEASPECTRATIO= System Option” on page 1961
“SVGTITLE= System Option” on page 1964
“SVGVIEWBOX= System Option” on page 1965
“SVGX= System Option” on page 1969
“SVGY= System Option” on page 1970
“Using SAS System Options” on page 1768
The SAS Registry in SAS Language Reference: Concepts
Creating Scalable Vector Graphics Using Universal Printing in SAS Language Reference: Concepts
SVGX= System Option
Specifies the x-axis coordinate of one corner of the rectangular region into which an embedded <svg> element is placed; specifies the x attribute in the outermost <svg> element in an SVG file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SVG
PROC OPTIONS GROUP= SVG
Restriction: The SVGX= option sets the x attribute only on the outermost <svg> element.
Syntax
SVGX= number-of-units<unit-of-measure> | "" | ''
Syntax Description
number-of-units
specifies the x-axis coordinate as a number of unit-of-measure.
unit-of-measure
specifies the unit of measurement, which can be one of the following:
%     percentage
cm    centimeters
em    the height of the element's font
ex    the height of the letter x
in    inches
mm    millimeters
pc    picas
pt    points
px    pixels
Default: px
"" | ''
specifies to reset the x attribute to 0 on the <svg> element and the x-axis coordinate for embedded SVG to 0.
Requirement: Use two double quotation marks or two single quotation marks with no space between them.
Details
If the SVGX= option is not set, the x attribute on the <svg> element effectively has a value of 0 and no x-axis coordinate is set for embedded SVG output.
The value for the SVGX= option can be specified without delimiters, enclosed in single or double quotation marks, or enclosed in parentheses.
The x attribute on the outermost <svg> element has no effect on SVG documents that are produced by SAS. You can use the SVGX= system option to specify the x-axis coordinate if the SVG document is processed outside of SAS.
See Also
System options:
“SVGCONTROLBUTTONS” on page 1959
“SVGHEIGHT= System Option” on page 1959
“SVGPRESERVEASPECTRATIO= System Option” on page 1961
“SVGTITLE= System Option” on page 1964
“SVGWIDTH= System Option” on page 1967
“SVGVIEWBOX= System Option” on page 1965
“SVGY= System Option” on page 1970
Creating Scalable Vector Graphics Using Universal Printing in SAS Language Reference: Concepts
SVGY= System Option
Specifies the y-axis coordinate of one corner of the rectangular region into which an embedded <svg> element is placed; specifies the y attribute in the outermost <svg> element in an SVG file.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: SVG
PROC OPTIONS GROUP= SVG
Restriction: The SVGY= option sets the y attribute only on the outermost <svg> element.
Syntax
SVGY= number-of-units<unit-of-measure> | "" | ''
Syntax Description
number-of-units
specifies the y-axis coordinate as a number of unit-of-measure.
unit-of-measure
specifies the unit of measurement, which can be one of the following:
%     percentage
cm    centimeters
em    the height of the element's font
ex    the height of the letter x
in    inches
mm    millimeters
pc    picas
pt    points
px    pixels
Default: px
"" | ''
specifies to reset the y attribute on the <svg> element and the y-axis coordinate for embedded SVG output to 0.
Requirement: Use two double quotation marks or two single quotation marks with no space between them.
Details
If the SVGY= option is not set, the y attribute on the <svg> element effectively has a value of 0 and no y-axis coordinate is set for embedded SVG output.
The value for the SVGY= option can be specified without delimiters, enclosed in single or double quotation marks, or enclosed in parentheses.
The y attribute on the outermost <svg> element has no effect on SVG documents that are produced by SAS. You can use the SVGY= system option to specify the y-axis coordinate if the SVG document is processed outside of SAS.
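As a brief sketch of how the coordinate options might be set together for an SVG document that is processed outside of SAS, consider the following. The coordinate values and units are arbitrary assumptions.

```sas
/* Assumed coordinates for a consumer that embeds the SVG output. */
options printerpath=svg svgx="25px" svgy="2cm";

/* Reset both coordinates to 0. */
options svgx="" svgy="";
```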
See Also
System options:
“SVGCONTROLBUTTONS” on page 1959
“SVGHEIGHT= System Option” on page 1959
“SVGPRESERVEASPECTRATIO= System Option” on page 1961
“SVGTITLE= System Option” on page 1964
“SVGWIDTH= System Option” on page 1967
“SVGVIEWBOX= System Option” on page 1965
“SVGX= System Option” on page 1969
Creating Scalable Vector Graphics Using Universal Printing in SAS Language Reference: Concepts
SYNTAXCHECK System Option
In non-interactive or batch SAS sessions, specifies whether to enable syntax check mode for multiple steps.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax
SYNTAXCHECK | NOSYNTAXCHECK
Syntax Description
SYNTAXCHECK
enables syntax check mode for statements that are submitted within a non-interactive or batch SAS session.
NOSYNTAXCHECK
does not enable syntax check mode for statements that are submitted within a non-interactive or batch SAS session.
CAUTION: Setting NOSYNTAXCHECK might cause a loss of data. Manipulating and deleting data by using untested code might result in a loss of data if your code contains invalid syntax. Be sure to test code completely before placing it in a production environment.
Details
If a syntax or semantic error occurs in a DATA step after the SYNTAXCHECK option is set, then SAS enters syntax check mode, which remains in effect from the point where SAS encountered the error to the end of the code that was submitted. After SAS enters syntax check mode, all subsequent DATA step statements and PROC step statements are validated.
While in syntax check mode, only limited processing is performed. For a detailed explanation of syntax check mode, see “Syntax Check Mode” in the section “Error Processing in SAS” in SAS Language Reference: Concepts.
Place the OPTIONS statement that enables SYNTAXCHECK before the step for which you want it to take effect. If you place the OPTIONS statement inside a step, then SYNTAXCHECK will not take effect until the beginning of the next step.
NOSYNTAXCHECK enables continuous processing of statements regardless of syntax error conditions.
SYNTAXCHECK is ignored in the SAS windowing environment and in SAS line-mode sessions.
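The placement rule can be illustrated with a minimal batch program. The data set names, variable names, and the deliberate syntax error are all hypothetical, and the exact log messages are not shown because they depend on the session.

```sas
/* Enable syntax check mode before the first step that it should cover. */
options syntaxcheck;

data work.scores;
   set work.raw_scores;      /* hypothetical input data set */
   total = test1 + test2     /* deliberate error: missing semicolon */
run;

/* In a batch or non-interactive session, SAS is now in syntax check */
/* mode, so this step is validated with only limited processing.     */
proc print data=work.scores;
run;
```

Submitting the same program with OPTIONS NOSYNTAXCHECK would let later steps continue to execute despite the error, which is the risk that the CAUTION in the syntax description refers to.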
Comparisons
You use the SYNTAXCHECK system option to validate syntax in a non-interactive or a batch SAS session. You use the DMSSYNCHK system option to validate syntax in an interactive session by using the SAS windowing environment.
The ERRORCHECK= option can be set to enable or disable syntax check mode for the LIBNAME statement, the FILENAME statement, the %INCLUDE statement, and the LOCK statement in SAS/SHARE. If you specify the NOSYNTAXCHECK option and the ERRORCHECK=STRICT option, then SAS does not enter syntax check mode when an error occurs.
See Also
System Options:
“DMSSYNCHK System Option” on page 1834
“ERRORCHECK= System Option” on page 1848
“Error Processing in SAS” in the section “Error Processing and Debugging” in SAS Language Reference: Concepts
SYSPRINTFONT= System Option
Specifies the default font to use for printing, which can be overridden by explicitly specifying a font and an ODS style.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: Procedure output
PROC OPTIONS GROUP= LISTCONTROL
See: SYSPRINTFONT= System Option in the documentation for your operating environment.
Syntax
SYSPRINTFONT=("face-name" <weight> <style> <character-set> <point-size> <NAMED "printer-name" | UPRINT="printer-name" | DEFAULT | ALL>)
Syntax Description
"face-name"
specifies the name of the font face to use for printing.
Requirement: If face-name consists of more than one word, you must enclose the value in single or double quotation marks. The quotation marks are stored with the face-name.
Requirement: When you use the SYSPRINTFONT= option with multiple arguments, you must enclose the arguments in parentheses.
Interaction: When you specify UPRINT=printer-name, face-name must be a valid font for printer-name.
weight
specifies the weight of the font, such as BOLD. A list of valid values for your specified printer appears in the SAS: Printer Properties window.
Default: NORMAL
style
specifies the style of the font, such as Italic. A list of valid values for your specified printer appears in the SAS: Printer Properties window.
Default: REGULAR
character-set
specifies the character set to use for printing.
Default: If the font does not support the specified character set, the default character set is used. If the default character set is not supported by the font, the font's default character set is used.
Range: Valid values are listed in the SAS: Printer Properties window, under the Font tab.
point-size
specifies the point size to use for printing. If you omit this argument, SAS uses the default.
Requirement: Point-size must be an integer. It must also be placed after the face-name, weight, style, and character-set arguments.
NAMED "printer-name"
specifies a printer in the Windows operating environment to which these settings apply.
Restriction: This argument is valid only for printers in the Windows operating environment. To specify a Universal Printer, use the UPRINT= argument.
Requirement: The printer-name must exactly match the name shown in the Print Setup dialog box (except that the printer name is not case sensitive).
Requirement: If the printer name is more than one word, the printer-name must be enclosed in double quotation marks. The quotation marks are stored with the printer-name.
UPRINT="printer-name"
specifies a Universal Printer to which these settings apply.
Restriction: This argument is valid only for printers that are listed in the SAS Registry.
Requirement: The printer-name must exactly match the name shown in the Print Setup dialog box (except that the printer name is not case sensitive).
Requirement: If the printer-name is more than one word, it must be enclosed in single or double quotation marks. The quotation marks are stored with the printer-name.
DEFAULT | ALL
specifies whether the font settings apply to the default printer or to all printers:
DEFAULT specifies that the font settings apply to the current default printer that is specified by the SYSPRINT= system option.
ALL specifies that the font settings apply to all installed printers.
Details
The SYSPRINTFONT= system option sets the font to use when printing to the current default printer, to a specified printer, or to all printers.
In some cases, you might need to specify the font from a SAS program. In this case, you might want to view the SAS: Printer Properties window for allowable names, styles, weights, and sizes for your fonts. For examples of how to apply the SYSPRINTFONT= option in a SAS program, see “Examples” on page 1975.
If you specified SYSPRINTFONT= with DEFAULT or without a keyword and later use the Print Setup dialog box to change the current default printer, then the font used with the current default printer will be the font that was specified with SYSPRINTFONT=, if the specified font exists on the printer. If the current printer does not support the specified font, the printer's default font is used.
The following fonts are widely supported:
• Helvetica
• Times
• Courier
• Symbol
By specifying one of these fonts in a SAS program, you can usually avoid an error. If that particular font is not supported, a similar-looking font prints in its place.
All Universal printers and many SAS/GRAPH devices use the FreeType engine to render TrueType fonts. For more information, see Using TrueType Fonts with Universal Printing and SAS/GRAPH Devices in SAS Language Reference: Concepts.
Note: As an alternative to using the SYSPRINTFONT= system option, you can set fonts with the SAS: Printer Properties window, under the Font tab. From the drop-down menu select File > Print Setup > Properties > Font. Using a dialog box is fast and easy because you choose your font, style, weight, size, and character set from a list of options that your selected printer supports.
Examples
Specifying a Font to the Default Printer
This example specifies the 12-point Times font on the default printer:
options sysprintfont=("times" 12);
Specifying a Font to a Named Windows Printer
This example specifies to use Courier on the printer named HP LaserJet IIIsi Postscript. Specify the printer name in the same way that it is specified in the SAS Print Setup dialog box:
options sysprintfont=("courier" named "hp laserjet IIIsi postscript");
Specifying a Font to a Universal Printer, on the SAS command line
This example specifies the 11-point Courier font for the PDF Universal Printer:
sysprintfont=('courier' 11 uprint='PDF')
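A further sketch shows the argument order when weight and style are also given. It assumes that BOLD and ITALIC are valid weight and style values for the printers in question; the valid values for a given printer are listed in the SAS: Printer Properties window.

```sas
/* Assumed values: 10-point bold italic Courier on all installed printers. */
/* Point size follows face-name, weight, and style; the keyword is last.   */
options sysprintfont=("courier" bold italic 10 all);
```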
TERMINAL System Option
Specifies whether to associate a terminal with a SAS session.
Valid in: configuration file, SAS invocation
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax
TERMINAL | NOTERMINAL
Syntax Description
TERMINAL
specifies that SAS evaluate the execution environment and, if a physical display is not available for an interactive environment, set the option to NOTERMINAL. Specify TERMINAL when you use the SAS windowing environment.
NOTERMINAL
specifies that SAS not evaluate the execution environment.
Details
SAS defaults to the appropriate setting for the TERMINAL system option based on whether the session is invoked in the foreground or the background. If NOTERMINAL is specified, dialog boxes are not displayed.
The TERMINAL option is normally used with the following execution modes:
• SAS windowing environment mode
• interactive line mode
• noninteractive mode.
TERMSTMT= System Option
Specifies the SAS statements to execute when SAS terminates.
Valid in: configuration file, SAS invocation
Category: Environment control: Initialization and operation
PROC OPTIONS GROUP= EXECMODES
Syntax
TERMSTMT='statement(s)'
Syntax Description
'statement(s)'
is one or more SAS statements.
Maximum length: 2,048 characters
Operating Environment Information: In some operating system environments there is a limit to the size of the value for TERMSTMT=. To circumvent this limitation, you can use the %INCLUDE statement.
Details
TERMSTMT= is fully supported in batch mode. In interactive modes, TERMSTMT= is executed only when you submit the ENDSAS statement from an editor window to terminate the SAS session. Terminating SAS by any other means in interactive mode results in TERMSTMT= not being executed.
An alternate method for specifying TERMSTMT= is to put a %INCLUDE statement at the end of a batch file or to submit a %INCLUDE statement before terminating the SAS session in interactive mode.
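The two approaches might look as follows. This is a sketch under stated assumptions: the file names nightly.sas and cleanup.sas are hypothetical, and the invocation line is shown only as a comment because TERMSTMT= is set at startup and the command-line form depends on the operating environment.

```sas
/* Hypothetical batch invocation that sets TERMSTMT= at startup:          */
/*    sas -sysin nightly.sas -termstmt '%put NOTE: nightly job finished;' */

/* Display the value of TERMSTMT= that is in effect for the session. */
proc options option=termstmt;
run;

/* Alternate method: end the batch file with a %INCLUDE statement. */
%include 'cleanup.sas';   /* hypothetical file of termination statements */
```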
Comparisons
TERMSTMT= specifies the SAS statements to be executed at SAS termination, and INITSTMT= specifies the SAS statements to be executed at SAS initialization.
See Also
System Option:
“INITSTMT= System Option” on page 1870
Statement:
“%INCLUDE Statement” on page 1534
TEXTURELOC= System Option
Specifies the location of textures and images that are used by ODS styles.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax
TEXTURELOC=location
Syntax Description
location
specifies the location of textures and images used by ODS styles. Location can refer either to the physical name of the directory or to a URL reference to the directory.
Requirement: If location is not a fileref, then you must enclose the value in quotation marks.
Restriction: Only one location is allowed per statement.
Requirement: The files in the directory must be in the form of gif, jpeg, or bitmap.
Requirement: Location must refer to a directory.
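A minimal sketch of the two kinds of location follows. Both the directory path and the URL are hypothetical placeholders.

```sas
/* Hypothetical directory that holds gif, jpeg, or bitmap files. */
options textureloc="C:\ods\textures";

/* Hypothetical URL reference to a directory of images. */
options textureloc="http://www.example.com/textures/";
```

Each statement names exactly one location, which is enclosed in quotation marks because it is not a fileref.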
See Also
“Dictionary of ODS Language Statements” in SAS Output Delivery System: User's Guide.
THREADS System Option
Specifies that SAS use threaded processing if it is available.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: System administration: Performance
PROC OPTIONS GROUP= PERFORMANCE
Syntax
THREADS | NOTHREADS
Syntax Description
THREADS
specifies to use threaded processing for SAS applications that support it.
Interaction: If THREADS is specified either as a SAS system option or in PROC SORT and another program has the input SAS data set open for reading, writing, or updating using the SPD engine, then the procedure might fail and write a subsequent message to the SAS log.
NOTHREADS
specifies not to use threaded processing for running SAS applications that support it.
Interaction: When you specify NOTHREADS, CPUCOUNT= is ignored unless you specify a procedure option that overrides the NOTHREADS system option.
Details
The THREADS system option enables some legacy SAS processes that are thread-enabled to take advantage of multiple CPUs by threading the processing and I/O operations. Threading the processing and I/O operations achieves a degree of parallelism that generally reduces the real time to completion for a given operation at the possible cost of additional CPU resources. In SAS 9 and SAS 9.1, the thread-enabled processes include
• Base SAS engine indexing
• Base SAS procedures: SORT, SUMMARY, MEANS, REPORT, TABULATE, and SQL
• SAS/STAT procedures: GLM, LOESS, REG, ROBUSTREG.
In some cases, for example, when processing small data sets, SAS might determine to use a single-threaded operation.
Set this option to NOTHREADS to achieve SAS behavior most compatible with releases before SAS 9, if you find that threading does not improve performance or if threading might be related to an unexplainable problem.
See the specific documentation for each product to determine whether it has functionality that is enabled by the THREADS option.
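As a brief sketch, the option can be switched on before a thread-enabled procedure and off again afterward. The data set names and the CPU count of 4 are assumptions made for illustration.

```sas
/* Allow thread-enabled procedures to use threaded processing. */
options threads cpucount=4;   /* 4 is an assumed number of CPUs */

/* PROC SORT is one of the thread-enabled Base SAS procedures. */
proc sort data=work.sales out=work.sales_sorted;   /* hypothetical data sets */
   by region;
run;

/* Return to single-threaded behavior, for example while */
/* investigating a problem that might involve threading. */
options nothreads;
```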
Comparisons
The system option THREADS determines when threaded processing is in effect. The SAS system option CPUCOUNT= suggests how many system CPUs are available for use by thread-enabled SAS procedures.
See Also
System Option:
“CPUCOUNT= System Option” on page 1819
“UTILLOC= System Option” on page 1984
“Support for Parallel Processing” in SAS Language Reference: Concepts.
TOOLSMENU System Option
Specifies whether the Tools menu is included in SAS windows.
Default: TOOLSMENU
Valid in: configuration file, SAS invocation
Category: Environment control: Display
PROC OPTIONS GROUP= ENVDISPLAY
Syntax
TOOLSMENU | NOTOOLSMENU
Syntax Description
TOOLSMENU
specifies that the Tools menu is included in SAS windows.
NOTOOLSMENU
specifies that the Tools menu is not included in SAS windows.
TOPMARGIN= System Option
Specifies the print margin at the top of the page for output directed to an ODS printer destination.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax
TOPMARGIN= margin-size<margin-unit>
Syntax Description
margin-size
specifies the size of the margin.
Restriction: The bottom margin should be small enough so that the top margin plus the bottom margin is less than the height of the paper.
Interactions: Changing the value of this option might result in changes to the value of the PAGESIZE= system option.
margin-unit
specifies the units for margin-size. The margin-unit can be in for inches or cm for centimeters. The margin-unit is saved as part of the value of the TOPMARGIN system option.
Default: inches
Details
All margins have a minimum that is dependent on the printer and the paper size. The default value of the TOPMARGIN system option is 0.00 in.
Operating Environment Information: Most SAS system options are initialized with default settings when SAS is invoked. However, the default settings and option values for some SAS system options might vary both by operating environment and by site. For details, see the SAS documentation for your operating environment.
For additional information on declaring an ODS printer destination, see the ODS statements in the SAS Output Delivery System: User's Guide.
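A short sketch of the margin-size and margin-unit forms follows. The margin values are arbitrary; the only constraint taken from the text is that the top and bottom margins together must be less than the height of the paper.

```sas
/* One-inch top and bottom margins for ODS printer output. */
options topmargin=1in bottommargin=1in;

/* The same option with the margin given in centimeters. */
options topmargin=2.5cm;
```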
See Also
System Options:
“BOTTOMMARGIN= System Option” on page 1795
“LEFTMARGIN= System Option” on page 1877
“RIGHTMARGIN= System Option” on page 1927
TRAINLOC= System Option
Specifies the URL for SAS online training courses.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
Syntax
TRAINLOC="base-URL"
Syntax Description
base-URL
specifies the address where the SAS online training courses are located.
Details
The TRAINLOC= system option specifies the base location (typically a URL) of SAS online training courses. These online training courses are typically accessed from an intranet server or a local CD-ROM.
Examples
Some examples of the base-URL are:
• "file://e:\onlintut"
• "http://server.abc.com/SAS/sastrain"
UNIVERSALPRINT System Option
Specifies whether to enable Universal Printing services.
Valid in: configuration file, SAS invocation
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax
UNIVERSALPRINT | NOUNIVERSALPRINT
Syntax Description
UNIVERSALPRINT
routes all printing through the Universal Print services.
Alias: UPRINT
NOUNIVERSALPRINT
disables printing through the Universal Print services.
Alias: NOUPRINT
Details
Universal Printing services provide interactive and batch printing capabilities to SAS applications and procedures. The ODS PRINTER destination uses Universal Print services whenever the UNIVERSALPRINT option is enabled.
See Also
“Printing with SAS” in SAS Language Reference: Concepts
UPRINTCOMPRESSION System Option
Specifies whether to enable compression of files created by some Universal Printers and SAS/GRAPH devices.
Alias: UPC | NOUPC
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Log and procedure output control: ODS Printing
PROC OPTIONS GROUP= ODSPRINT
Syntax
UPRINTCOMPRESSION | NOUPRINTCOMPRESSION
Syntax Description
UPRINTCOMPRESSION
specifies to enable compression of files created by some Universal Printers and some SAS/GRAPH devices. This is the default.
NOUPRINTCOMPRESSION
specifies to disable compression of files created by some Universal Printers and some SAS/GRAPH devices.
Details
The following table lists the Universal Printers and the SAS/GRAPH devices that are affected by the UPRINTCOMPRESSION system option:
Universal Printers      SAS/GRAPH Device Drivers
PCL5, PCL5C, PCL5E      PCL5, PCL5C, PCL5E
PDF                     PDF, PDFA, PDFC
SVGZ                    SVGZ
PS                      SASPRTC, SASPRTG, SASPRTM
                        UEPS, UPSC, UPCL5, UPCL5C, UPCL5E, UPDF, UPSL, UPSLC
When NOUPRINTCOMPRESSION is set, the DEFLATION= option is ignored. The ODS PRINTER statement option, COMPRESS=, takes precedence over the UPRINTCOMPRESSION system option.
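The interaction with DEFLATION= can be sketched as follows. The deflation level of 6 is an assumed value chosen for illustration.

```sas
/* Disable compression; any DEFLATION= setting is then ignored. */
options nouprintcompression;

/* Re-enable compression (the default) and set an assumed compression level. */
options uprintcompression deflation=6;
```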
See Also
System options:
“DEFLATION= System Option” on page 1824
Statements:
“ODS PRINTER Statement” in the SAS Output Delivery System: User's Guide
USER= System Option
Specifies the default permanent SAS library.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
See: USER= System Option in the documentation for your operating environment.
Syntax
USER= library-specification
Syntax Description
library-specification
specifies the libref or physical name of a SAS library.
Details
If this option is specified, you can use one-level names to reference permanent SAS files in SAS statements. However, if USER=WORK is specified, SAS assumes that files referenced with one-level names refer to temporary work files.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
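The effect on one-level names can be sketched with a small program. The libref PROJ and its path are hypothetical, and the directory is assumed to exist.

```sas
libname proj 'C:\projects\sales';   /* hypothetical permanent library */
options user=proj;

/* The one-level name now refers to the permanent file PROJ.MONTHLY. */
data monthly;
   x = 1;
run;

/* With USER=WORK, one-level names refer to temporary work files again. */
options user=work;

data scratch;   /* WORK.SCRATCH */
   x = 2;
run;
```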
UTILLOC= System Option
Specifies one or more file system locations in which applications can store utility files.
Valid in: configuration file and SAS invocation
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
See: UTILLOC= System Option in the documentation for your operating environment.
Syntax
UTILLOC= WORK | location | (location-1 ... location-n)
Syntax Description
WORK
specifies that SAS creates utility files in the same directory as the Work library. This is the default.
location
specifies the location of an existing directory for utility files that are created by applications. Enclose location in single or double quotation marks when the location contains spaces.
Operating Environment Information: On z/OS, each location is a list of DCB and SMS options to be used when creating utility files.
(location-1 ... location-n)
specifies a list of existing directories that can be accessed in parallel for utility files that are created by applications. A single utility file cannot span locations. Enclose a location in single or double quotation marks when the location contains spaces. Any location that does not exist is deleted from the value of the UTILLOC= system option.
Operating Environment Information: On z/OS, each location is a list of DCB and SMS options to be used when creating utility files.
Requirement: If you have more than one location, then you must enclose the list of locations in parentheses.
Details
Thread-enabled SAS applications are able to create temporary utility files that can be accessed in parallel by separate threads.
For the SORT procedure, the UTILLOC= system option affects the placement of the utility files only if the multi-threaded SAS sort is used. The multi-threaded SAS sort can be invoked when the THREADS system option is specified and the value of the CPUCOUNT= system option is greater than 1. The multi-threaded SAS sort can also be invoked when you specify the THREADS option in the PROC SORT statement.
The multi-threaded sort stores all temporary data in a single utility file within one of the locations that are specified by the UTILLOC= system option. The size of this utility file is proportional to the amount of data that is read from the input data set. A second utility file of the same size can be created in another of these locations when the amount of data that is read from the input data set is large or the amount of memory that is available to the SORT procedure is small.
See Also
System Option:
“CPUCOUNT= System Option” on page 1819
“THREADS System Option” on page 1978
The SORT Procedure in Base SAS Procedures Guide
“Support for Parallel Processing” in SAS Language Reference: Concepts.
UUIDCOUNT= System Option
Specifies the number of UUIDs to acquire from the UUID Generator Daemon.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
Syntax
UUIDCOUNT= n | MIN | MAX
Syntax Description
n
specifies the number of UUIDs to acquire. Zero indicates that the UUID Generator Daemon is not required.
Range: 0–1000
Default: 100
MIN | MAX
MIN specifies that the number of UUIDs to acquire is zero, indicating that the UUID Generator Daemon is not required.
MAX specifies that 1000 UUIDs at a time should be acquired from the UUID Generator Daemon.
Details
If a SAS application will generate a large number of UUIDs, this value can be adjusted at any time during a SAS session to reduce the number of times that the SAS session would have to contact the SAS UUID Generator Daemon.
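As a sketch of adjusting the value before a step that requests many identifiers, consider the following. The count of 500, the loop size, and the data set name are assumptions; whether a UUID Generator Daemon is contacted at all depends on the operating environment and on the UUIDGENDHOST= setting.

```sas
/* Acquire UUIDs in larger batches before generating many of them. */
options uuidcount=500;   /* assumed value within the range 0-1000 */

data work.ids;           /* hypothetical output data set */
   length id $36;
   do i = 1 to 2000;
      id = uuidgen();    /* UUIDGEN function returns a UUID */
      output;
   end;
   drop i;
run;
```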
See Also
System Option:
“UUIDGENDHOST= System Option” on page 1986
Function:
“UUIDGEN Function” on page 1140
“Universal Unique Identifiers and the Object Spawner” in SAS Language Reference: Concepts
UUIDGENDHOST= System Option
Identifies the host and port or the LDAP URL that the UUID Generator Daemon runs on.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
Syntax
UUIDGENDHOST= 'host-string'
Syntax Description
'host-string'
is either of the form hostname:port or an LDAP URL. The value must be in one string. Enclose an LDAP URL string with quotation marks.
Details
SAS does not guarantee that all UUIDs are unique. Use the SAS UUID Generator Daemon (UUIDGEN) to ensure unique UUIDs.
Examples
• Specifying hostname:port as the 'host-string':
sas -UUIDGENDHOST 'myhost.com:5306'
or
sas UUIDGENDHOST= 'myhost.com:5306'
• Specifying an LDAP URL as the 'host-string':
"ldap://ldap-hostname/sasspawner-distinguished-name"
• A more detailed example of an LDAP URL as the 'host-string':
"ldap://ldaphost/sasSpawnercn=UUIDGEND,sascomponent=sasServer,cn=ABC,o=ABC Inc,c=US"
• Specifying your binddn and password, if your LDAP server is secure:
"ldap://ldap-hostname/sasSpawner-distinguished-name????bindname=binddn,password=bind-password"
• An example with a bindname value and a password value:
"ldap://ldaphost/sasSpawnercn=UUIDGEND,sascomponent=sasServer,cn=ABC,o=ABC Inc,c=US????bindname=cn=me%2co=ABC Inc%2cc=US,password=itsme"
Note: When specifying your bindname and password, commas that are a part of your bindname and your password must be replaced with the string "%2c". In the previous example, the bindname is as follows:
cn=me,o=ABC Inc,c=US
4
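The comma replacement described in the note can be sketched in a DATA step. This is an illustrative helper only, not part of the UUIDGENDHOST= syntax. The variable names bindname and encoded are hypothetical.

```sas
data _null_;
   length bindname encoded $200;
   bindname = 'cn=me,o=ABC Inc,c=US';
   /* Single quotation marks keep the macro processor from acting on %2c */
   encoded = tranwrd(bindname, ',', '%2c');
   put encoded=;   /* writes encoded=cn=me%2co=ABC Inc%2cc=US */
run;
```

The encoded value is what follows bindname= in the LDAP URL.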
See Also System Option: “UUIDCOUNT= System Option” on page 1985 Function: “UUIDGEN Function” on page 1140
V6CREATEUPDATE= System Option
Specifies the type of message to write to the SAS log when Version 6 data sets are created or updated.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= SASFILES
Syntax V6CREATEUPDATE = ERROR | NOTE | WARNING | IGNORE
Syntax Description
ERROR
specifies that an ERROR is written to the SAS log when the V6 engine is used to open a SAS data set for creation or update. The attempt to create or update a SAS data set in Version 6 format will fail. Reading Version 6 data sets will not generate an error.
NOTE
specifies that a NOTE is written to the SAS log when the V6 engine is used; all other processing occurs normally.
WARNING
specifies that a WARNING is written to the SAS log when the V6 engine is used; all other processing occurs normally.
IGNORE
disables the V6CREATEUPDATE= system option. Nothing is written to the SAS log when the V6 engine is used.
VALIDFMTNAME= System Option
Specifies the maximum size (32 characters or 8 characters) that user-created format and informat names can be before an error or warning is issued.
Default: LONG
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax VALIDFMTNAME=LONG | FAIL | WARN
Syntax Description
LONG
specifies that format and informat names can be up to 32 alphanumeric characters. This is the default.
FAIL
specifies that creating a format or informat name that is longer than eight characters results in an error message.
Tip: Specify this setting for using formats and informats that are valid in both SAS 9 and previous releases of SAS.
Interaction: If you explicitly specify the V7 or V8 Base SAS engine, such as in a LIBNAME statement, then SAS automatically uses the VALIDFMTNAME=FAIL behavior for data sets that are associated with those engines.
WARN
specifies that creating a format or informat name that is longer than eight characters results in a warning message to remind you that the format or informat cannot be used with releases before SAS 9.
Details SAS 9 enables you to define format and informat names up to 32 characters. Previous releases were limited to eight characters. The VALIDFMTNAME= system option applies to format and informat names in both data sets and format catalogs. VALIDFMTNAME= does not control the length of format and informat names. It only controls the length of format and informat names that you associate with variables when you create a SAS data set.
If a SAS data set has a variable with a long format or informat name, which means that a release before SAS 9 cannot read it, then you can remove the long name so that the data set can be accessed by an earlier release. However, in order to retain the format attribute of the variable, an identical format with a short name would have to be applied to the variable.
Note: After you create a format or informat using a name that is longer than eight characters, if you rename it using eight or fewer characters, a release before SAS 9 cannot use the format or informat. You must recreate the format or informat using the shorter name.
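A minimal sketch of where a long format name comes into play, assuming VALIDFMTNAME=WARN is in effect. The format name agegroupfmt (11 characters) and the data set work.class_groups are hypothetical, and the exact step at which SAS writes the message is not asserted here.

```sas
options validfmtname=warn;

proc format;
   /* Hypothetical format whose name is longer than eight characters */
   value agegroupfmt
      low -< 13 = 'Child'
      13 -< 20  = 'Teen'
      20 - high = 'Adult';
run;

/* Associate the long format name with a variable in a new data set */
data work.class_groups;
   set sashelp.class;
   format Age agegroupfmt.;
run;

/* Return to the documented default */
options validfmtname=long;
```

With VALIDFMTNAME=FAIL, the same code would be expected to produce an error message instead of a warning.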
See Also For more information about SAS names, see “Names in the SAS Language” in SAS Language Reference: Concepts. For information about defining formats and informats, see “The FORMAT Procedure” in Base SAS Procedures Guide. For information about compatibility issues, see “SAS 9.1 Compatibility with SAS Files From Earlier Releases” in SAS Language Reference: Concepts.
VALIDVARNAME= System Option
Specifies the rules for valid SAS variable names that can be created and processed during a SAS session.
Default: V7
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax VALIDVARNAME=V7 | UPCASE | ANY
Syntax Description
V7
specifies that variable names must follow these rules:
• can be up to 32 characters in length.
• must begin with a letter of the Latin alphabet (A - Z, a - z) or the underscore character. Subsequent characters can be letters of the Latin alphabet, numerals, or underscores.
• cannot contain blanks.
• cannot contain special characters except for the underscore.
• can contain mixed-case letters. SAS stores and writes the variable name in the same case that is used in the first reference to the variable. However, when SAS processes a variable name, SAS internally converts it to uppercase. You cannot, therefore, use the same variable name with a different combination of uppercase and lowercase letters to represent different variables. For example, cat, Cat, and CAT all represent the same variable.
• cannot be assigned the names of special SAS automatic variables (such as _N_ and _ERROR_) or variable list names (such as _NUMERIC_, _CHARACTER_, and _ALL_).
UPCASE
specifies that the variable name follows the same rules as V7, except that the variable name is uppercase, as in earlier versions of SAS.
ANY
specifies that SAS variable names must follow these rules:
• can be up to 32 characters in length
• can be special and multi-byte characters not to exceed 32 bytes.
• cannot contain any null bytes
• leading blanks are preserved, but trailing blanks are ignored
• name must contain at least one character. An all blank name is not permitted.
• can begin with or contain any characters, including blanks
Note: If you use any characters other than the ones that are valid when the VALIDVARNAME system option is set to V7 (letters of the Latin alphabet, numerals, or underscores), then you must express the variable name as a name literal and you must set VALIDVARNAME=ANY. See “SAS Name Literals” and “Avoiding Errors When Using Name Literals” in SAS Language Reference: Concepts. If you use either the percent sign (%) or the ampersand (&), then you must use single quotation marks in the name literal in order to avoid interaction with the SAS Macro Facility.
• can contain mixed-case letters. SAS stores and writes the variable name in the same case that is used in the first reference to the variable. However, when SAS processes a variable name, SAS internally converts it to uppercase. You cannot, therefore, use the same variable name with a different combination of uppercase and lowercase letters to represent different variables. For example, cat, Cat, and CAT all represent the same variable.
Warning: The intent of the VALIDVARNAME=ANY option is to enable compatibility with other DBMS variable (column) naming conventions, such as allowing embedded blanks and national characters. Throughout SAS, using the name literal syntax with variable names that exceed the 32-byte limit or have excessive embedded quotation marks might cause unexpected results.
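As an illustration of the name literal requirement under VALIDVARNAME=ANY, the following sketch creates two variables whose names are not valid under V7. The data set work.sales and both variable names are hypothetical.

```sas
options validvarname=any;

data work.sales;
   'Total Sales'n = 1250;   /* embedded blank, so a name literal is required */
   'Growth %'n    = 4.2;    /* single quotation marks avoid macro interaction */
run;

proc print data=work.sales;
   var 'Total Sales'n 'Growth %'n;
run;

/* Return to the documented default */
options validvarname=v7;
```

After the option is reset to V7, references to these variables by name literal would no longer be expected to work, so any downstream step that needs them should run while VALIDVARNAME=ANY is still in effect.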
See Also “Rules for Words and Names in the SAS Language” in SAS Language Reference: Concepts
VARLENCHK= System Option
Specifies the type of message to write to the SAS log when the input data set is read using the SET, MERGE, UPDATE, or MODIFY statements.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files
PROC OPTIONS GROUP= SASFILES
Syntax VARLENCHK=NOWARN | WARN | ERROR
Syntax Description
NOWARN
specifies that no warning message is issued when the length of a variable that is being read is larger than the length that is defined for the variable.
WARN
specifies that a warning is issued when the length of a variable that is being read is larger than the length that is defined for the variable. This is the default.
ERROR
specifies that an error message is issued when the length of a variable that is being read is larger than the length that is defined for the variable.
Details After a variable is defined, the length of a variable can be changed only by a LENGTH statement. If a variable is read by the SET, MERGE, UPDATE, or MODIFY statements and the length of the variable is longer than a variable of the same name, SAS issues a warning message and uses the shorter, original length of the variable. Because SAS uses the shorter length, data might be truncated.
When you intentionally truncate data, perhaps to remove unnecessary blanks from character variables, SAS issues a warning message that might not be useful to you. To prevent SAS from issuing the warning message or setting a nonzero return code, set the VARLENCHK= system option to NOWARN. When VARLENCHK=NOWARN, SAS does not issue a warning message and sets the return code to SYSRC=0.
Alternatively, if you set VARLENCHK=ERROR and the length of a variable that is being read is larger than the length that is defined for the variable, SAS issues an error and sets the return code SYSRC=8.
Examples
Example 1: SAS Issues a Warning Message Merging Two Data Sets with Different Variable Lengths
This example merges two data sets, the sashelp.class data set and the exam_schedule data set. The length of the variable Name is set to 8 by the first SET statement, set sashelp.class;. The exam_schedule data set sets the length of Name to 10. When exam_schedule is read in the second SET statement, set exam_schedule key=Name;, SAS issues a warning message because the length of Name in the exam_schedule data set is longer than the length of Name in the sashelp.class data set, and data might have been truncated.

/* Create the exam_schedule data set. */
data exam_schedule(index=(Name));
   input Name $10. +1 Exam_Date mmddyy10.;
   format Exam_Date mmddyy10.;
   datalines;
Carol      06/09/2008
Hui        06/09/2008
Janet      06/09/2008
Geoffrey   06/09/2008
John       06/09/2008
Joyce      06/09/2008
Helga      06/09/2008
Mary       06/09/2008
Roberto    06/09/2008
Ronald     06/09/2008
Barbara    06/10/2008
Louise     06/10/2008
Alfred     06/11/2008
Alice      06/11/2008
Henri      06/11/2008
James      06/11/2008
Philip     06/11/2008
Tomas      06/11/2008
William    06/11/2008
;

/* Merge the data sets sashelp.class and exam_schedule */
data exams;
   set sashelp.class;
   set exam_schedule key=Name;
run;

The following SAS log shows the warning message:
Output 7.11 The Warning Message in the SAS Log

1    data exam_schedule(index=(Name));
2       input Name $10. +1 Exam_Date mmddyy10.;
3       format Exam_Date mmddyy10.;
4       datalines;

NOTE: The data set WORK.EXAM_SCHEDULE has 20 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           4.32 seconds
      cpu time            0.24 seconds

25   ;
26
27   data exams;
28      set sashelp.class;
29      set exam_schedule key=Name;
30   run;

WARNING: Multiple lengths were specified for the variable Name by input data set(s). This may cause truncation of data.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.EXAMS has 19 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.51 seconds
      cpu time            0.00 seconds
Example 2: Turn Off the Warning Message and Use the LENGTH Statement to Match Variable Lengths
In order to merge the two data sets, sashelp.class and exam_schedule, you can examine the values of Name in exam_schedule. You can see that there are no values that are longer than 8 characters and that you can change the length of Name without losing data. To change the length of the variable Name, you use a LENGTH statement in a DATA step before the set exam_schedule; statement. If the value of VARLENCHK is WARN (the default), SAS issues the warning message that the value of Name is truncated when it is read from work.exam_schedule. Because you know that data is not lost, you might want to turn the warning message off:

options varlenchk=nowarn;
data exam_schedule(index=(Name));
   length Name $ 8;
   set exam_schedule;
run;
The following is the SAS log output:

37   options varlenchk=nowarn;
38   options varlenchk=nowarn;
39   data exam_schedule(index=(Name));
40      length Name $ 8;
41      set exam_schedule;
42   run;

NOTE: There were 20 observations read from the data set WORK.EXAM_SCHEDULE.
NOTE: The data set WORK.EXAM_SCHEDULE has 20 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
See Also Looking at Sources of Common Problems in the “Combining SAS Data Sets: Basic Concepts” section of SAS Language Reference: Concepts
VIEWMENU System Option
Specifies whether the View menu is included in SAS windows.
Default: VIEWMENU
Valid in: configuration file, SAS invocation
Category: Environment control: Display
PROC OPTIONS GROUP= ENVDISPLAY
Syntax VIEWMENU | NOVIEWMENU
Syntax Description
VIEWMENU
specifies that the View menu is included in SAS windows.
NOVIEWMENU
specifies that the View menu is not included in SAS windows.
VNFERR System Option
Specifies whether SAS issues an error or warning when a BY variable exists in one data set but not another data set when processing the SET, MERGE, UPDATE, or MODIFY statements.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Error handling
PROC OPTIONS GROUP= ERRORHANDLING
Syntax VNFERR | NOVNFERR
Syntax Description
VNFERR
specifies that SAS issue an error when a BY variable exists in one data set but not in another data set when processing the SET, MERGE, UPDATE, or MODIFY statements. When the error occurs, SAS enters into syntax-check mode.
NOVNFERR
specifies that SAS issue a warning when a BY variable exists in one data set but not in another data set when processing the SET, MERGE, UPDATE, or MODIFY statements. When the warning occurs, SAS does not enter into syntax-check mode.
Details Operating Environment Information: Under z/OS, SAS also issues an error or a warning when the data set specified by DDNAME points to a DUMMY library.
Comparisons
• VNFERR is similar to the BYERR system option, which issues an error and enters into syntax-check mode if the SORT procedure attempts to sort a _NULL_ data set.
• VNFERR is similar to the DSNFERR system option, which issues an error when a SAS data set is not found.
See Also System Options: “BYERR System Option” on page 1799 “DSNFERR System Option” on page 1836 Syntax Check Mode in SAS Language Reference: Concepts
WORK= System Option
Specifies the WORK data library.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
See: WORK= System Option in the documentation for your operating environment.
Syntax WORK=library-specification
Syntax Description
library-specification
specifies the libref or physical name of the storage space where all data sets with one-level names are stored. This library must exist.
Operating Environment Information: A valid library specification and its syntax are specific to your operating environment. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
Details This library is deleted at the end of your SAS session by default. To prevent the files from being deleted, specify the NOWORKTERM system option.
See Also System Option: “WORKTERM System Option” on page 1997
WORKINIT System Option
Specifies whether to initialize the WORK library at SAS invocation.
Valid in: configuration file, SAS invocation
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
Syntax WORKINIT | NOWORKINIT
Syntax Description
WORKINIT
erases files that exist from a previous SAS session in an existing WORK library at SAS invocation.
NOWORKINIT
does not erase files from the WORK library at SAS invocation.
Comparisons The WORKINIT system option initializes the WORK data library and erases all files from a previous SAS session at SAS invocation. The WORKTERM system option controls whether SAS erases WORK files at the end of a SAS session.
See Also System Option: “WORKTERM System Option” on page 1997
Operating Environment Information: WORKINIT has behavior and functions specific to the UNIX operating environment. For details, see the SAS documentation for the UNIX operating environment.
WORKTERM System Option
Specifies whether to erase the WORK files when SAS terminates.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Environment control: Files
PROC OPTIONS GROUP= ENVFILES
Syntax WORKTERM | NOWORKTERM
Syntax Description
WORKTERM
erases the WORK files at the termination of a SAS session.
NOWORKTERM
does not erase the WORK files.
Details Although NOWORKTERM prevents the WORK data sets from being deleted, it has no effect on initialization of the WORK library by SAS. SAS normally initializes the WORK library at the start of each session, which effectively destroys any pre-existing information.
Comparisons Use the NOWORKINIT system option to prevent SAS from erasing existing WORK files on invocation. Use the NOWORKTERM system option to prevent SAS from erasing existing WORK files on termination.
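A short sketch of how the WORK-related options might be inspected and set from within a session. It uses the GETOPTION function and PROC OPTIONS with the ENVFILES group named above. The wording of the %PUT message is arbitrary.

```sas
/* Show where the WORK library is located in this session */
%put WORK library location: %sysfunc(getoption(work));

/* Keep the WORK files when this session ends */
options noworkterm;

/* List the current settings of the Environment control: Files options */
proc options group=envfiles;
run;
```

NOWORKINIT is valid only in a configuration file or at SAS invocation, so it cannot be set this way. To reuse the retained files in a later session, it would have to be specified when that session starts, using the syntax for your operating environment.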
See Also System Option: “WORKINIT System Option” on page 1996
YEARCUTOFF= System Option
Specifies the first year of a 100-year span that is used by date informats and functions to read a two-digit year.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Input control: Data Processing
PROC OPTIONS GROUP= INPUTCONTROL
Syntax YEARCUTOFF= nnnn | nnnnn
Syntax Description
nnnn | nnnnn
specifies the first year of the 100-year span.
Range: 1582–19900
Default: 1920
Details The YEARCUTOFF= value is the default that is used by various date and datetime informats and functions. If the default value of nnnn (1920) is in effect, the 100-year span begins with 1920 and ends with 2019. Therefore, any informat or function that uses a two-digit year value that ranges from 20 to 99 assumes a prefix of 19. For example, the value 92 refers to the year 1992. The value that you specify in YEARCUTOFF= can result in a range of years that span two centuries. For example, if you specify YEARCUTOFF=1950, any two-digit value between 50 and 99 inclusive refers to the first half of the 100-year span, which is in the 1900s. Any two-digit value between 00 and 49, inclusive, refers to the second half of the 100-year span, which is in the 2000s. The following figure illustrates the relationship between the 100-year span and the two centuries if YEARCUTOFF=1950.
Figure 7.1 A 100–Year Span with Values in Two Centuries
100-year span: 1950–1999 (in the 1900s), 2000–2049 (in the 2000s)
Note: YEARCUTOFF= has no effect on existing SAS dates or dates that are read from input data that include a four-digit year, except years with leading zeros. For example, 0076 with yearcutoff=1990 indicates 2076.
Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For more information, see the SAS documentation for your operating environment.
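The YEARCUTOFF=1950 case described above can be checked with a small DATA step. This is a sketch, and the variable names are hypothetical.

```sas
options yearcutoff=1950;

data _null_;
   /* Two-digit years are resolved against the span 1950-2049 */
   year_49 = year(input('01/01/49', mmddyy8.));
   year_50 = year(input('01/01/50', mmddyy8.));
   /* A four-digit year is not affected by YEARCUTOFF= */
   year_4d = year(input('01/01/1949', mmddyy10.));
   put year_49= year_50= year_4d=;
   /* writes year_49=2049 year_50=1950 year_4d=1949 */
run;

/* Return to the documented default */
options yearcutoff=1920;
```

The two-digit value 49 falls in the second half of the span and 50 in the first half, which matches the figure.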
See Also “Year 2000” in SAS Language Reference: Concepts.
SAS System Options Documented in Other SAS Publications
In addition to system options documented in SAS Language Reference: Dictionary, system options are also documented in the following publications:
Encryption in SAS
System Option
Description
NETENCRYPT=
Specifies whether client/server data transfers are encrypted.
NETENCRYPTALGORITHM=
Specifies one or more algorithms to be used for encrypted client/server data transfers.
NETENCRYPTKEYLEN=
Specifies the key length to use for encrypted client/server data transfers.
SSLCALISTLOC=
Specifies the location of digital certificates for trusted certification authorities (CA).
SSLCERTISS=
Specifies the name of the issuer of the digital certificate that SSL should use.
SSLCERTLOC=
Specifies the location of the digital certificate that is used for authentication.
SSLCERTSERIAL=
Specifies the serial number of the digital certificate that SSL should use.
SSLCERTSUBJ=
Specifies the subject name of the digital certificate that SSL should use.
SSLCLIENTAUTH=
Specifies whether a server should perform client authentication.
SSLCRLCHECK=
Specifies whether a Certificate Revocation List (CRL) is checked when a digital certificate is validated.
SSLCRLLOC=
Specifies the location of a Certificate Revocation List (CRL).
SSLPVTKEYLOC=
Specifies the location of the private key that corresponds to the digital certificate.
SSLPVTKEYPASS=
Specifies the password that SSL requires for decrypting the private key.
Grid Computing in SAS
System Option
Description
CONNECTMETACONNECTION
Specifies whether a SAS/CONNECT server is authorized to access a SAS Metadata Server at server sign-on.
IPADDRESS
Specifies whether the grid node sends its IP address to the client session during sign-on to the grid.
SSPI
Enables a SAS session that runs on a grid node to access the SAS Metadata Server using credentials that are supplied by Windows SSPI (Security Support Provider Interface).
For more information, see Grid Computing in SAS 9.2 on http://support.sas.com.
SAS Interface to Application Response Measurement (ARM): Reference
System Option
Description
ARMAGENT=
Specifies another vendor’s ARM agent, which is an executable module that contains a vendor’s implementation of the ARM API.
ARMLOC=
Specifies the location of the ARM log.
ARMSUBSYS=
Specifies whether to enable or disable the ARM subsystems that determine the internal SAS processing transactions to be logged.
SAS Companion for Windows
The system options listed here are documented only in SAS Companion for Windows. Other system options in SAS Companion for Windows contain information specific to the Windows operating environment, where the main documentation is in SAS Language Reference: Dictionary. These latter system options are not listed here.
System Option
Description
ACCESSIBILITY
Enables the accessibility features on the Customize Tools dialog box.
ALTLOG
Specifies a destination for a copy of the SAS log.
ALTPRINT
Specifies the destination for the copies of the output files from SAS procedures.
AUTHSERVER
Specifies the authentication domain server to search for secure server logins.
AUTOEXEC
Specifies the SAS autoexec file.
AWSCONTROL
Specifies whether the main SAS window includes a title bar, a system/control menu, and minimize/maximize buttons.
AWSDEF
Specifies the location and dimensions of the main SAS window when SAS initializes.
AWSMENU
Specifies whether to display the menu bar in the main SAS window.
AWSMENUMERGE
Specifies whether to embed menu items that are specific to Windows in the main menus.
AWSTITLE
Replaces the default text in the main SAS title bar.
COMDEF
Specifies the location where the SAS Command window is displayed.
CONFIG
Specifies the configuration file that is used when initializing or overriding the values of SAS system options.
ECHO
Specifies a message to be echoed to the SAS log while initializing SAS.
EMAILDLG
Specifies whether to use the native e-mail dialog box provided by your e-mail application or the e-mail dialog box provided by SAS.
EMAILSYS
Specifies the e-mail protocol to use for sending electronic mail.
ENHANCEDEDITOR
Specifies whether to enable the Enhanced Editor during SAS invocation.
FILTERLIST
Specifies an alternative set of file filter specifications to use for the Open and Save As dialog boxes.
FONT
Specifies a font to use for SAS windows.
FONTALIAS
Assigns a Windows font to one of the SAS fonts.
FULLSTIMER
Specifies whether to write all available system performance statistics to the SAS log.
HELPINDEX
Specifies one or more index files for the SAS Help and Documentation.
HELPLOC
Specifies the location of Help files that are used to view SAS Help and Documentation using Microsoft HTML Help.
HELPREGISTER
Registers help files to access from the main SAS window Help menu.
HELPTOC
Specifies the table of contents files for the SAS Help and Documentation.
HOSTPPRINT
Specifies that the Windows Print Manager is to be used for printing.
ICON
Minimizes the SAS window.
JREOPTIONS
Identifies Java Runtime Environment (JRE) options for SAS.
LOADMEMSIZE
Specifies a suggested amount of memory needed for executable programs loaded by SAS.
LOG
Specifies a destination for a copy of the SAS log when running in batch mode.
MAXMEMQUERY
Specifies the limit on the maximum amount of memory that is allocated for procedures.
MEMBLKSZ
Specifies the memory block size for memory-based libraries for Windows operating environments.
MEMCACHE
Specifies to use the memory-based libraries as a SAS file cache.
MEMLIB
Specifies to process the Work library as a memory-based library.
MEMMAXSZ
Specifies the maximum amount of memory to allocate for using memory-based libraries in Windows operating environments.
MEMSIZE
Specifies the limit on the amount of virtual memory that can be used during a SAS session.
MSG
Specifies the library that contains the SAS error messages.
MSGCASE
Specifies whether notes, warnings, and error messages that are generated by SAS are displayed in uppercase characters.
NUMKEYS
Controls the number of available function keys.
NUMMOUSEKEYS
Specifies the number of mouse buttons SAS displays in the KEYS window.
PATH
Specifies one or more search paths for SAS executable files.
PFKEY
Specifies which set of function keys to designate as the primary set of function keys.
PRINT
Specifies a destination for SAS output when running in batch mode.
PRNGETLIST
Specifies if printers attached to the system are recognized.
PRTABORTDLGS
Specifies when to display the Print Abort dialog box.
PRTPERSISTDEFAULT
Specifies to use the same destination printer from SAS session to SAS session.
PRTSETFORMS
Specifies whether to include the Use Forms check box in the Print Setup dialog box.
REALMEMSIZE
Specifies the amount of virtual memory SAS can expect to allocate.
REGISTER
Adds an application to the Tools menu in the main SAS window.
RESOURCESLOC
Specifies a directory location of the files that contain SAS resources.
RTRACE
Produces a list of resources that are read or loaded during a SAS session.
RTRACELOC
Specifies the pathname of the file to which the list of resources that are read or loaded during a SAS session is written.
SASCONTROL
Specifies whether the SAS application windows include system/ control menus and minimize/maximize buttons.
SASINITIALFOLDER
Changes the working folder and the default folders for the Open and Save As dialog boxes to the specified folder after SAS initialization is complete.
SCROLLBARFLASH
Specifies whether to allow the mouse or keyboard to focus on a scroll bar.
SET
Defines a SAS environment variable.
SGIO
Activates the Scatter/Gather I/O feature.
SLEEPWINDOW
Enables or disables the SLEEP window.
SORTANOM
Specifies certain options for the SyncSort utility.
SORTCUT
Specifies the number of observations above which SyncSort is used instead of the SAS sort program.
SORTCUTP
Specifies the number of bytes above which SyncSort is used instead of the SAS sort program.
SORTDEV
Specifies the pathname used for temporary files created by the SyncSort utility.
SORTPARM
Specifies parameters for the SyncSort utility.
SORTPGM
Specifies the sort utility that is used in the SORT procedure.
SPLASH
Specifies whether to display the splash screen (logo screen) when SAS starts.
SPLASHLOC
Specifies the location of the splash screen bitmap that appears when SAS starts.
STIMEFMT
Specifies the format to use for displaying the time on STIMER output.
STIMER
Writes a subset of system performance statistics to the SAS log.
SYSGUIFONT
Specifies a font to use for the button text and the descriptive text.
SYSPRINT
Specifies a destination printer for printing SAS output.
SYSIN
Specifies a batch mode source file.
TOOLDEF
Specifies the Toolbox display location.
UPRINTMENUSWITCH
Enables the universal print commands in the File menu.
USERICON
Specifies the pathname of the resource file associated with your user-defined icon.
VERBOSE
Controls whether SAS writes the settings of all the system options specified in the configuration file to either the terminal or the batch log.
WEBUI
Specifies to enable Web enhancements.
WINDOWSMENU
Specifies to include or suppress the Window menu in windows that display menus.
XCMD
Specifies that the X command is valid in the current SAS session.
XMIN
Specifies to open the application specified in the X command in a minimized state or in the default active state.
XSYNC
Controls whether an X command or statement executes synchronously or asynchronously.
XWAIT
Specifies whether you have to type EXIT at the DOS prompt before the DOS shell closes.
SAS Companion for OpenVMS on HP Integrity Servers
The system options listed here are documented only in SAS Companion for OpenVMS on HP Integrity Servers. Other system options in SAS Companion for OpenVMS on HP Integrity Servers contain information specific to the OpenVMS operating environment, where the main documentation is in SAS Language Reference: Dictionary. These latter system options are not listed here.
System Option
Description
ALTMULT
Specifies the number of pages that are preallocated to a file.
ALTLOG
Specifies a destination for a copy of the SAS log.
ALTPRINT
Specifies the destination for the copies of the output files from SAS procedures.
APPLETLOC
Specifies the location of Java applets.
AUTOEXEC
Specifies the SAS autoexec file.
CACHENUM
Specifies the number of caches used per SAS file.
CACHESIZE
Specifies the size of cache that is used for each open SAS file.
CC
Tells SAS what type of carriage control to use when it writes to external files.
CONFIG
Specifies the configuration file that is used when initializing or overriding the values of SAS system options.
DEQMULT
Specifies the number of pages to extend a file.
DETACH
Specifies that the asynchronous host command uses a detached process.
DUMP
Specifies when to create a process dump file.
EDITCMD
Specifies the host editor to be used with the HOSTEDIT command.
EMAILSYS
Specifies the e-mail protocol to use for sending electronic mail.
EXPANDLNM
Specifies whether concealed logical names are expanded when libref paths are displayed to the user.
FILECC
Specifies how to treat data in column 1 of a print file.
FULLSTIMER
Specifies whether to write all available system performance statistics to the SAS log.
GSFCC
Tells SAS what type of carriage control to use for writing to graphics stream files.
HELPHOST
Specifies the name of the local computer where the remote browsing system is to be displayed.
HELPINDEX
Specifies one or more index files for the SAS Help and Documentation.
HELPLOC
Specifies the location of the text and index files for the facility that is used to view SAS Help and Documentation.
HELPTOC
Specifies the table of contents files for the SAS Help and Documentation.
JREOPTIONS
Identifies the Java Runtime Environment (JRE) options for SAS.
LOADLIST
Specifies whether to print to the specified file the information about images that SAS has loaded into memory.
LOG
Specifies a destination for a copy of the SAS log when running in batch mode.
LOGMULTREAD
Specifies the session log file to be opened for shared read access.
MEMSIZE
Specifies the limit on the total amount of memory that can be used by a SAS session.
MSG
Specifies the library that contains SAS error messages.
MSGCASE
Specifies whether notes, warnings, and error messages that are generated by SAS are displayed in uppercase characters.
OPLIST
Specifies whether the settings of the SAS system options are written to the SAS log.
PRINT
Specifies a destination for SAS output when running in batch mode.
REALMEMSIZE
Specifies the amount of real memory SAS can expect to allocate.
SORTPGM
Specifies whether SAS sorts using the SAS sort utility or the host sort utility.
SORTWORK
Defines locations for host sort work files.
SPAWN
Specifies that SAS is invoked in a SPAWN/NOWAIT subprocess.
STIMEFMT
Specifies the format that is used to display time on STIMER output.
STIMER
Specifies whether to write a subset of system performance statistics to the SAS log.
SYSIN
Specifies the default location of SAS source programs.
SYSPRINT
Specifies the destination for printed output.
TERMIO
Specifies whether terminal I/O is blocking or non-blocking.
USER
Specifies the default permanent SAS library.
VERBOSE
Specifies whether SAS writes the system options that are set when SAS starts to the VMS computer in the SAS windowing environment or, in batch, to the batch log.
WORKCACHE
Specifies the size of the I/O data cache allocated for a file in the WORK library.
XCMD
Specifies whether the X command is valid in the SAS session.
XCMDWIN
Specifies whether to create a DECTERM window for X command output when in the SAS windowing environment.
XKEYPAD
Specifies that subprocesses use the keypad settings that were in effect before you invoked SAS.
XLOG
Specifies whether to display the output from the X command in the SAS log file.
XLOGICAL
Specifies that process-level logical names are passed to the subprocess that is spawned by an X statement or X command.
XOUTPUT
Specifies whether to display the output from the X command.
XRESOURCES
Specifies a character string of X resource options or the application instance name for the SAS interface to Motif.
XSYMBOL
Specifies that global symbols are passed to the subprocess that is spawned by an X statement or X command.
XTIMEOUT
Specifies how long a subprocess that has been spawned by an X statement or X command remains inactive before being deleted.
SAS Companion for UNIX Environments
The system options listed here are documented only in SAS Companion for UNIX Environments. Other system options in SAS Companion for UNIX Environments contain information specific to the UNIX operating environment, where the main documentation is in SAS Language Reference: Dictionary. These latter system options are not listed here.
System Option
Description
ALTLOG
Specifies a destination for a copy of the SAS log.
ALTPRINT
Specifies the destination for the copies of the output files from SAS procedures.
AUTOEXEC
Specifies the SAS autoexec file.
CONFIG
Specifies the configuration file that is used when initializing or overriding the values of SAS system options.
ECHO
Specifies a message to be echoed to the computer.
EDITCMD
Specifies the host editor to be used with the HOSTEDIT command.
EMAILSYS
Specifies the e-mail protocol to use for sending electronic mail.
FILELOCKS
Specifies whether external file locking is turned on or off and what action should be taken if a file cannot be locked.
FILELOCKWAITMAX
Sets an upper limit on the time SAS will wait for a locked file.
FULLSTIMER
Specifies whether to write all available system performance statistics and the datetime stamp to the SAS log.
HELPINDEX
Specifies one or more index files for the SAS Help and Documentation.
HELPLOC
Specifies the location of the text and index files for the facility that is used to view SAS Help and Documentation.
HELPTOC
Specifies the location of the table of contents files for the SAS Help and Documentation.
JREOPTIONS
Identifies the Java Runtime Environment (JRE) options for SAS.
LOG
Specifies a destination for a copy of the SAS log when running in batch mode.
LPTYPE
Specifies which UNIX command and options settings will be used to route files to the printer.
MAXMEMQUERY
Specifies the maximum amount of memory that is allocated per request for certain procedures.
MEMSIZE
Specifies the limit on the total amount of virtual memory that can be used by a SAS session.
MSG
Specifies the library that contains the SAS error messages.
MSGCASE
Specifies whether notes, warnings, and error messages that are generated by SAS are displayed in uppercase characters.
OPLIST
Specifies whether the settings of the SAS system options are written to the SAS log.
PATH
Specifies one or more search paths for SAS executable files.
PRINT
Specifies a destination for SAS output when running in batch mode.
PRINTCMD
Specifies the print command SAS is to use.
REALMEMSIZE
Specifies the amount of real (physical) memory SAS can expect to allocate.
RTRACE
Produces a list of resources that are read or loaded during a SAS session.
RTRACELOC
Specifies the pathname of the file to which the list of resources that are read or loaded during a SAS session is written.
SASSCRIPT
Specifies one or more storage locations of SAS/CONNECT script files.
SET
Defines an environment variable.
SORTANOM
Specifies certain options for the host sort utility.
SORTCUT
Specifies the number of observations that SAS sorts; if the number of observations in the data set is greater than the specified number, the host sort program sorts the remaining observations.
SORTCUTP
Specifies the number of bytes that SAS sorts; if the number of bytes in the data set is greater than the specified number, the host sort program sorts the remaining data set.
SORTDEV
Specifies the pathname used for temporary files created by the host sort utility.
SORTNAME
Specifies the name of the host sort utility.
SORTPARM
Specifies parameters for the host sort utility.
SORTPGM
Specifies whether SAS sorts using the SAS sort utility or the host sort utility.
STDIO
Specifies whether SAS should use stdin, stdout, and stderr.
STIMEFMT
Specifies the format that is used to display the time on FULLSTIMER and STIMER output.
STIMER
Specifies whether to write a subset of system performance statistics to the SAS log.
SYSIN
Specifies the default location of SAS source code when running in batch mode.
SYSPRINT
Specifies the destination for printed output.
VERBOSE
Specifies whether SAS writes the system option settings to the SAS log.
WORKPERMS
Sets the permissions of the SAS Work library when it is initially created.
XCMD
Specifies whether the X command is valid in the SAS session.
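Several of the options in this table, such as FULLSTIMER and STIMER, can also be changed during a session with the OPTIONS statement. The following is a minimal sketch, assuming the standard SASHELP.CLASS sample data set is available; the output data set name WORK.SORTED is only illustrative.

```sas
/* Write all available performance statistics to the SAS log. */
options fullstimer;

/* Any step will do; the sort simply gives the log something to report on. */
proc sort data=sashelp.class out=work.sorted;
  by age;
run;

/* Return to the shorter STIMER subset of statistics. */
options nofullstimer stimer;
```

Options that name files or control start-up behavior, such as CONFIG and AUTOEXEC, take effect only when SAS is invoked and are not set this way.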
SAS Companion for z/OS
The system options listed here are documented only in SAS Companion for z/OS. Other system options in SAS Companion for z/OS contain information specific to the z/OS operating environment, where the main documentation is in SAS Language Reference: Dictionary. These latter system options are not listed here.
System Option
Description
ALTLOG=
Specifies a destination for a copy of the SAS log.
ALTPRINT=
Specifies the destination for the copies of the output files from SAS procedures.
APPEND=
Appends the specified value to the existing value of the specified system option.
AUTOEXEC=
Specifies the SAS autoexec file.
BLKALLOC
Causes SAS to set LRECL and BLKSIZE values for a SAS library when it is allocated rather than when it is first accessed.
BLKSIZE=
Specifies the default block size for SAS libraries.
BLKSIZE(device-type)=
Specifies the default block size for SAS libraries by device-type.
CAPSOUT
Specifies that all output is to be converted to uppercase.
CHARTYPE=
Specifies a character set or screen size to use for a device.
CLIST
Specifies that SAS obtains its input from a CLIST.
CONFIG=
Specifies the configuration file that is used when initializing or overriding the values of SAS system options.
DLDISPCHG
Controls changes in allocation disposition for an existing library data set.
DLDSNTYPE
Specifies the default value of the DSNTYPE libname option.
DLEXCPCOUNT
Reports number of EXCPs to direct access bound SAS libraries.
DLHFSDIRCREATE
Creates an HFS directory for a SAS library that is specified with LIBNAME if the library does not exist.
DLMSGLEVEL=
Specifies the level of messages to generate for SAS libraries.
DLSEQDSNTYPE
Specifies the default value of the DSNTYPE libname option for sequential-format disk files.
DLTRUNCHK
Enables checking for SAS library truncation.
DLRESV
Requests exclusive use of shared disk volumes when accessing partitioned data sets on shared disk volumes.
DYNALLOC
Controls whether SAS or the host sort utility allocates sort work data sets.
ECHO=
Specifies a message to be echoed to the SAS log while initializing SAS.
EMAILSYS=
Specifies the e-mail protocol to use for sending electronic mail.
FILEAUTHDEFER
Controls whether SAS performs file authorization checking for z/OS data sets or defers authorization checking to z/OS system services such as OPEN.
FILEBLKSIZE(devicetype)=
Specifies the default maximum block size for external files.
FILECC
Specifies whether to treat data in column 1 of a printer file as carriage-control data when reading the file.
FILEDEST=
Specifies the default printer destination.
FILEDEV=
Specifies the device name used for allocating new physical files.
FILEDIRBLK=
Specifies the number of default directory blocks to allocate for new partitioned data sets.
FILEEXT=
Specifies how to handle file extensions when accessing members of partitioned data sets.
FILEFORMS=
Specifies the default SYSOUT form for a print file.
FILELBI
Controls the use of the z/OS Large Block Interface support for BSAM and QSAM files, as well as files on tapes that have standard labels.
FILELOCKS=
Specifies the default SAS system file locking that is to be used for external files (both USS and native MVS). Also specifies the operating system file locking to be used for USS files (both SAS files and external files).
FILEMOUNT
Specifies whether an off-line volume is to be mounted.
FILEMSGS
Controls whether you receive expanded dynamic allocation error messages when you are assigning a physical file.
FILENULL
Specifies whether zero-length records are written to external files.
FILEPROMPT
Controls whether you are prompted if you reference a data set that does not exist.
FILEREUSE
Specifies whether to reuse an existing allocation for a file that is being allocated to a temporary ddname.
FILESEQDSNTYPE
Specifies the default value that is assigned to DSNTYPE when it is not specified with a filename statement, a DD statement, or a TSO ALLOC command.
FILESPPRI=
Specifies the default primary space allocation for new physical files.
FILESPSEC=
Specifies the default secondary space allocation for new physical files.
FILESTAT
Specifies whether ISPF statistics are written.
FILESYSOUT=
Specifies the default SYSOUT CLASS for a printer file.
FILESYSTEM=
Specifies the default file system used when the filename is ambiguous.
FILEUNIT=
Specifies the default unit of allocation for new physical files.
FILEVOL=
Specifies which VOLSER to use for new physical files.
FILSZ
Specifies that the host sort utility supports the FILSZ parameter.
FSBCOLOR
Specifies whether you can set background colors in SAS windows on vector graphics devices.
FSBORDER=
Specifies what type of symbols are to be used in borders.
FSDEVICE=
Specifies the full-screen device driver for your terminal.
FSMODE=
Specifies the full-screen data stream type.
FULLSTATS
Specifies whether to write all available system performance statistics to the SAS log.
GHFONT=
Specifies the default graphics hardware font.
HELPCASE
Controls how text is displayed in the help browser.
HELPHOST
Specifies the name of the computer where the remote help browser is running.
HELPLOC=
Specifies the location of the text and index files for the facility that is used to view SAS Help and Documentation.
HSLXTNTS=
Specifies the size of each physical hyperspace that is created for a SAS library.
HSMAXPGS=
Specifies the maximum number of hyperspace pages allowed in a SAS session.
HSMAXSPC=
Specifies the maximum number of hyperspaces allowed in a SAS session.
HSSAVE
Controls how often the DIV data set pages are updated when a DIV data set backs a hyperspace library.
HSWORK
Tells SAS to place the WORK library in a hyperspace.
INSERT
Inserts the specified value at the beginning of the specified system option.
ISPCAPS
Specifies whether to convert to uppercase printable ISPF parameters that are used in CALL ISPEXEC and CALL ISPLINK.
ISPCHARF
Specifies whether the values of SAS character variables are converted using their automatically specified informats or formats each time they are used as ISPF variables.
ISPCSR=
Tells SAS to set an ISPF variable to the name of a variable whose value is found to be invalid.
ISPEXECV=
Specifies the name of an ISPF variable that passes its value to an ISPF service.
ISPMISS=
Specifies the value assigned to SAS character variables defined to ISPF when the associated ISPF variable has a length of zero.
ISPMSG=
Tells SAS to set an ISPF variable to a message ID when a variable is found to be invalid.
ISPNOTES
Specifies whether ISPF error messages are to be written to the SAS log.
ISPNZTRC
Specifies whether nonzero ISPF service return codes are to be written to the SAS log.
ISPPPT
Specifies whether ISPF parameter value pointers and lengths are to be written to the SAS log.
ISPTRACE
Specifies whether the parameter lists and service return codes are to be written to the SAS log.
ISPVDEFA
Specifies whether all current SAS variables are to be identified to ISPF via the SAS VDEFINE user exit.
ISPVDLT
Specifies whether VDELETE is executed before each SAS variable is identified to ISPF via VDEFINE.
ISPVDTRC
Specifies whether to trace every VDEFINE for SAS variables.
ISPVIMSG=
Specifies the ISPF message ID that is to be set by the SAS VDEFINE user exit when the informat for a variable returns a nonzero return code.
ISPVRMSG=
Specifies the ISPF message ID that is to be set by the SAS VDEFINE user exit when a variable has a null value.
ISPVTMSG=
Specifies the ISPF message ID that is to be displayed by the SAS VDEFINE user exit when the ISPVTRAP option is in effect.
ISPVTNAM=
Restricts the information that is displayed by the ISPVTRAP option to the specified variable only.
ISPVTPNL=
Specifies the name of the ISPF panel that is to be displayed by the SAS VDEFINE user exit when the ISPVTRAP option is in effect.
ISPVTRAP
Specifies whether the SAS VDEFINE user exit is to write information to the SAS log (for debugging purposes) each time it is entered.
ISPVTVARS=
Specifies the prefix for the ISPF variables to be set by the SAS VDEFINE user exit when the ISPVTRAP option is in effect.
JREOPTIONS=
Identifies the Java Runtime Environment (JRE) options for SAS.
LOG=
Specifies a destination for a copy of the SAS log when running in batch mode.
MEMLEAVE=
Specifies the amount of memory in the user’s region that is reserved exclusively for the use of the operating environment.
MEMRPT
Specifies whether memory usage statistics are to be written to the SAS log for each step.
MEMSIZE=
Specifies the limit on the total amount of memory that can be used by a SAS session.
MINSTG
Tells SAS whether to minimize its use of storage.
MSG=
Specifies the library that contains the SAS error messages.
MSGCASE
Specifies whether notes, warnings, and error messages that are generated by SAS are displayed in uppercase characters.
MSGSIZE=
Specifies the size of the message cache.
OPLIST
Specifies whether the settings of the SAS system options are written to the SAS log.
PFKEY=
Specifies which set of function keys to designate as the primary set of function keys.
PGMPARM=
Specifies the parameter that is passed to the external program specified by the SYSINP= option.
PRINT=
Specifies a destination for SAS output when running in batch mode.
PROCLEAVE=
Specifies how much memory to leave unallocated for SAS procedures to use to complete critical functions during out-of-memory conditions.
REALMEMSIZE=
Specifies the amount of real memory SAS can expect to allocate.
REXXLOC=
Specifies the ddname of the REXX library to be searched when the REXXMAC option is in effect.
REXXMAC
Enables or disables the REXX interface.
SASLIB=
Specifies the ddname for an alternate load library.
SASSCRIPT
Specifies one or more storage locations of SAS/CONNECT script files.
SEQENGINE=
Specifies the default engine for sequential SAS libraries.
SET=
Defines an environment variable.
SORT=
Specifies the minimum size of all allocated sort work data sets.
SORTALTMSGF
Enables sorting with alternate message flags.
SORTBLKMODE
Enables block mode sorting.
SORTBUFMOD
Enables modification of the sort utility output buffer.
SORTCUTP=
Specifies the number of bytes that SAS sorts; if the number of observations in the data set is greater than the specified number, the host sort program sorts the remaining observations.
SORTDEV=
Specifies the unit device name if SAS dynamically allocates the sort work file.
SORTDEVWARN
Enables device type warnings.
SORTEQOP
Specifies whether the host sort utility supports the EQUALS option.
SORTLIB=
Specifies the name of the sort library.
SORTLIST
Enables passing of the LIST parameter to the host sort utility.
SORTMSG
Controls the class of messages to be written by the host sort utility.
SORTMSG=
Specifies the ddname to be dynamically allocated for the message print file of the host sort utility.
SORTNAME=
Specifies the name of the host sort utility.
SORTOPTS
Specifies whether the host sort utility supports the OPTIONS statement.
SORTPARM=
Specifies parameters for the host sort utility.
SORTPGM=
Specifies whether SAS sorts using the SAS sort utility or the host sort utility.
SORTSHRB
Specifies whether the host sort interface can modify data in buffers.
SORTSUMF
Specifies whether the host sort utility supports the SUM FIELDS=NONE control statement.
SORTUADCON
Specifies whether the host sort utility supports passing a user address constant to the E15/E35 exits.
SORTUNIT=
Specifies the unit of allocation for sort work files.
SORTWKDD=
Specifies the prefix of sort work data sets.
SORTWKNO=
Specifies how many sort work data sets to allocate.
SORT31PL
Controls what type of parameter list is used to invoke the host sort utility.
STAE
Enables or disables a system abend exit.
STATS
Specifies whether statistics are to be written to the SAS log.
STAX
Specifies whether to enable attention handling.
STIMER
Specifies whether to write a subset of system performance statistics to the SAS log.
SVC11SCREEN
Specifies whether to enable SVC 11 screening to obtain host date and time information.
SYNCHIO
Specifies whether synchronous I/O is enabled.
SYSIN=
Specifies the location of the primary SAS input data stream.
SYSINP=
Specifies the name of an external program that provides SAS input statements.
SYSLEAVE=
Specifies how much memory to leave unallocated to ensure that SAS software tasks are able to terminate successfully.
SYSPREF=
Specifies a prefix for partially qualified physical file names.
SYSPRINT=
Specifies the handling of output that is directed to the default print file.
S99NOMIG
Tells SAS whether to recall a migrated data set.
TAPECLOSE=
Specifies the default CLOSE setting for a SAS library that is on tape.
USER=
Specifies the location of the default SAS library.
V6GUIMODE
Specifies whether SAS uses Version 6 SCL selection list windows.
VERBOSE
Specifies whether SAS writes the system option settings to the SAS log or to the batch log.
WTOUSERDESC=
Specifies a WTO DATA step function descriptor code.
WTOUSERMCSF=
Specifies WTO DATA step function MCS flags.
WTOUSERROUT=
Specifies a WTO DATA step function routing code.
XCMD
Specifies whether the X command is valid in the SAS session.
SAS Data Quality Server: Reference
System Option
Description
DQLOCALE=
Specifies an ordered list of locales.
DQOPTIONS
Specifies SAS session parameters for data quality programs.
DQSETUPLOC=
Specifies the location of the SAS Data Quality Server setup file.
SAS Intelligence Platform: Application Server Administration Guide
System Option
Description
OBJECTSERVER
Specifies whether SAS is to run as an Integrated Object Model (IOM) server.
OBJECTSERVERPARMS
Specifies startup parameters for the SAS object servers.
SECPACKAGE
Identifies the security package that the IOM server uses to authenticate incoming client connections.
SECPACKAGELIST
Specifies the security authorization packages used by the server.
SSPI
Identifies support for the Security Support Provider Interface for SSO connections to IOM servers.
For more information, see the SAS Intelligence Platform documentation on http://support.sas.com.
SAS Language Interfaces to Metadata
System Option
Description
METAAUTORESOURCES=
Identifies the metadata resources that are assigned when SAS starts.
METACONNECT=
Identifies the named connection from the metadata user profiles to use as the default values for logging in to the SAS Metadata Server.
METAENCRYPTALG=
Specifies the type of encryption to use when communicating with a SAS Metadata Server.
METAENCRYPTLEVEL=
Specifies what is to be encrypted when communicating with a SAS Metadata Server.
METAPASS=
Specifies the default password for the SAS Metadata Server.
METAPORT=
Specifies the TCP port for the SAS Metadata Server.
METAPROFILE=
Identifies the file that contains the SAS Metadata Server user profiles.
METAPROTOCOL=
Specifies the network protocol for communicating with the SAS Metadata Server.
METAREPOSITORY=
Specifies the default SAS Metadata Repository to use with the SAS Metadata Server.
METASERVER=
Specifies the address of the SAS Metadata Server.
METASPN=
Specifies the service principal name (SPN) for the SAS Metadata Server.
METAUSER=
Specifies the default user ID for logging on to the SAS Metadata Server.
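The metadata connection options above are usually set together so that later metadata-aware steps do not need to repeat them. The following is a sketch only: the host name, user ID, password, and repository name are placeholders, and the port value is an assumption about a typical installation rather than something this table specifies.

```sas
/* All values below are illustrative placeholders, not real credentials. */
options metaserver="meta.example.com"   /* hypothetical server address     */
        metaport=8561                   /* assumed TCP port for the server */
        metauser="sasdemo"              /* hypothetical user ID            */
        metapass="XXXXXXXX"             /* placeholder password            */
        metarepository="Foundation";    /* assumed repository name         */
```

Storing a password in program text is rarely desirable; METAPROFILE= and METACONNECT= offer an alternative in which the connection details come from a user profile file.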
SAS Logging: Configuration and Programming Reference
System Option
Description
LOGAPPLNAME
Specifies a SAS session name for SAS logging.
LOGCONFIGLOC
Specifies the name of the configuration file that is used to initialize SAS logging.
SAS Macro Language: Reference
System Option
Description
CMDMAC
Controls command-style macro invocation.
IMPLMAC
Controls statement-style macro invocation.
MACRO
Controls whether the SAS macro language is available.
MAUTOLCDISPLAY
Specifies whether to display the source location of the autocall macros in the log when the autocall macro is invoked.
MAUTOSOURCE
Specifies whether the autocall feature is available.
MCOMPILENOTE
Issues a NOTE to the SAS log containing the size and number of instructions upon the completion of the compilation of a macro.
MCOMPILE
Specifies whether to allow new definitions of macros.
MERROR
Specifies whether the macro processor issues a warning message when a macro reference cannot be resolved.
MEXECNOTE
Specifies whether to display macro execution information in the SAS log at macro invocation.
MEXECSIZE
Specifies the maximum macro size that can be executed in memory.
MFILE
Specifies whether MPRINT output is routed to an external file.
MINDELIMITER=
Specifies the character to be used as the delimiter for the macro IN operator.
MINOPERATOR
Specifies whether the macro processor recognizes and evaluates the IN (#) logical operator.
MLOGIC
Specifies whether the macro processor traces its execution for debugging.
MLOGICNEST
Specifies whether to display the macro nesting information in the MLOGIC output in the SAS log.
MPRINT
Specifies whether SAS statements generated by macro execution are traced for debugging.
MPRINTNEST
Specifies whether to display the macro nesting information in the MPRINT output in the SAS log.
MRECALL
Specifies whether autocall libraries are searched for a member that was not found during an earlier search.
MREPLACE
Specifies whether to enable existing macros to be redefined.
MSTORED
Specifies whether the macro facility searches a specific catalog for a stored compiled macro.
MSYMTABMAX
Specifies the maximum amount of memory available to the macro variable symbol tables.
MVARSIZE
Specifies the maximum size for macro variable values that are stored in memory.
SASAUTOS
Specifies the location of one or more autocall libraries.
SASMSTORE=
Identifies the libref of a SAS library with a catalog that contains, or will contain, stored compiled SAS macros.
SERROR
Specifies whether the macro processor issues a warning message when a macro variable reference does not match a macro variable.
SYMBOLGEN
Specifies whether the results of resolving macro variable references are written to the SAS log for debugging.
SYSPARM
Specifies a character string that can be passed to SAS programs.
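The debugging options in this table (MPRINT, MLOGIC, and SYMBOLGEN) are commonly switched on together while a macro is being developed. The sketch below assumes the standard SASHELP.CLASS sample data set; the macro name SHOW_ROWS and its parameters are hypothetical and exist only to generate some trace output.

```sas
/* Trace generated statements, macro execution, and variable resolution. */
options mprint mlogic symbolgen;

/* Hypothetical macro: prints the first N observations of a data set. */
%macro show_rows(ds=, n=5);
  proc print data=&ds (obs=&n);
  run;
%mend show_rows;

%show_rows(ds=sashelp.class, n=3)

/* Switch the tracing off again once debugging is finished. */
options nomprint nomlogic nosymbolgen;
```

With the options on, the SAS log shows the PROC PRINT statements that the macro generated, the flow of macro execution, and the resolved values of &ds and &n.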
SAS National Language Support (NLS): Reference Guide
System Option
Description
BOMFILE
Specifies whether to write the Byte Order Mark (BOM) prefix on Unicode encoded external files.
DATESTYLE
Identifies the sequence of month, day, and year when the ANYDTDTM, ANYDTDTE, or ANYDTTME informats encounter input where the year, month, and day determination is ambiguous.
DBCS
Recognizes double-byte character sets.
DBCSLANG
Specifies a double-byte character set (DBCS) language.
DBCSTYPE
Specifies the encoding method to use for a double-byte character set (DBCS).
DFLANG
Specifies the language for international date informats and formats.
ENCODING
Specifies the default character-set encoding for the SAS session.
FSDBTYPE
Specifies a full-screen double-byte character set (DBCS) encoding method.
FSIMM
Specifies input method modules (IMMs) for a full-screen double-byte character set (DBCS).
FSIMMOPT
Specifies options for input method modules (IMMs) that are used with a full-screen double-byte character set (DBCS).
LOCALE
Specifies a set of attributes in a SAS session that reflect the language, local conventions, and culture for a geographical region.
LOCALELANGCHG
Determines whether the language of the ODS output text can be changed.
NLSCOMPATMODE
Provides national language compatibility with a previous release of SAS.
RSASIOTRANSERROR
Displays a transcoding error when illegal data is read from a remote application.
SORTSEQ
Specifies a language-specific collating sequence for the SORT procedure to use in the current SAS session.
TRANTAB
Specifies the translation tables that are used by various parts of SAS.
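As an illustration of how DATESTYLE interacts with the ANYDT informats, the sketch below reads the same ambiguous string under two settings. The date string and variable names are made up for the example.

```sas
/* '01/02/2009' could be 1 February or 2 January; DATESTYLE decides. */
options datestyle=dmy;

data _null_;
  d = input('01/02/2009', anydtdte10.);
  put 'With DATESTYLE=DMY: ' d date9.;
run;

options datestyle=mdy;

data _null_;
  d = input('01/02/2009', anydtdte10.);
  put 'With DATESTYLE=MDY: ' d date9.;
run;
```

Comparing the two lines written to the log shows which field the informat treated as the day and which as the month.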
SAS Scalable Performance Data Engine: Reference
System Option
Description
COMPRESS=
Specifies to compress the SPD Engine data sets on disk as they are being created.
MAXSEGRATIO=
When evaluating a WHERE expression that contains indexed variables, controls what percentage of index segments to identify as candidate segments before processing the WHERE expression.
MINPARTSIZE=
Specifies a minimum partition size to use for creating SPD Engine data sets.
SPDEINDEXSORTSIZE=
Specifies the size of memory space that the sorting utility can use when sorting values for creating an index.
SPDEMAXTHREADS=
Specifies the upper limit on the number of threads that the SPD Engine can spawn for I/O processing.
SPDESORTSIZE=
Specifies the size of memory space needed for sorting operations used by the SPD Engine.
SPDEUTILLOC=
Specifies one or more file system locations in which the SPD Engine can temporarily store utility files.
SPDEWHEVAL=
Specifies the process used to determine which observations meet the conditions of a WHERE expression.
SAS VSAM Processing for Z/OS
System Option
Description
VSAMLOAD
Enables you to load a VSAM data set.
VSAMREAD
Enables the user to read a VSAM data set.
VSAMRLS
Enables record-level sharing for a VSAM data set.
VSAMUPDATE
Enables you to update a VSAM data set.
SAS/ACCESS for Relational Databases: Reference
System Option
Description
DBIDIRECTEXEC=
Controls SQL optimization for SAS/ACCESS engines.
DBSRVTP=
Specifies whether SAS/ACCESS engines put a hold (or block) on the originating client while making performance-critical calls to the database. This option applies when SAS is invoked as a server responding to multiple clients.
DBSLICEPARM=
Controls the scope of DBMS threaded reads and the number of threads.
SASTRACE=
Generates trace information from a DBMS engine.
SASTRACELOC=
Prints SASTRACE information to a specified location.
SQLMAPPUTTO=
Specifies whether the PUT function in the SQL procedure is processed by SAS or by the SAS_PUT( ) function inside the Teradata database.
VALIDVARNAME=
Controls the type of SAS variable names that can be used or created during a SAS session.
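SASTRACE= and SASTRACELOC= are typically used as a pair when investigating what a SAS/ACCESS engine sends to the database. The following sketch assumes that a libref named MYDB has already been assigned to a DBMS through a SAS/ACCESS engine; both the libref and the table name ORDERS are hypothetical.

```sas
/* Request trace information from the engine and route it to the SAS log. */
options sastrace=',,,d' sastraceloc=saslog;

/* Any step that touches the DBMS produces trace output.
   MYDB.ORDERS is a hypothetical table behind a SAS/ACCESS libref. */
proc print data=mydb.orders (obs=5);
run;

/* Turn tracing off when finished. */
options sastrace=off;
```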
SAS/CONNECT User’s Guide
System Option
Description
AUTOSIGNON
Automatically signs on to the server when the client issues a remote submit request for server processing.
COMAMID=
Identifies the communication access method for connecting a client and a server across a network.
CONNECTPERSIST
Specifies whether a connection between a client and a server persists (continues) after the RSUBMIT has completed.
CONNECTREMOTE=
Identifies the server session that a SAS/CONNECT client connects to.
CONNECTSTATUS
Specifies the default setting for the display of the Transfer Status window.
CONNECTWAIT
Specifies whether remote submits are executed synchronously or asynchronously.
DMR
Specifies to invoke a server session.
SASCMD=
Specifies the command that starts a server session on a multi-processor (SMP) machine.
SASFRSCR
Is a read-only option that contains the fileref that is generated by the SASSCRIPT= option.
SASSCRIPT=
Specifies one or more storage locations for SAS/CONNECT script files.
SIGNONWAIT
Specifies whether a SAS/CONNECT SIGNON should be executed asynchronously or synchronously.
SYSRPUTSYNC
Sets %SYSRPUT macro variables in the client session when the %SYSRPUT statements are executed rather than when a synchronization point is encountered.
TBUFSIZE=
Specifies the size of the buffer that is used by the SAS application layer for transferring data between a client and a server across a network.
TCPPORTFIRST=
Specifies the first value in a range of TCP/IP ports for a client to use to connect to a server.
TCPPORTLAST=
Specifies the last value in a range of TCP/IP ports for a client to use to connect to a server.
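A client session usually sets several of these options before signing on. The sketch below is an assumption-laden outline rather than a tested configuration: the host name and port in RHOST are hypothetical, and it presumes that a SAS/CONNECT spawner is listening at that address.

```sas
/* Hypothetical server host and spawner port. */
%let rhost=server.example.com 7551;

options comamid=tcp          /* use the TCP/IP access method              */
        connectremote=rhost  /* server session to connect to              */
        connectwait          /* run remote submits synchronously          */
        connectpersist;      /* keep the connection after RSUBMIT is done */

signon user=_prompt_;

/* Statements between RSUBMIT and ENDRSUBMIT run in the server session. */
rsubmit;
  proc options option=tbufsize;
  run;
endrsubmit;

signoff;
```

Because CONNECTPERSIST is in effect, the connection stays open after the RSUBMIT block and is closed explicitly with SIGNOFF.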
SAS/SHARE User’s Guide
System Option
Description
COMAMID=
Identifies the communications access method to connect a SAS/SHARE client to a server SAS session.
COMAUX1=
Specifies the first alternate communications access method.
SHARESESSIONCNTL=
Specifies the condition under which subsequent sessions can be created on a SAS/SHARE server.
TBUFSIZE=
Specifies the value of the default buffer size that the server uses for transferring data.
PART 2
Dictionary of Component Object Language Elements
Chapter 8. Component Objects
Chapter 9. Hash and Hash Iterator Object Language Elements
Chapter 10. Java Object Language Elements
CHAPTER 8
Component Objects
DATA Step Component Objects
The DATA Step Component Interface
Dot Notation and DATA Step Component Objects
   Definition
   Syntax
Rules When Using Component Objects
DATA Step Component Objects
SAS provides these five predefined component objects for use in a DATA step:
hash and hash iterator objects
enable you to quickly and efficiently store, search, and retrieve data based on lookup keys. For more information, see “Using the Hash Object” and “Using the Hash Iterator Object” in SAS Language Reference: Concepts.
Java object
provides a mechanism that is similar to the Java Native Interface (JNI) for instantiating Java classes and accessing fields and methods on the resultant objects. For more information, see “Using the Java Object” in SAS Language Reference: Concepts.
logger and appender objects
enable you to record logging events and write these events to the appropriate destination. For more information, see “Logger and Appender Object Language Reference” in SAS Logging: Configuration and Programming Reference.
The DATA Step Component Interface
The DATA step component object interface enables you to create and manipulate predefined component objects in a DATA step. To declare and create a component object, you use either the DECLARE statement by itself or the DECLARE statement and _NEW_ operator together. Component objects are data elements that consist of attributes, methods, and operators. Attributes are the properties that specify the information that is associated with an object. Methods define the operations that an object can perform. For component objects, operators provide special functionality. You use the DATA step object dot notation to access the component object’s attributes and methods.
Note: The DATA step component object’s statements, attributes, methods, and operators are limited to those that are defined for these objects. You cannot use the SAS Component Language functionality with these predefined DATA step objects.
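As a minimal sketch of the two declaration forms described above, the following DATA step instantiates one hash object with the DECLARE statement alone and a second one with the DECLARE statement followed by the _NEW_ operator. The names H1, H2, and K are illustrative assumptions, not names used elsewhere in this chapter.

```sas
data _null_;
  length k 8;
  /* Form 1: DECLARE with parentheses declares and instantiates H1 */
  declare hash h1();
  rc = h1.defineKey('k');
  rc = h1.defineDone();
  /* Form 2: DECLARE without parentheses only declares H2; */
  /* the _NEW_ operator then creates the instance          */
  declare hash h2;
  h2 = _new_ hash();
  rc = h2.defineKey('k');
  rc = h2.defineDone();
  /* avoid uninitialized variable notes */
  call missing(k);
run;
```

The second form can be convenient when the instance is created conditionally or when the same reference variable is reused for several instances.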
Dot Notation and DATA Step Component Objects
Definition
Dot notation provides a shortcut for invoking methods and for setting and querying attribute values. Using dot notation makes your SAS programs easier to read. To use dot notation with a DATA step component object, you must declare and instantiate the component object by using either the DECLARE statement by itself or the DECLARE statement and the _NEW_ operator together. For more information, see “Using DATA Step Component Objects” in SAS Language Reference: Concepts and “Logger and Appender Object Language Reference” in SAS Logging: Configuration and Programming Reference.
Syntax
The syntax for dot notation is as follows:
    object.attribute
or
    object.method(<argument_tag-1: value-1<, ...argument_tag-n: value-n>>);
The arguments are defined as follows:
object
specifies the variable name for the DATA step component object.
attribute
specifies an object attribute to assign or query. When you set an attribute for an object, the code takes this form:
    object.attribute = value;
When you query an object attribute, the code takes this form:
    value = object.attribute;
method
specifies the name of the method to invoke.
argument_tag
identifies the arguments that are passed to the method. Enclose the argument tag in parentheses. The parentheses are required whether or not the method contains argument tags. All DATA step component object methods take this form:
    return_code=object.method(<argument_tag-1: value-1<, ...argument_tag-n: value-n>>);
The return code indicates method success or failure. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, an appropriate error message is printed to the log.
value
specifies the argument value.
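The forms described above can be sketched in a short DATA step. This is an illustrative example only: the hash object H and the variables K, D, RC, and N are assumed names. It shows a method call with argument tags that captures the return code, followed by an attribute query.

```sas
data _null_;
  length k 8 d $12;
  declare hash h();
  rc = h.defineKey('k');
  rc = h.defineData('d');
  rc = h.defineDone();
  /* avoid uninitialized variable notes */
  call missing(k, d);
  /* method call with argument tags; RC receives the return code */
  rc = h.add(key: 1, data: 'first');
  /* attribute query in the form value = object.attribute */
  n = h.num_items;
  put rc= n=;
run;
```

Capturing the return code in RC lets the program test for failure itself instead of relying on an error message in the log.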
Rules When Using Component Objects
• You can assign objects in the same manner as you assign DATA step variables. However, the object types must match. The first set of code is valid, but the second generates an error.
    declare hash h();
    declare hash t();
    t=h;

    declare hash t();
    declare javaobj j();
    j=t;
• You cannot declare arrays of objects. The following code would generate an error:
    declare hash h1();
    declare hash h2();
    array h h1--h2;
• You can store a component object in a hash object as data but not as keys.
    data _null_;
      declare hash h1();
      declare hash h2();
      length key1 key2 $20;
      h1.defineKey('key1');
      h1.defineData('key1', 'h2');
      h1.defineDone();
      key1 = 'abc';
      h2 = _new_ hash();
      h2.defineKey('key2');
      h2.defineDone();
      key2 = 'xyz';
      h2.add();
      h1.add();
      key1 = 'def';
      h2 = _new_ hash();
      h2.defineKey('key2');
      h2.defineDone();
      key1 = 'abc';
      rc = h1.find();
      h2.output(dataset: 'work.h2');
    run;
    proc print data=work.h2;
    run;
  The data set WORK.H2 is displayed.
  Output 8.1 Output Data Set WORK.H2
    Obs    key2
      1    xyz
• You cannot use component objects with comparison operators other than the equal sign (=). If H1 and H2 are hash objects, the following code will generate an error:
    if h1>h2 then
• After you declare and instantiate a component object, you cannot assign a scalar value to it. If J is a Java object, the following code will generate an error:
    j=5;
• You have to be careful to not delete object references that might still be in use or that have already been deleted by reference. In the following code, the second DELETE statement will generate an error because the original H1 object has already been deleted through the reference to H2. The original H2 can no longer be referenced directly.
    declare hash h1();
    declare hash h2();
    declare hash t();
    t=h2;
    h2=h1;
    h2.delete();
    t.delete();
• You cannot use component objects in argument tag syntax. In the following example, using the H2 hash object in the ADD methods will generate an error.
    declare hash h2();
    declare hash h();
    h.add(key: h2);
    h.add(key: 99, data: h2);
CHAPTER 9
Hash and Hash Iterator Object Language Elements
ADD Method
CHECK Method
CLEAR Method
DECLARE Statement, Hash and Hash Iterator Objects
DEFINEDATA Method
DEFINEDONE Method
DEFINEKEY Method
DELETE Method
EQUALS Method
FIND Method
FIND_NEXT Method
FIND_PREV Method
FIRST Method
HAS_NEXT Method
HAS_PREV Method
ITEM_SIZE Attribute
LAST Method
NEXT Method
NUM_ITEMS Attribute
OUTPUT Method
PREV Method
REF Method
REMOVE Method
REMOVEDUP Method
REPLACE Method
REPLACEDUP Method
SETCUR Method
SUM Method
SUMDUP Method
ADD Method
Adds the specified data that is associated with the given key to the hash object.
Applies to: Hash object
Syntax
    rc=object.ADD(<KEY: keyvalue-1, …, KEY: keyvalue-n, DATA: datavalue-1, …, DATA: datavalue-n>);
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
specifies the name of the hash object.
KEY: keyvalue
specifies the key value whose type must match the corresponding key variable that is specified in a DEFINEKEY method call. The number of “KEY: keyvalue” pairs depends on the number of key variables that you define by using the DEFINEKEY method.
DATA: datavalue
specifies the data value whose type must match the corresponding data variable that is specified in a DEFINEDATA method call. The number of “DATA: datavalue” pairs depends on the number of data variables that you define by using the DEFINEDATA method.
Details
You can use the ADD method in one of two ways to store data in a hash object. You can define the key and data item, and then use the ADD method as shown in the following code:
    data _null_;
      length k $8;
      length d $12;
      /* Declare hash object and key and data variable names */
      if _N_ = 1 then do;
        declare hash h();
        rc = h.defineKey('k');
        rc = h.defineData('d');
        rc = h.defineDone();
      end;
      /* Define constant key and data values */
      k = 'Joyce';
      d = 'Ulysses';
      /* Add key and data values to hash object */
      rc = h.add();
    run;
Alternatively, you can use a shortcut and specify the key and data directly in the ADD method call as shown in the following code:
    data _null_;
      length k $8;
      length d $12;
      /* Define hash object and key and data variable names */
      if _N_ = 1 then do;
        declare hash h();
        rc = h.defineKey('k');
        rc = h.defineData('d');
        rc = h.defineDone();
        /* avoid uninitialized variable notes */
        call missing(k, d);
      end;
      /* Define constant key and data values and add to hash object */
      rc = h.add(key: 'Joyce', data: 'Ulysses');
    run;
If you add a key that is already in the hash object, then the ADD method will return a non-zero value to indicate that the key is already in the hash object. Use the REPLACE method to replace the data that is associated with the specified key with new data. If you do not specify the data variables with the DEFINEDATA method, the data variables are automatically assumed to be the same as the keys. If you use the KEY: and DATA: argument tags to specify the key and data directly, you must use both argument tags. The ADD method does not set the value of the data variable to the value of the data item. It only sets the value in the hash object.
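The duplicate-key behavior described above can be sketched as follows. The variable names RC_FIRST, RC_DUP, and RC_REP are assumptions made for this illustration. The second ADD call uses a key that is already stored, so the program falls back to the REPLACE method and then retrieves the stored data with FIND.

```sas
data _null_;
  length k $8 d $12;
  declare hash h();
  rc = h.defineKey('k');
  rc = h.defineData('d');
  rc = h.defineDone();
  /* avoid uninitialized variable notes */
  call missing(k, d);
  /* first ADD for this key */
  rc_first = h.add(key: 'Joyce', data: 'Ulysses');
  /* second ADD with the same key; a non-zero code signals the duplicate */
  rc_dup = h.add(key: 'Joyce', data: 'Dubliners');
  /* use REPLACE to overwrite the data for the existing key */
  if rc_dup ne 0 then rc_rep = h.replace(key: 'Joyce', data: 'Dubliners');
  /* ADD and REPLACE do not set D; FIND copies the stored data into D */
  rc = h.find(key: 'Joyce');
  put rc_first= rc_dup= d=;
run;
```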
See Also
Methods:
“DEFINEDATA Method” on page 2033
“DEFINEKEY Method” on page 2036
“REF Method” on page 2065
“Storing and Retrieving Data” in SAS Language Reference: Concepts
CHECK Method
Checks whether the specified key is stored in the hash object.
Applies to: Hash object
Syntax
    rc=object.CHECK(<KEY: keyvalue-1, …, KEY: keyvalue-n>);
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
specifies the name of the hash object.
KEY: keyvalue
specifies the key value whose type must match the corresponding key variable that is specified in a DEFINEKEY method call. The number of “KEY: keyvalue” pairs depends on the number of key variables that you define by using the DEFINEKEY method.
Details
You can use the CHECK method in one of two ways to find data in a hash object. You can specify the key, and then use the CHECK method as shown in the following code:
    data _null_;
      length k $8;
      length d $12;
      /* Declare hash object and key and data variable names */
      if _N_ = 1 then do;
        declare hash h();
        rc = h.defineKey('k');
        rc = h.defineData('d');
        rc = h.defineDone();
        /* avoid uninitialized variable notes */
        call missing(k, d);
      end;
      /* Define constant key and data values and add to hash object */
      rc = h.add(key: 'Joyce', data: 'Ulysses');
      /* Verify that JOYCE key is in hash object */
      k = 'Joyce';
      rc = h.check();
      if (rc = 0) then
        put 'Key is in the hash object.';
    run;
Alternatively, you can use a shortcut and specify the key directly in the CHECK method call as shown in the following code:
    data _null_;
      length k $8;
      length d $12;
      /* Declare hash object and key and data variable names */
      if _N_ = 1 then do;
        declare hash h();
        rc = h.defineKey('k');
        rc = h.defineData('d');
        rc = h.defineDone();
        /* avoid uninitialized variable notes */
        call missing(k, d);
      end;
      /* Define constant key and data values and add to hash object */
      rc = h.add(key: 'Joyce', data: 'Ulysses');
      /* Verify that JOYCE key is in hash object */
      rc = h.check(key: 'Joyce');
      if (rc = 0) then
        put 'Key is in the hash object.';
    run;
Comparisons
The CHECK method only returns a value that indicates whether the key is in the hash object. The data variable that is associated with the key is not updated. The FIND method also returns a value that indicates whether the key is in the hash object. However, if the key is in the hash object, then the FIND method also sets the data variable to the value of the data item so that it is available for use after the method call.
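The difference between the two methods can be seen in a small sketch that calls both for the same key and writes the data variable after each call. The PUT text labels are assumptions made for this illustration.

```sas
data _null_;
  length k $8 d $12;
  declare hash h();
  rc = h.defineKey('k');
  rc = h.defineData('d');
  rc = h.defineDone();
  /* avoid uninitialized variable notes */
  call missing(k, d);
  rc = h.add(key: 'Joyce', data: 'Ulysses');
  /* CHECK reports only whether the key exists; D is left unchanged */
  rc = h.check(key: 'Joyce');
  put 'After CHECK: ' rc= d=;
  /* FIND also copies the stored data item into D */
  rc = h.find(key: 'Joyce');
  put 'After FIND: ' rc= d=;
run;
```

CHECK fits cases where only the existence of the key matters and the current value of the data variable must be preserved.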
See Also
Methods:
“FIND Method” on page 2040
“DEFINEKEY Method” on page 2036
CLEAR Method
Removes all items from the hash object without deleting the hash object instance.
Applies to: Hash object
Syntax
    rc=object.CLEAR();
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
specifies the name of the hash object.
Details
The CLEAR method enables you to remove items from and reuse an existing hash object without having to delete the object and create a new one. If you want to remove the hash object instance completely, use the DELETE method.
Note: The CLEAR method does not change the value of the DATA step variables. It only clears the values in the hash object.
Examples
The following example declares a hash object, gets the number of items in the hash object, and then clears the hash object without deleting it.
    data mydata;
      do i = 1 to 10000;
        output;
      end;
    run;
    data _null_;
      length i 8;
      /* Declares the hash object named MYHASH using the data set MyData. */
      dcl hash myhash(dataset: 'mydata');
      myhash.definekey('i');
      myhash.definedone();
      call missing (i);
      /* Uses the NUM_ITEMS attribute, which returns the */
      /* number of items in the hash object.             */
      n = myhash.num_items;
      put n=;
      /* Uses the CLEAR method to delete all items within MYHASH. */
      rc = myhash.clear();
      /* Writes the number of items in the log. */
      n = myhash.num_items;
      put n=;
    run;
The first PUT statement writes the number of items in the hash table MYHASH before it is cleared.
    n=10000
The second PUT statement writes the number of items in the hash table MYHASH after it is cleared.
    n=0
See Also
Methods:
“DELETE Method” on page 2038
DECLARE Statement, Hash and Hash Iterator Objects
Declares a hash or hash iterator object; creates an instance of and initializes data for a hash or hash iterator object.
Valid in: DATA step
See: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427
DEFINEDATA Method
Defines data, associated with the specified data variables, to be stored in the hash object.
Applies to: Hash object
Syntax
    rc=object.DEFINEDATA('datavarname-1'<, …'datavarname-n'>);
    rc=object.DEFINEDATA(ALL: 'YES' | "YES");
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
specifies the name of the hash object.
'datavarname'
specifies the name of the data variable. The data variable name can also be enclosed in double quotation marks.
ALL: 'YES' | "YES"
specifies all the data variables as data when the data set is loaded in the object constructor. If the dataset argument tag is used in the DECLARE statement or _NEW_ operator to automatically load a data set, then you can define all the data set variables as data by using the ALL: 'YES' option.
Note: If you use the shortcut notation for the ADD or REPLACE method (for example, h.add(key:99, data:'apple', data:'orange')) and use the ALL:'YES' option on the DEFINEDATA method, then you must specify the data in the same order as it exists in the data set.
Note: The hash object does not assign values to key variables (for example, h.find(key:'abc')), and the SAS compiler cannot detect the key and data variable assignments that are performed by the hash object and the hash iterator. Therefore, if no assignment to a key or data variable appears in the program, then SAS will issue a note stating that the variable is uninitialized. To avoid receiving these notes, you can perform one of the following actions:
• Set the NONOTES system option.
• Provide an initial assignment statement (typically to a missing value) for each key and data variable.
• Use the CALL MISSING routine with all the key and data variables as parameters. Here is an example:
    length d $20;
    length k $20;
    if _N_ = 1 then do;
      declare hash h();
      rc = h.defineKey('k');
      rc = h.defineData('d');
      rc = h.defineDone();
      call missing(k, d);
    end;
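As a hedged sketch of the ALL: 'YES' option described above, the following program loads a small data set through the dataset argument tag and defines every data set variable as data. The data set WORK.BOOKS and its variables AUTHOR, TITLE, and YEAR are hypothetical and exist only for this illustration.

```sas
/* hypothetical sample data for the illustration */
data work.books;
  length author $8 title $12 year 8;
  input author $ title $ year;
  datalines;
Joyce Ulysses 1922
Woolf Orlando 1928
;
run;

data _null_;
  length author $8 title $12 year 8;
  /* load WORK.BOOKS when DEFINEDONE is called */
  declare hash h(dataset: 'work.books');
  rc = h.defineKey('author');
  /* define every variable in WORK.BOOKS as data */
  rc = h.defineData(all: 'yes');
  rc = h.defineDone();
  /* avoid uninitialized variable notes */
  call missing(author, title, year);
  /* FIND copies all data items for the key into the DATA step variables */
  rc = h.find(key: 'Joyce');
  if rc = 0 then put author= title= year=;
run;
```

The LENGTH statement in the second step is there so that the key and data variables exist in the DATA step with the same attributes as in the data set.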
Details
The hash object works by storing and retrieving data based on lookup keys. The keys and data are DATA step variables, which you use to initialize the hash object by using dot notation method calls. You define a key by passing the key variable name to the DEFINEKEY method. You define data by passing the data variable name to the DEFINEDATA method. When you have defined all key and data variables, you must call the DEFINEDONE method to complete initialization of the hash object. Keys and data consist of any number of character or numeric DATA step variables. For detailed information about how to use the DEFINEDATA method, see “Defining Keys and Data” in SAS Language Reference: Concepts.
Examples
The following example creates a hash object and defines the key and data variables:
    data _null_;
      length d $20;
      length k $20;
      /* Declare the hash object and key and data variables */
      if _N_ = 1 then do;
        declare hash h();
        rc = h.defineKey('k');
        rc = h.defineData('d');
        rc = h.defineDone();
        /* avoid uninitialized variable notes */
        call missing(k, d);
      end;
    run;
See Also
Methods:
“DEFINEDONE Method” on page 2035
“DEFINEKEY Method” on page 2036
Operators:
“_NEW_ Operator, Hash or Hash Iterator Object” on page 2053
Statements:
“DECLARE Statement, Hash and Hash Iterator Objects” on page 1427
“Defining Keys and Data” in SAS Language Reference: Concepts
DEFINEDONE Method
Indicates that all key and data definitions are complete.
Applies to: Hash object
Syntax
    rc = object.DEFINEDONE( );
    rc = object.DEFINEDONE(MEMRC: 'y');
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
specifies the name of the hash object.
memrc:'y'
enables recovery from memory failure when loading a data set into a hash object. If a call fails because of insufficient memory to load a data set, a nonzero return code is returned. The hash object frees the principal memory in the underlying array. The only allowable operation after this kind of failure is deletion via the DELETE method.
Details
When the DEFINEDONE method is called and the dataset argument tag is used with the constructor, the data set is loaded into the hash object. The hash object works by storing and retrieving data based on lookup keys. The keys and data are DATA step variables, which you use to initialize the hash object by using dot notation method calls. You define a key by passing the key variable name to the DEFINEKEY method. You define data by passing the data variable name to the DEFINEDATA method. When you have defined all key and data variables, you must call the DEFINEDONE method to complete initialization of the hash object. Keys and data consist of any number of character or numeric DATA step variables. For detailed information about how to use the DEFINEDONE method, see “Defining Keys and Data” in SAS Language Reference: Concepts.
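The following is a minimal sketch of how the MEMRC: 'y' argument tag might be used together with the return code. The data set WORK.BIG and the variable ID are hypothetical stand-ins for a data set that is large enough to exhaust memory; the small data set created here will normally load without failure.

```sas
/* WORK.BIG is a hypothetical stand-in for a very large data set */
data work.big;
  do id = 1 to 1000;
    output;
  end;
run;

data _null_;
  length id 8;
  declare hash h(dataset: 'work.big');
  rc = h.defineKey('id');
  /* with MEMRC: 'y', a memory failure during loading is reported */
  /* through a nonzero return code                                */
  rc = h.defineDone(memrc: 'y');
  /* avoid uninitialized variable notes */
  call missing(id);
  if rc ne 0 then do;
    put 'WORK.BIG could not be loaded into the hash object.';
    /* deletion is the only operation allowed after this failure */
    rc = h.delete();
  end;
  else do;
    n = h.num_items;
    put n=;
  end;
run;
```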
See Also
Methods:
“DEFINEDATA Method” on page 2033
“DEFINEKEY Method” on page 2036
“Defining Keys and Data” in SAS Language Reference: Concepts
DEFINEKEY Method
Defines key variables for the hash object.
Applies to: Hash object
Syntax
    rc=object.DEFINEKEY('keyvarname-1'<, …'keyvarname-n'>);
    rc=object.DEFINEKEY(ALL: 'YES' | "YES");
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
specifies the name of the hash object.
'keyvarname'
specifies the name of the key variable. The key variable name can also be enclosed in double quotation marks.
ALL: 'YES' | "YES"
specifies all the data variables as keys when the data set is loaded in the object constructor. If you use the dataset argument tag in the DECLARE statement or _NEW_ operator to automatically load a data set, then you can define all the key variables by using the ALL: 'YES' option.
Note: If you use the shortcut notation for the ADD, CHECK, FIND, REMOVE, or REPLACE methods (for example, h.add(key:99, data:'apple', data:'orange')) and the ALL:'YES' option on the DEFINEKEY method, then you must specify the keys and data in the same order as they exist in the data set.
Details
The hash object works by storing and retrieving data based on lookup keys. The keys and data are DATA step variables, which you use to initialize the hash object by using dot notation method calls. You define a key by passing the key variable name to the DEFINEKEY method. You define data by passing the data variable name to the DEFINEDATA method. When you have defined all key and data variables, you must call the DEFINEDONE method to complete initialization of the hash object. Keys and data consist of any number of character or numeric DATA step variables. For detailed information about how to use the DEFINEKEY method, see “Defining Keys and Data” in SAS Language Reference: Concepts.
Note: The hash object does not assign values to key variables (for example, h.find(key:'abc')), and the SAS compiler cannot detect the key and data variable assignments done by the hash object and the hash iterator. Therefore, if no assignment to a key or data variable appears in the program, SAS will issue a note stating that the variable is uninitialized. To avoid receiving these notes, you can perform one of the following actions:
• Set the NONOTES system option.
• Provide an initial assignment statement (typically to a missing value) for each key and data variable.
• Use the CALL MISSING routine with all the key and data variables as parameters. Here is an example:
    length d $20;
    length k $20;
    if _N_ = 1 then do;
      declare hash h();
      rc = h.defineKey('k');
      rc = h.defineData('d');
      rc = h.defineDone();
      call missing(k, d);
    end;
See Also
Methods:
“DEFINEDATA Method” on page 2033
“DEFINEDONE Method” on page 2035
Operators:
“_NEW_ Operator, Hash or Hash Iterator Object” on page 2053
Statements:
“DECLARE Statement, Hash and Hash Iterator Objects” on page 1427
“Defining Keys and Data” in SAS Language Reference: Concepts
DELETE Method
Deletes the hash or hash iterator object.
Applies to: Hash object, Hash iterator object
Syntax
    rc=object.DELETE( );
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is printed to the log.
object
specifies the name of the hash or hash iterator object.
Details
DATA step component objects are deleted automatically at the end of the DATA step. If you want to reuse the object reference variable in another hash or hash iterator object constructor, you should delete the hash or hash iterator object by using the DELETE method. If you attempt to use a hash or hash iterator object after you delete it, you will receive an error in the log. If you want to delete all the items from within a hash object and save the hash object to use again, use the “CLEAR Method” on page 2031.
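The reuse pattern described above can be sketched as follows. The names H, K, and N are assumptions made for this illustration. The program deletes the first instance and then uses the _NEW_ operator to attach a new instance to the same reference variable.

```sas
data _null_;
  length k 8;
  declare hash h();
  rc = h.defineKey('k');
  rc = h.defineDone();
  k = 1;
  rc = h.add();
  /* delete the instance before reusing the reference variable H */
  rc = h.delete();
  /* create a new instance and define it from scratch */
  h = _new_ hash();
  rc = h.defineKey('k');
  rc = h.defineDone();
  /* the new instance starts with no items */
  n = h.num_items;
  put n=;
run;
```

If only the items need to be discarded and the key and data definitions stay the same, the CLEAR method avoids having to redefine the object.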
EQUALS Method
Determines whether two hash objects are equal.
Applies to: Hash object
Syntax
    rc=object.EQUALS(HASH: 'object', RESULT: variable name);
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
specifies the name of a hash object.
HASH: 'object'
specifies the name of the second hash object that is compared to the first hash object.
RESULT: variable name
specifies the name of a numeric variable to hold the result. If the hash objects are equal, the result variable is 1; otherwise, the result variable is zero.
Details
The following example compares H1 to H2 hash objects:
    length eq k 8;
    declare hash h1();
    h1.defineKey('k');
    h1.defineDone();
    declare hash h2();
    h2.defineKey('k');
    h2.defineDone();
    rc = h1.equals(hash: 'h2', result: eq);
    if eq then
      put 'hash objects equal';
    else
      put 'hash objects not equal';
The two hash objects are defined as equal when all of the following conditions occur:
• Both hash objects are the same size; that is, the HASHEXP sizes are equal.
• Both hash objects have the same number of items; that is, H1.NUM_ITEMS = H2.NUM_ITEMS.
• Both hash objects have the same key and data structure.
• In an unordered iteration over H1 and H2 hash objects, each successive record from H1 has the same key and data fields as the corresponding record in H2; that is, each record is in the same position in each hash object and each such record is identical to the corresponding record in the other hash object.
Examples
In the following example, the first call to EQUALS sets the result variable EQ to a nonzero value and the second call sets it to zero.
    data x;
      length k eq 8;
      declare hash h1();
      h1.defineKey('k');
      h1.defineDone();
      declare hash h2();
      h2.defineKey('k');
      h2.defineDone();
      k = 99;
      h1.add();
      h2.add();
      rc = h1.equals(hash: 'h2', result: eq);
      put eq=;
      k = 100;
      h2.replace();
      rc = h1.equals(hash: 'h2', result: eq);
      put eq=;
    run;
FIND Method
Determines whether the specified key is stored in the hash object.
Applies to: Hash object
Syntax
    rc=object.FIND(<KEY: keyvalue-1, …, KEY: keyvalue-n>);
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
specifies the name of the hash object.
KEY: keyvalue
specifies the key value whose type must match the corresponding key variable that is specified in a DEFINEKEY method call. The number of “KEY: keyvalue” pairs depends on the number of key variables that you define by using the DEFINEKEY method.
Details
You can use the FIND method in one of two ways to find data in a hash object. You can specify the key, and then use the FIND method as shown in the following code:
    data _null_;
      length k $8;
      length d $12;
      /* Declare hash object and key and data variables */
      if _N_ = 1 then do;
        declare hash h();
        rc = h.defineKey('k');
        rc = h.defineData('d');
        rc = h.defineDone();
        /* avoid uninitialized variable notes */
        call missing(k, d);
      end;
      /* Define constant key and data values */
      rc = h.add(key: 'Joyce', data: 'Ulysses');
      /* Find the key JOYCE */
      k = 'Joyce';
      rc = h.find();
      if (rc = 0) then
        put 'Key is in the hash object.';
    run;
Alternatively, you can use a shortcut and specify the key directly in the FIND method call as shown in the following code:
    data _null_;
      length k $8;
      length d $12;
      /* Declare hash object and key and data variables */
      if _N_ = 1 then do;
        declare hash h();
        rc = h.defineKey('k');
        rc = h.defineData('d');
        rc = h.defineDone();
        /* avoid uninitialized variable notes */
        call missing(k, d);
      end;
      /* Define constant key and data values */
      rc = h.add(key: 'Joyce', data: 'Ulysses');
      /* Find the key JOYCE */
      rc = h.find(key: 'Joyce');
      if (rc = 0) then
        put 'Key is in the hash object.';
    run;
If the hash object has multiple data items for each key, use the “FIND_NEXT Method” on page 2043 and the “FIND_PREV Method” on page 2045 in conjunction with the FIND method to traverse a multiple data item list.
Comparisons
The FIND method returns a value that indicates whether the key is in the hash object. If the key is in the hash object, then the FIND method also sets the data variable to the value of the data item so that it is available for use after the method call. The CHECK method only returns a value that indicates whether the key is in the hash object. The data variable is not updated.
Examples
The following example creates a hash object. Two data values are added. The FIND method is used to find a key in the hash object. The data value is returned to the data set variable that is associated with the key.
    data _null_;
      length k $8;
      length d $12;
      /* Declare hash object and key and data variable names */
      if _N_ = 1 then do;
        declare hash h();
        rc = h.defineKey('k');
        rc = h.defineData('d');
        /* avoid uninitialized variable notes */
        call missing(k, d);
        rc = h.defineDone();
      end;
      /* Define constant key and data values and add to hash object */
      rc = h.add(key: 'Joyce', data: 'Ulysses');
      rc = h.add(key: 'Homer', data: 'Odyssey');
      /* Verify that key JOYCE is in hash object and */
      /* return its data value to the data set variable D */
      rc = h.find(key: 'Joyce');
      put d=;
    run;
d=Ulysses is written to the SAS log.
See Also
Methods:
“CHECK Method” on page 2029
“DEFINEKEY Method” on page 2036
“FIND_NEXT Method” on page 2043
“FIND_PREV Method” on page 2045
“REF Method” on page 2065
“Storing and Retrieving Data” in SAS Language Reference: Concepts
FIND_NEXT Method
Sets the current list item to the next item in the current key’s multiple item list and sets the data for the corresponding data variables.
Applies to: Hash object
Syntax
    rc=object.FIND_NEXT( );
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, an appropriate error message is printed to the log.
object
specifies the name of the hash object.
Details
The FIND method determines whether the key exists in the hash object. The HAS_NEXT method determines whether the key has multiple data items associated with it. When you have determined that the key has another data item, that data item can be retrieved by using the FIND_NEXT method, which sets the data variable to the value of the data item so that it is available for use after the method call. Once you are in the data item list, you can use the HAS_NEXT and FIND_NEXT methods to traverse the list.
Examples This example uses the FIND_NEXT method to iterate through a data set where several keys have multiple data items. If a key has more than one data item, subsequent items are marked dup.
2044
FIND_NEXT Method
4
Chapter 9
data dup; length key data 8; input key data; datalines; 1 10 2 11 1 15 3 20 2 16 2 9 3 100 5 5 1 5 4 6 5 99 ; data _null_; dcl hash h(dataset:’dup’, multidata: ’y’); h.definekey(’key’); h.definedata(’key’, ’data’); h.definedone(); /* avoid uninitialized variable notes */ call missing (key, data); do key = 1 to 5; rc = h.find(); if (rc = 0) then do; put key= data=; rc = h.find_next(); do while(rc = 0); put ’dup ’ key= data; rc = h.find_next(); end; end; end; run;
The following lines are written to the SAS log.
Output 9.1 Keys with Multiple Data Items
key=1 data=10
dup key=1 5
dup key=1 15
key=2 data=11
dup key=2 9
dup key=2 16
key=3 data=20
dup key=3 100
key=4 data=6
key=5 data=5
dup key=5 99
See Also Methods: “FIND Method” on page 2040 “FIND_PREV Method” on page 2045 “HAS_NEXT Method” on page 2048 “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
FIND_PREV Method Sets the current list item to the previous item in the current key’s multiple item list and sets the data for the corresponding data variables. Applies to:
Hash object
Syntax rc=object.FIND_PREV( );
Arguments rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, an appropriate error message is printed to the log. object
specifies the name of the hash object.
Details The FIND method determines whether the key exists in the hash object. The HAS_PREV method determines whether the key has multiple data items associated with it. When you have determined that the key has a previous data item, that data item can be retrieved by using the FIND_PREV method, which sets the data variable to the value of the data item so that it is available for use after the method call. Once you are in the data item list, you can use the HAS_PREV and FIND_PREV methods in addition to the HAS_NEXT and FIND_NEXT methods to traverse the list. See “HAS_NEXT Method” on page 2048 for an example.
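The backward traversal described above can be sketched as follows. This is a minimal illustration, not part of the original example set: the data set WORK.DUPS and its values are made up. The step first moves to the end of the item list for key 1 by using HAS_NEXT and FIND_NEXT, and then calls FIND_PREV until it returns a nonzero code. No particular order of items within a key is assumed.

```sas
/* Hypothetical sample data: key 1 has three data items */
data work.dups;
   input key data;
   datalines;
1 10
1 15
1 5
2 11
;
data _null_;
   length key data r 8;
   dcl hash h(dataset: 'work.dups', multidata: 'y');
   h.definekey('key');
   h.definedata('key', 'data');
   h.definedone();
   /* avoid uninitialized variable notes */
   call missing(key, data);
   key = 1;
   rc = h.find();
   if (rc = 0) then do;
      /* move forward until there is no next item */
      h.has_next(result: r);
      do while (r ne 0);
         rc = h.find_next();
         h.has_next(result: r);
      end;
      put 'end of list ' key= data=;
      /* walk back; FIND_PREV returns nonzero at the start of the list */
      rc = h.find_prev();
      do while (rc = 0);
         put 'prev ' key= data=;
         rc = h.find_prev();
      end;
   end;
run;
```

Because the return code is assigned to RC, the final unsuccessful FIND_PREV call ends the loop without writing an error message to the log.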
See Also Methods: “FIND Method” on page 2040 “FIND_NEXT Method” on page 2043 “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
FIRST Method Returns the first value in the underlying hash object. Applies to:
Hash iterator object
Syntax rc=object.FIRST( );
Arguments
rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, an appropriate error message will be printed to the log. object
specifies the name of the hash iterator object.
Details
The FIRST method returns the first data item in the hash object. If you use the ordered: ’yes’ or ordered: ’ascending’ argument tag in the DECLARE statement or _NEW_ operator when you instantiate the hash object, then the data item that is returned is the one with the ’least’ key (smallest numeric value or first alphabetic character), because the data items are sorted in ascending key-value order in the hash object. Repeated calls to the NEXT method will iteratively traverse the hash object and return the data items in ascending key order.
Conversely, if you use the ordered: ’descending’ argument tag in the DECLARE statement or _NEW_ operator when you instantiate the hash object, then the data item that is returned is the one with the ’highest’ key (largest numeric value or last alphabetic character), because the data items are sorted in descending key-value order in the hash object. Repeated calls to the NEXT method will iteratively traverse the hash object and return the data items in descending key order.
Use the LAST method to return the last data item in the hash object.
Note: The FIRST method sets the data variable to the value of the data item so that it is available for use after the method call.
Examples
The following example creates a data set that contains sales data. You want to list products in order of sales. The data is loaded into a hash object and the FIRST and NEXT methods are used to retrieve the data.
data work.sales;
   input prod $1-6 qty $9-14;
   datalines;
banana  398487
apple   384223
orange  329559
;
data _null_;
   /* Declare hash object and read SALES data set as ordered */
   if _N_ = 1 then do;
      length prod $10;
      length qty $6;
      declare hash h(dataset: 'work.sales', ordered: 'yes');
      declare hiter iter('h');
      /* Define key and data variables */
      h.defineKey('qty');
      h.defineData('prod');
      h.defineDone();
      /* avoid uninitialized variable notes */
      call missing(qty, prod);
   end;
   /* Iterate through the hash object and output data values */
   rc = iter.first();
   do while (rc = 0);
      put prod=;
      rc = iter.next();
   end;
run;
Because the hash object is ordered in ascending order of the key QTY, the following lines are written to the SAS log:
prod=orange
prod=apple
prod=banana
See Also Method: “LAST Method” on page 2052 Operators: “_NEW_ Operator, Hash or Hash Iterator Object” on page 2053 Statements: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427 “Using the Hash Iterator Object” in SAS Language Reference: Concepts
HAS_NEXT Method Determines whether there is a next item in the current key’s multiple data item list. Applies to:
Hash object
Syntax rc=object.HAS_NEXT(RESULT: R);
Arguments rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log. object
specifies the name of the hash object. RESULT:R
specifies the numeric variable R, which receives a zero value if there is not another data item in the data item list or a nonzero value if there is another data item in the data item list.
Details If a key has multiple data items, you can use the HAS_NEXT method to determine whether there is a next item in the current key’s multiple data item list. If there is another item, the method will return a nonzero value in the numeric variable R; otherwise, it will return a zero. The FIND method determines whether the key exists in the hash object. The HAS_NEXT method determines whether the key has multiple data items associated with it. When you have determined that the key has another data item, that data item can be retrieved by using the FIND_NEXT method, which sets the data variable to the value of the data item so that it is available for use after the method call. Once you are in the data item list, you can use the HAS_PREV and FIND_PREV methods in addition to the HAS_NEXT and FIND_NEXT methods to traverse the list.
Examples
This example creates a hash object where several keys have multiple data items. It uses the HAS_NEXT method to find all the data items.
data testdup;
   length key data 8;
   input key data;
   datalines;
1 100
2 11
1 15
3 20
2 16
2 9
3 100
5 5
1 5
4 6
5 99
;
data _null_;
   length r 8;
   dcl hash h(dataset: 'testdup', multidata: 'y');
   h.definekey('key');
   h.definedata('key', 'data');
   h.definedone();
   call missing (key, data);
   do key = 1 to 5;
      rc = h.find();
      if (rc = 0) then do;
         put key= data=;
         h.has_next(result: r);
         do while(r ne 0);
            rc = h.find_next();
            put 'dup ' key= data;
            h.has_next(result: r);
         end;
      end;
   end;
run;
The following lines are written to the SAS log.
Output 9.2 Output of Keys with Multiple Data Items
key=1 data=100
dup key=1 5
dup key=1 15
key=2 data=11
dup key=2 9
dup key=2 16
key=3 data=20
dup key=3 100
key=4 data=6
key=5 data=5
dup key=5 99
See Also Methods: “FIND Method” on page 2040 “FIND_NEXT Method” on page 2043
“FIND_PREV Method” on page 2045 “HAS_PREV Method” on page 2050 “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
HAS_PREV Method Determines whether there is a previous item in the current key’s multiple data item list. Applies to:
Hash object
Syntax rc=object.HAS_PREV(RESULT: R);
Arguments rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log. object
specifies the name of the hash object. RESULT:R
specifies the numeric variable R, which receives a zero value if there is not a previous data item in the data item list or a nonzero value if there is a previous data item in the data item list.
Details If a key has multiple data items, you can use the HAS_PREV method to determine whether there is a previous item in the current key’s multiple data item list. If there is a previous item, the method will return a nonzero value in the numeric variable R; otherwise, it will return a zero. The FIND method determines whether the key exists in the hash object. The HAS_NEXT method determines whether the key has multiple data items associated with it. When you have determined that the key has a previous data item, that data item can be retrieved by using the FIND_PREV method, which sets the data variable to the value of the data item so that it is available for use after the method call. Once you are in the data item list, you can use the HAS_PREV and FIND_PREV methods in addition to the HAS_NEXT and FIND_NEXT methods to traverse the list. See “HAS_NEXT Method” on page 2048 for an example.
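As a minimal sketch of HAS_PREV in use (the data set WORK.PAIRS and its values are hypothetical), the following step moves to the end of the item list for one key and then uses the RESULT variable R to decide whether another FIND_PREV call is worthwhile. Unlike a loop that tests the FIND_PREV return code, this loop never issues a call that fails.

```sas
/* Hypothetical sample data: key 7 has three data items */
data work.pairs;
   input key data;
   datalines;
7 1
7 2
7 3
;
data _null_;
   length key data r 8;
   dcl hash h(dataset: 'work.pairs', multidata: 'y');
   h.definekey('key');
   h.definedata('key', 'data');
   h.definedone();
   /* avoid uninitialized variable notes */
   call missing(key, data);
   key = 7;
   rc = h.find();
   if (rc = 0) then do;
      /* advance to the end of the item list */
      h.has_next(result: r);
      do while (r ne 0);
         rc = h.find_next();
         h.has_next(result: r);
      end;
      /* step back only while HAS_PREV reports a previous item */
      h.has_prev(result: r);
      do while (r ne 0);
         rc = h.find_prev();
         put 'prev ' key= data=;
         h.has_prev(result: r);
      end;
   end;
run;
```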
See Also Methods: “FIND Method” on page 2040 “FIND_NEXT Method” on page 2043 “FIND_PREV Method” on page 2045 “HAS_NEXT Method” on page 2048 “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
ITEM_SIZE Attribute Returns the size (in bytes) of an item in a hash object. Applies to:
Hash object
Syntax variable_name=object.ITEM_SIZE;
Arguments variable_name
specifies the name of the variable that contains the size of the item in the hash object. object
specifies the name of the hash object.
Details The ITEM_SIZE attribute returns the size (in bytes) of an item, which includes the key and data variables and some additional internal information. You can obtain an estimate of how much memory the hash object is using by multiplying the ITEM_SIZE value by the NUM_ITEMS value. The ITEM_SIZE attribute does not reflect the initial overhead that the hash object requires, nor does it take into account any necessary internal alignments. Therefore, the use of ITEM_SIZE does not provide exact memory usage, but it does return a good approximation.
Examples
The following example uses ITEM_SIZE to return the size of the item in MYHASH:
data work.stock;
   input prod $1-10 qty 12-14;
   datalines;
broccoli   345
corn       389
potato     993
onion      730
;
data _null_;
   if _N_ = 1 then do;
      length prod $10;
      /* Declare hash object and read STOCK data set */
      declare hash myhash(dataset: "work.stock");
      /* Define key and data variables */
      myhash.defineKey('prod');
      myhash.defineData('qty');
      myhash.defineDone();
   end;
   /* Add a key and data value to the hash object */
   prod = 'celery';
   qty = 183;
   rc = myhash.add();
   /* Use ITEM_SIZE to return the size of the item in hash object */
   itemsize = myhash.item_size;
   put itemsize=;
run;
The following lines are written to the log: itemsize=40
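The memory estimate mentioned in the Details can be sketched as the product of the two attributes. The following step is an illustration only: the variable names ID, VALUE, and APPROX_BYTES and the 1000 generated items are assumptions, and the reported item size depends on the operating environment. As noted above, the result excludes the initial overhead of the hash object and any internal alignment.

```sas
data _null_;
   length id value 8;
   declare hash h();
   h.defineKey('id');
   h.defineData('value');
   h.defineDone();
   /* fill the hash object with generated items */
   do id = 1 to 1000;
      value = id * 2;
      rc = h.add();
   end;
   /* read each attribute into a variable, then multiply */
   itemsize = h.item_size;
   nitems = h.num_items;
   approx_bytes = itemsize * nitems;
   put itemsize= nitems= approx_bytes=;
run;
```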
LAST Method Returns the last value in the underlying hash object. Applies to:
Hash iterator object
Syntax rc=object.LAST( );
Arguments rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log. object
specifies the name of the hash iterator object.
Details
The LAST method returns the last data item in the hash object. If you use the ordered: ’yes’ or ordered: ’ascending’ argument tag in the DECLARE statement or _NEW_ operator when you instantiate the hash object, then the data item that is returned is the one with the ’highest’ key (largest numeric value or last alphabetic character), because the data items are sorted in ascending key-value order in the hash object.
Conversely, if you use the ordered: ’descending’ argument tag in the DECLARE statement or _NEW_ operator when you instantiate the hash object, then the data item that is returned is the one with the ’least’ key (smallest numeric value or first alphabetic character), because the data items are sorted in descending key-value order in the hash object.
Use the FIRST method to return the first data item in the hash object.
Note: The LAST method sets the data variable to the value of the data item so that it is available for use after the method call.
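A minimal sketch of the LAST method follows. The variable names K and D and the three items are hypothetical. The hash object is created with ordered: 'a', so, according to the description above, LAST should return the item with the highest key and FIRST the item with the least key. The key is also listed as a data item so that the iterator updates K.

```sas
data _null_;
   length k 8 d $8;
   declare hash h(ordered: 'a');
   declare hiter it('h');
   h.defineKey('k');
   /* list the key as data so that the iterator sets K as well */
   h.defineData('k', 'd');
   h.defineDone();
   /* hypothetical sample items, added out of key order */
   k = 20; d = 'twenty'; rc = h.add();
   k = 5;  d = 'five';   rc = h.add();
   k = 12; d = 'twelve'; rc = h.add();
   rc = it.last();
   if (rc = 0) then put 'last item:  ' k= d=;
   rc = it.first();
   if (rc = 0) then put 'first item: ' k= d=;
run;
```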
See Also Methods: “FIRST Method” on page 2046 Operators: “_NEW_ Operator, Hash or Hash Iterator Object” on page 2053 Statements: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427 “Using the Hash Iterator Object” in SAS Language Reference: Concepts
_NEW_ Operator, Hash or Hash Iterator Object Creates an instance of a hash or hash iterator object. Applies to:
Hash object Hash iterator object
Syntax object-reference = _NEW_ object();
Arguments object-reference
specifies the object reference name for the hash or hash iterator object. object
specifies the component object. It can be one of the following: hash
indicates a hash object. The hash object provides a mechanism for quick data storage and retrieval. The hash object stores and retrieves data based on lookup keys. For more information about the hash object, see “Using the Hash Object” in SAS Language Reference: Concepts.
hiter
indicates a hash iterator object. The hash iterator object enables you to retrieve the hash object’s data in forward or reverse key order. For more information about the hash iterator object, see “Using the Hash Iterator Object” in SAS Language Reference: Concepts.
argument-tag
specifies the information that is used to create an instance of the hash object. Valid hash object argument tags are
dataset: ’dataset_name’
Names a SAS data set to load into the hash object. The name of the SAS data set can be a literal or character variable. The data set name must be enclosed in single or double quotation marks. Macro variables must be enclosed in double quotation marks.
You can use SAS data set options when declaring a hash object in the DATASET argument tag. Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform the following operations:
- renaming variables
- selecting a subset of observations based on observation number for processing
- selecting observations using the WHERE option
- dropping or keeping variables from a data set loaded into a hash object, or for an output data set specified in an OUTPUT method call
- specifying a password for a data set.
The following syntax is used:
   dcl hash h;
   h = _new_ hash (dataset: 'x (where = (i > 10))');
For a list of SAS data set options, see “Data Set Options by Category” on page 12.
Note: If the data set contains duplicate keys, the default is to keep the first instance in the hash object; subsequent instances will be ignored. To store the last instance in the hash object or have an error message written in the SAS log if there is a duplicate key, use the DUPLICATE argument tag.
duplicate: ’option’
determines whether to ignore duplicate keys when loading a data set into the hash object. The default is to store the first key and ignore all subsequent duplicates. Option can be one of the following values:
’replace’ | ’r’
stores the last duplicate key record.
’error’ | ’e’
reports an error to the log if a duplicate key is found.
The following example using the REPLACE option stores brown for the key 620 and blue for the key 531. If you use the default, green would be stored for 620 and yellow would be stored for 531.
data table;
   input key data $;
   datalines;
531 yellow
620 green
531 blue
908 orange
620 brown
143 purple
run;
data _null_;
   length key 8 data $ 8;
   if (_n_ = 1) then do;
      declare hash myhash;
      myhash = _new_ hash (dataset: "table", duplicate: "r");
      rc = myhash.definekey('key');
      rc = myhash.definedata('data');
      myhash.definedone();
   end;
   rc = myhash.output(dataset: "otable");
run;
hashexp: n
The hash object’s internal table size, where the size of the hash table is 2^n. The value of HASHEXP is used as a power-of-two exponent to create the hash table size. For example, a value of 4 for HASHEXP equates to a hash table size of 2^4, or 16. The maximum value for HASHEXP is 20.
The hash table size is not equal to the number of items that can be stored. Imagine the hash table as an array of ’buckets.’ A hash table size of 16 would have 16 ’buckets.’ Each bucket can hold an infinite number of items. The efficiency of the hash table lies in the ability of the hashing function to map items to and retrieve items from the buckets.
You should set the hash table size relative to the amount of data in the hash object in order to maximize the efficiency of the hash object lookup routines. Try different HASHEXP values until you get the best result. For example, if the hash object contains one million items, a hash table size of 16 (HASHEXP = 4) would work, but not very efficiently. A hash table size of 512 or 1024 (HASHEXP = 9 or 10) would result in the best performance.
Default: 8, which equates to a hash table size of 2^8, or 256
ordered: ’option’
Specifies whether or how the data is returned in key-value order if you use the hash object with a hash iterator object or if you use the hash object OUTPUT method. option can be one of the following values:
’ascending’ | ’a’
Data is returned in ascending key-value order. Specifying ’ascending’ is the same as specifying ’yes’.
’descending’ | ’d’
Data is returned in descending key-value order.
’YES’ | ’Y’
Data is returned in ascending key-value order. Specifying ’yes’ is the same as specifying ’ascending’.
’NO’ | ’N’
Data is returned in some undefined order.
Default: NO
The argument value can also be enclosed in double quotation marks.
multidata: ’option’
specifies whether multiple data items are allowed for each key. option can be one of the following values:
’YES’ | ’Y’
Multiple data items are allowed for each key.
’NO’ | ’N’
Only one data item is allowed for each key.
Default: NO
See Also: “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
The argument value can also be enclosed in double quotation marks.
suminc: ’variable-name’
maintains a summary count of hash object keys. The SUMINC argument tag is given a DATA step variable, which holds the sum increment, that is, how much to add to the key summary for each reference to the key. The SUMINC value treats a missing value as zero, like the SUM function. For example, a key summary changes using the current value of the DATA step variable.
   dcl hash myhash(suminc: 'count');
For more information, see “Maintaining Key Summaries” in SAS Language Reference: Concepts. See Also: “Initializing Hash Object Data Using a Constructor” and “Declaring and Instantiating a Hash Iterator Object” in SAS Language Reference: Concepts.
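The relationship between HASHEXP and the hash table size described above is plain power-of-two arithmetic. The following sketch (the variable names HASHEXP and TABLE_SIZE are chosen for illustration) computes the table size for a few exponents and then declares a hash object that combines several argument tags; the particular tag values are arbitrary.

```sas
data _null_;
   length k 8;
   /* table size = 2 raised to the HASHEXP value */
   do hashexp = 4, 8, 10, 20;
      table_size = 2 ** hashexp;
      put hashexp= table_size=;
   end;
   /* 1024 buckets, ascending key order, multiple data items per key */
   declare hash h(hashexp: 10, ordered: 'a', multidata: 'y');
   h.defineKey('k');
   h.defineDone();
   /* avoid uninitialized variable notes */
   call missing(k);
run;
```

The loop writes table sizes of 16, 256, 1024, and 1048576 for the four exponents.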
Details
To use a DATA step component object in your SAS program, you must declare and create (instantiate) the object. The DATA step component interface provides a mechanism for accessing the predefined component objects from within the DATA step.
If you use the _NEW_ operator to instantiate the component object, you must first use the DECLARE statement to declare the component object. For example, in the following lines of code, the DECLARE statement tells SAS that the object reference H is a hash object. The _NEW_ operator creates the hash object and assigns it to the object reference H.
   declare hash h();
   h = _new_ hash( );
Note: You can use the DECLARE statement to declare and instantiate a hash or hash iterator object in one step.
A constructor is a method that is used to instantiate a component object and to initialize the component object data. For example, in the following lines of code, the _NEW_ operator instantiates a hash object and assigns it to the object reference H. In addition, the data set WORK.KENNEL is loaded into the hash object.
   declare hash h();
   h = _new_ hash(dataset: "work.kennel");
For more information about the predefined DATA step component objects and constructors, see “Using DATA Step Component Objects” in SAS Language Reference: Concepts.
Comparisons You can use the DECLARE statement and the _NEW_ operator, or the DECLARE statement alone to declare and instantiate an instance of a hash or hash iterator object.
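The two forms can be placed side by side as in the following sketch. The object references H1 and H2 and the key variable K are hypothetical names. H1 is declared and instantiated by the DECLARE statement alone; H2 is declared first and then instantiated with the _NEW_ operator. Both are then used in the same way.

```sas
data _null_;
   length k 8;
   /* Form 1: DECLARE alone declares and instantiates the object */
   declare hash h1(ordered: 'a');
   /* Form 2: DECLARE declares the reference, _NEW_ instantiates it */
   declare hash h2;
   h2 = _new_ hash(ordered: 'a');
   h1.defineKey('k');
   h1.defineDone();
   h2.defineKey('k');
   h2.defineDone();
   k = 1;
   rc1 = h1.add();
   rc2 = h2.add();
   n1 = h1.num_items;
   n2 = h2.num_items;
   put n1= n2=;
run;
```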
Examples This example uses the _NEW_ operator to instantiate and initialize data for a hash object and instantiate a hash iterator object. The hash object is filled with data, and the iterator is used to retrieve the data in key order.
data kennel;
   input name $1-10 kenno $14-15;
   datalines;
Charlie      15
Tanner       07
Jake         04
Murphy       01
Pepe         09
Jacques      11
Princess Z   12
;
run;
data _null_;
   if _N_ = 1 then do;
      length kenno $2;
      length name $10;
      /* Declare the hash object */
      declare hash h();
      /* Instantiate and initialize the hash object */
      h = _new_ hash(dataset: "work.kennel", ordered: 'yes');
      /* Declare the hash iterator object */
      declare hiter iter;
      /* Instantiate the hash iterator object */
      iter = _new_ hiter('h');
      /* Define key and data variables */
      h.defineKey('kenno');
      h.defineData('name', 'kenno');
      h.defineDone();
      /* avoid uninitialized variable notes */
      call missing(kenno, name);
   end;
   /* Find the first key in the ordered hash object and output to the log */
   rc = iter.first();
   do while (rc = 0);
      put kenno ' ' name;
      rc = iter.next();
   end;
run;
The following lines are written to the SAS log:
Output 9.3 Output of Data Written in Key Order
NOTE: There were 7 observations read from the data set WORK.KENNEL.
01 Murphy
04 Jake
07 Tanner
09 Pepe
11 Jacques
12 Princess Z
15 Charlie
See Also Statements: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427 “Using DATA Step Component Objects” in SAS Language Reference: Concepts
NEXT Method Returns the next value in the underlying hash object. Applies to:
Hash iterator object
Syntax rc=object.NEXT( );
Arguments rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log. object
specifies the name of the hash iterator object.
Details
Use the NEXT method iteratively to traverse the hash object and return the data items in key order. The FIRST method returns the first data item in the hash object. You can use the PREV method to return the previous data item in the hash object.
Note: The NEXT method sets the data variable to the value of the data item so that it is available for use after the method call.
Note: If you call the NEXT method without calling the FIRST method, then the NEXT method will still start at the first item in the hash object.
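The second note can be illustrated with a loop that never calls FIRST. This is a minimal sketch; the key variable K and the five generated items are assumptions made for the illustration.

```sas
data _null_;
   length k 8;
   declare hash h(ordered: 'a');
   declare hiter it('h');
   h.defineKey('k');
   /* list the key as data so that the iterator sets K */
   h.defineData('k');
   h.defineDone();
   do k = 1 to 5;
      rc = h.add();
   end;
   /* no FIRST call: the first NEXT call starts at the first item */
   rc = it.next();
   do while (rc = 0);
      put k=;
      rc = it.next();
   end;
run;
```

The loop ends when NEXT returns a nonzero code after the final item.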
See Also Methods: “FIRST Method” on page 2046 “PREV Method” on page 2064 Operators: “_NEW_ Operator, Hash or Hash Iterator Object” on page 2053
Statements: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427 “Using the Hash Iterator Object” in SAS Language Reference: Concepts
NUM_ITEMS Attribute Returns the number of items in the hash object. Applies to:
Hash object
Syntax variable_name=object.NUM_ITEMS;
Arguments variable_name
specifies the name of the variable that contains the number of items in the hash object. object
specifies the name of the hash object.
Examples
This example creates a data set and loads the data set into a hash object. An item is added to the hash object and the total number of items in the resulting hash object is returned by the NUM_ITEMS attribute.
data work.stock;
   /* QTY is read as numeric to match its numeric definition in the next step */
   input item $1-10 qty 12-14;
   datalines;
broccoli   345
corn       389
potato     993
onion      730
;
data _null_;
   if _N_ = 1 then do;
      length item $10;
      length qty 8;
      length totalitems 8;
      /* Declare hash object and read STOCK data set */
      declare hash myhash(dataset: "work.stock");
      /* Define key and data variables */
      myhash.defineKey('item');
      myhash.defineData('qty');
      myhash.defineDone();
   end;
   /* Add a key and data value to the hash object */
   item = 'celery';
   qty = 183;
   rc = myhash.add();
   if (rc ne 0) then
      put 'Add failed';
   /* Use NUM_ITEMS to return updated number of items in hash object */
   totalitems = myhash.num_items;
   put totalitems=;
run;
totalitems=5 is written to the SAS log.
OUTPUT Method Creates one or more data sets, each of which contains the data in the hash object. Applies to:
Hash object
Syntax rc=object.OUTPUT(DATASET: ’dataset-1 <(datasetoption)>’ <..., DATASET: ’dataset-n <(datasetoption)>’>);
Arguments rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log. object
specifies the name of the hash object. DATASET: ’dataset’
specifies the name of the output data set. The name of the SAS data set can be a character literal or character variable. The data set name can also be enclosed in double quotation marks. When specifying the name of the output data set, you can use SAS data set options in the DATASET argument tag. Macro variables must be enclosed in double quotation marks. datasetoption
specifies a data set option. For complete information on how to specify data set options, see “Syntax” on page 10.
Details Hash object keys are not automatically stored as part of the output data set. The keys must be defined as data items by using the DEFINEDATA method to be included in the output data set.
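As a minimal sketch of this rule (the variable names K and D and the output data set WORK.WITH_KEY are hypothetical), the following step lists the key in the DEFINEDATA call so that it is written to the output data set together with the data variable.

```sas
data _null_;
   length k 8 d $8;
   declare hash h(ordered: 'a');
   h.defineKey('k');
   /* name the key as a data item so that it appears in the output */
   h.defineData('k', 'd');
   h.defineDone();
   k = 2; d = 'two'; rc = h.add();
   k = 1; d = 'one'; rc = h.add();
   rc = h.output(dataset: 'work.with_key');
run;
proc print data=work.with_key;
run;
```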
If you use the ordered: ’yes’ or ordered: ’ascending’ argument tag in the DECLARE statement or _NEW_ operator when you instantiate the hash object, then the data items are written to the data set in ascending key-value order. If you use the ordered: ’descending’ argument tag in the DECLARE statement or _NEW_ operator when you instantiate the hash object, then the data items are written to the data set in descending key-value order. If you do not use the ordered argument tag, the order is undefined.
When specifying the name of the output data set, you can use SAS data set options in the DATASET argument tag. Data set options specify actions that apply only to the SAS data set with which they appear. They let you perform the following operations:
- renaming variables
- selecting a subset of observations based on the observation number for processing
- selecting observations using the WHERE option
- dropping or keeping variables from a data set loaded into a hash object, or for an output data set that is specified in an OUTPUT method call
- specifying a password for a data set.
The following example uses the WHERE data set option to select specific data for the output data set named OUT:
data x;
   do i = 1 to 20;
      output;
   end;
run;
/* Using the WHERE option. */
data _null_;
   length i 8;
   dcl hash h();
   h.definekey(all: 'y');
   h.definedone();
   h.output(dataset: 'out (where =( i < 8))');
run;
The following example uses the RENAME data set option to rename the variable J to K for the output data set named OUT:
data x;
   do i = 1 to 20;
      output;
   end;
run;
/* Using the RENAME option. */
data _null_;
   length i j 8;
   dcl hash h();
   h.definekey(all: 'y');
   h.definedone();
   h.output(dataset: 'out (rename =(j=k))');
run;
For a list of data set options, see “Data Set Options by Category” on page 12. Note: When you use the OUTPUT method to create a data set, the hash object is not part of the output data set. In the following example, the H2 hash object will be omitted from the output data set.
data _null_;
   length k 8;
   length d $10;
   declare hash h2();
   declare hash h(ordered: 'y');
   h.defineKey('k');
   h.defineData('k', 'd', 'h2');
   h.defineDone();
   k = 99;
   d = 'abc';
   h.add();
   k = 199;
   d = 'def';
   h.add();
   h.output(dataset: 'work.x');
run;
Examples
Using the data set ASTRO that contains astronomical data, the following code creates a hash object with the Messier (OBJ) objects sorted in ascending order by their right-ascension (RA) values and uses the OUTPUT method to save the data to a data set.
data astro;
   input obj $1-4 ra $6-12 dec $14-19;
   datalines;
M31  00 42.7 +41 16
M71  19 53.8 +18 47
M51  13 29.9 +47 12
M98  12 13.8 +14 54
M13  16 41.7 +36 28
M39  21 32.2 +48 26
M81  09 55.6 +69 04
M100 12 22.9 +15 49
M41  06 46.0 -20 44
M44  08 40.1 +19 59
M10  16 57.1 -04 06
M57  18 53.6 +33 02
M3   13 42.2 +28 23
M22  18 36.4 -23 54
M23  17 56.8 -19 01
M49  12 29.8 +08 00
M68  12 39.5 -26 45
M17  18 20.8 -16 11
M14  17 37.6 -03 15
M29  20 23.9 +38 32
M34  02 42.0 +42 47
M82  09 55.8 +69 41
M59  12 42.0 +11 39
M74  01 36.7 +15 47
M25  18 31.6 -19 15
;
run;
data _null_;
   if _N_ = 1 then do;
      length obj $10;
      length ra $10;
      length dec $10;
      /* Read ASTRO data set as ordered */
      declare hash h(hashexp: 4, dataset: "work.astro", ordered: 'yes');
      /* Define variables RA and OBJ as key and data for hash object */
      h.defineKey('ra');
      h.defineData('ra', 'obj');
      h.defineDone();
      /* avoid uninitialized variable notes */
      call missing(ra, obj);
   end;
   /* Create output data set from hash object */
   rc = h.output(dataset: 'work.out');
run;
proc print data=work.out;
   var ra obj;
   title 'Messier Objects Sorted by Right-Ascension Values';
run;
Output 9.4 Messier Objects Sorted by Right-Ascension Values

Messier Objects Sorted by Right-Ascension Values

Obs    ra         obj
  1    00 42.7    M31
  2    01 36.7    M74
  3    02 42.0    M34
  4    06 46.0    M41
  5    08 40.1    M44
  6    09 55.6    M81
  7    09 55.8    M82
  8    12 13.8    M98
  9    12 22.9    M100
 10    12 29.8    M49
 11    12 39.5    M68
 12    12 42.0    M59
 13    13 29.9    M51
 14    13 42.2    M3
 15    16 41.7    M13
 16    16 57.1    M10
 17    17 37.6    M14
 18    17 56.8    M23
 19    18 20.8    M17
 20    18 31.6    M25
 21    18 36.4    M22
 22    18 53.6    M57
 23    19 53.8    M71
 24    20 23.9    M29
 25    21 32.2    M39
See Also Methods: “DEFINEDATA Method” on page 2033 Operators: “_NEW_ Operator, Hash or Hash Iterator Object” on page 2053 Statements: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427 “Saving Hash Object Data in a Data Set” in SAS Language Reference: Concepts
PREV Method Returns the previous value in the underlying hash object. Applies to:
Hash iterator object
Syntax rc=object.PREV( );
Arguments rc
specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log. object
specifies the name of the hash iterator object.
Details
Use the PREV method iteratively to traverse the hash object and return the data items in reverse key order. The FIRST method returns the first data item in the hash object. The LAST method returns the last data item in the hash object. You can use the NEXT method to return the next data item in the hash object.
Note: The PREV method sets the data variable to the value of the data item so that it is available for use after the method call.
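A reverse traversal can be sketched by starting at the last item and calling PREV until it returns a nonzero code. The key variable K and the generated items are hypothetical. With an ascending hash object, the keys should be written to the log from highest to lowest.

```sas
data _null_;
   length k 8;
   declare hash h(ordered: 'a');
   declare hiter it('h');
   h.defineKey('k');
   /* list the key as data so that the iterator sets K */
   h.defineData('k');
   h.defineDone();
   do k = 1 to 5;
      rc = h.add();
   end;
   /* start at the last item and step backward */
   rc = it.last();
   do while (rc = 0);
      put k=;
      rc = it.prev();
   end;
run;
```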
See Also Methods: “FIRST Method” on page 2046
“LAST Method” on page 2052 “NEXT Method” on page 2058 Operators: “_NEW_ Operator, Hash or Hash Iterator Object” on page 2053 Statements: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427 “Using the Hash Iterator Object” in SAS Language Reference: Concepts
REF Method
Consolidates the FIND and ADD methods into a single method call.
Applies to: Hash object

Syntax
rc=object.REF(<KEY: keyvalue-1,…, KEY: keyvalue-n, DATA: datavalue-1,…, DATA: datavalue-n>);

Arguments
rc
   specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
   specifies the name of the hash object.
KEY: keyvalue
   specifies the key value whose type must match the corresponding key variable that is specified in a DEFINEKEY method call. The number of “KEY: keyvalue” pairs depends on the number of key variables that you define by using the DEFINEKEY method.
DATA: datavalue
   specifies the data value whose type must match the corresponding data variable that is specified in a DEFINEDATA method call. The number of “DATA: datavalue” pairs depends on the number of data variables that you define by using the DEFINEDATA method.
Details
You can consolidate FIND and ADD methods into a single REF method. You can change the following code:

rc = h.find();
if (rc ne 0) then
   rc = h.add();

to

rc = h.ref();
The REF method is useful for counting the number of occurrences of each key in a hash object. The first time a key is referenced, REF performs an ADD, which initializes the key summary for that key. On each subsequent reference to the same key, REF performs a FIND, which updates the key summary.

Note: The REF method sets the data variable to the value of the data item so that it is available for use after the method call.

For more information about key summaries, see SAS Language Reference: Concepts.
Examples
The following example uses the REF method for key summaries:

data keys;
   input key;
   datalines;
1
2
1
3
5
2
3
2
4
1
5
1
;

data count;
   length count key 8;
   keep key count;
   if _n_ = 1 then do;
      declare hash myhash(suminc: "count", ordered: "y");
      declare hiter iter("myhash");
      myhash.defineKey('key');
      myhash.defineDone();
      count = 1;
   end;
   /* REF adds each new key and updates the summary of each repeated key */
   do while (not done);
      set keys end=done;
      rc = myhash.ref();
   end;
   /* write one observation per key with its summary value */
   rc = iter.first();
   do while(rc = 0);
      rc = myhash.sum(sum: count);
      output;
      rc = iter.next();
   end;
   stop;
run;
The following lines are written to the SAS log.

Output 9.5 Output of DATA Using the REF Method

Obs    count    key
 1       4       1
 2       3       2
 3       2       3
 4       1       4
 5       2       5
See Also Methods: “ADD Method” on page 2027 “FIND Method” on page 2040 “CHECK Method” on page 2029
REMOVE Method
Removes the data that is associated with the specified key from the hash object.
Applies to: Hash object

Syntax
rc=object.REMOVE(<KEY: keyvalue-1,…, KEY: keyvalue-n>);

Arguments
rc
   specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
   specifies the name of the hash object.
KEY: keyvalue
   specifies the key value whose type must match the corresponding key variable that is specified in a DEFINEKEY method call. The number of “KEY: keyvalue” pairs depends on the number of key variables that you define by using the DEFINEKEY method.
   Restriction: If an associated hash iterator is pointing to the keyvalue, then the REMOVE method will not remove the key or data from the hash object. An error message is issued.
Details
The REMOVE method deletes both the key and the data from the hash object. You can use the REMOVE method in one of two ways to remove the key and data in a hash object. You can specify the key, and then use the REMOVE method as shown in the following code:

data _null_;
   length k $8;
   length d $12;
   if _N_ = 1 then do;
      declare hash h();
      rc = h.defineKey('k');
      rc = h.defineData('d');
      rc = h.defineDone();
      /* avoid uninitialized variable notes */
      call missing(k, d);
   end;
   rc = h.add(key: 'Joyce', data: 'Ulysses');
   /* Specify the key */
   k = 'Joyce';
   /* Use the REMOVE method to remove the key and data */
   rc = h.remove();
   if (rc = 0) then
      put 'Key and data removed from the hash object.';
run;
Alternatively, you can use a shortcut and specify the key directly in the REMOVE method call as shown in the following code:

data _null_;
   length k $8;
   length d $12;
   if _N_ = 1 then do;
      declare hash h();
      rc = h.defineKey('k');
      rc = h.defineData('d');
      rc = h.defineDone();
      /* avoid uninitialized variable notes */
      call missing(k, d);
   end;
   rc = h.add(key: 'Joyce', data: 'Ulysses');
   rc = h.add(key: 'Homer', data: 'Iliad');
   /* Specify the key in the REMOVE method parameter */
   rc = h.remove(key: 'Homer');
   if (rc = 0) then
      put 'Key and data removed from the hash object.';
run;
Note: The REMOVE method does not modify the value of data variables. It only removes the value in the hash object.

Note: If you specify multidata:'y' in the hash object constructor, the REMOVE method will remove all data items for the specified key.
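The first note can be illustrated with a short sketch. It reuses the names h, k, and d from the code above and adds FIND and CHECK calls that are not in the original example. The idea is that the DATA step variable d keeps the value that FIND retrieved even after the item is removed, and a later CHECK for the same key fails.

```sas
data _null_;
   length k $8 d $12;
   declare hash h();
   rc = h.defineKey('k');
   rc = h.defineData('d');
   rc = h.defineDone();
   /* avoid uninitialized variable notes */
   call missing(k, d);

   rc = h.add(key: 'Joyce', data: 'Ulysses');

   /* FIND copies the stored data item into the data variable d */
   rc = h.find(key: 'Joyce');

   /* REMOVE deletes the key and data from the hash object only */
   rc = h.remove(key: 'Joyce');
   put 'after REMOVE: ' d=;

   /* CHECK returns a nonzero code because the key is gone */
   rc = h.check(key: 'Joyce');
   if (rc ne 0) then
      put 'Key is no longer in the hash object.';
run;
```

If a program needs the data variable cleared after a removal, it has to do that itself, for example with CALL MISSING.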
Examples
This example illustrates how to remove a key in the hash table.

/* Generate test data */
data x;
   do k = 65 to 70;
      d = byte (k);
      output;
   end;
run;

data _null_;
   length k 8 d $1;
   /* define the hash table and iterator */
   declare hash H (dataset:'x', ordered:'a');
   H.defineKey ('k');
   H.defineData ('k', 'd');
   H.defineDone ();
   call missing (k,d);
   declare hiter HI ('H');
   /* Use this logic to remove a key in the hash table
      when an iterator is pointing to that key */
   do while (hi.next() = 0);
      /* the iterator has moved on, so the saved key can now be removed */
      if flag then rc=h.remove(key:key);
      if d = 'C' then do;
         /* save the key; it cannot be removed while the iterator points to it */
         key=k;
         flag=1;
      end;
   end;
   rc = h.output(dataset: 'work.out');
   stop;
run;

proc print;
run;

The following output shows that the key and data for the third object (key=67, data=C) are deleted.
Output 9.6 Key and Data Removed from Output

The SAS System

Obs     k     d
 1     65     A
 2     66     B
 3     68     D
 4     69     E
 5     70     F
See Also Methods: “ADD Method” on page 2027 “DEFINEKEY Method” on page 2036 “REMOVEDUP Method” on page 2070 “Replacing and Removing Data” in SAS Language Reference: Concepts
REMOVEDUP Method
Removes the data that is associated with the specified key’s current data item from the hash object.
Applies to: Hash object

Syntax
rc=object.REMOVEDUP(<KEY: keyvalue-1,…, KEY: keyvalue-n>);

Arguments
rc
   specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
   specifies the name of the hash object.
KEY: keyvalue
   specifies the key value whose type must match the corresponding key variable that is specified in a DEFINEKEY method call. The number of “KEY: keyvalue” pairs depends on the number of key variables that you define by using the DEFINEKEY method.
   Restriction: If an associated hash iterator is pointing to the keyvalue, then the REMOVEDUP method will not remove the key or data from the hash object. An error message is issued.
Details
The REMOVEDUP method deletes both the key and the data from the hash object. You can use the REMOVEDUP method in one of two ways to remove the key and data in a hash object. You can specify the key, and then use the REMOVEDUP method. Alternatively, you can use a shortcut and specify the key directly in the REMOVEDUP method call.

Note: The REMOVEDUP method does not modify the value of data variables. It only removes the value in the hash object.

Note: If only one data item is in the key’s data item list, the key and data will be removed from the hash object.
Comparisons The REMOVEDUP method removes the data that is associated with the specified key’s current data item from the hash object. The REMOVE method removes the data that is associated with the specified key from the hash object.
Examples
This example creates a hash object where several keys have multiple data items. The last data item in the key is removed.

data testdup;
   length key data 8;
   input key data;
   datalines;
1 10
2 11
1 15
3 20
2 16
2 9
3 100
5 5
1 5
4 6
5 99
;

data _null_;
   length r 8;
   dcl hash h(dataset:'testdup', multidata: 'y', ordered: 'y');
   h.definekey('key');
   h.definedata('key', 'data');
   h.definedone();
   call missing (key, data);
   do key = 1 to 5;
      rc = h.find();
      if (rc = 0) then do;
         h.has_next(result: r);
         if (r ne 0) then do;
            /* move to the next item in the key's list and remove it */
            h.find_next();
            h.removedup();
         end;
      end;
   end;
   dcl hiter i('h');
   rc = i.first();
   do while (rc = 0);
      put key= data=;
      rc = i.next();
   end;
run;

The following lines are written to the SAS log.

Output 9.7 Last Data Item Removed from the Key

key=1 data=10
key=1 data=15
key=2 data=11
key=2 data=16
key=3 data=20
key=4 data=6
key=5 data=5
See Also Methods: “REMOVE Method” on page 2067 “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
REPLACE Method
Replaces the data that is associated with the specified key with new data.
Applies to: Hash object

Syntax
rc=object.REPLACE(<KEY: keyvalue-1,…, KEY: keyvalue-n, DATA: datavalue-1,…, DATA: datavalue-n>);

Arguments
rc
   specifies whether the method succeeded or failed. A return code of zero indicates success; a non-zero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
   specifies the name of the hash object.
KEY: keyvalue
   specifies the key value whose type must match the corresponding key variable that is specified in a DEFINEKEY method call. The number of “KEY: keyvalue” pairs depends on the number of key variables that you define by using the DEFINEKEY method.
DATA: datavalue
   specifies the data value whose type must match the corresponding data variable that is specified in a DEFINEDATA method call. The number of “DATA: datavalue” pairs depends on the number of data variables that you define by using the DEFINEDATA method.
Details
You can use the REPLACE method in one of two ways to replace data in a hash object. You can define the key and data item, and then use the REPLACE method as shown in the following code. In this example the data for the key 'Rottwlr' is changed from '1st' to '2nd'.

data work.show;
   input brd $1-10 plc $12-14;
   datalines;
Terrier    2nd
LabRetr    3rd
Rottwlr    1st
Collie     bis
ChinsCrstd 2nd
Newfnlnd   3rd
;

data _null_;
   length brd $12;
   length plc $8;
   if _N_ = 1 then do;
      declare hash h(dataset: 'work.show');
      rc = h.defineKey('brd');
      rc = h.defineData('plc');
      rc = h.defineDone();
   end;
   /* Specify the key and new data value */
   brd = 'Rottwlr';
   plc = '2nd';
   /* Call the REPLACE method to replace the data value */
   rc = h.replace();
run;
Alternatively, you can use a shortcut and specify the key and data directly in the REPLACE method call as shown in the following code:

data work.show;
   input brd $1-10 plc $12-14;
   datalines;
Terrier    2nd
LabRetr    3rd
Rottwlr    1st
Collie     bis
ChinsCrstd 2nd
Newfnlnd   3rd
;

data _null_;
   length brd $12;
   length plc $8;
   if _N_ = 1 then do;
      declare hash h(dataset: 'work.show');
      rc = h.defineKey('brd');
      rc = h.defineData('plc');
      rc = h.defineDone();
      /* avoid uninitialized variable notes */
      call missing(brd, plc);
   end;
   /* Specify the key and new data value in the REPLACE method */
   rc = h.replace(key: 'Rottwlr', data: '2nd');
run;
Note: If you call the REPLACE method and the key is not found, then the key and data are added to the hash object.

Note: The REPLACE method does not replace the value of the data variable with the value of the data item. It only replaces the value in the hash object.
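Both notes can be seen in a small sketch. It borrows the variable names brd and plc from the examples above but builds its own hash object instead of loading work.show; the specific keys and the variable n are assumptions made for illustration. The NUM_ITEMS attribute is used to show that a REPLACE call on a missing key adds an item.

```sas
data _null_;
   length brd $12 plc $8;
   declare hash h();
   rc = h.defineKey('brd');
   rc = h.defineData('plc');
   rc = h.defineDone();
   /* avoid uninitialized variable notes */
   call missing(brd, plc);

   rc = h.add(key: 'Rottwlr', data: '1st');

   /* the key exists, so the stored data changes from '1st' to '2nd' */
   rc = h.replace(key: 'Rottwlr', data: '2nd');

   /* the key does not exist, so REPLACE adds the key and data */
   rc = h.replace(key: 'Collie', data: 'bis');

   /* REPLACE did not touch the data variable plc */
   put 'before FIND: ' plc=;

   /* FIND copies the replaced value into plc */
   rc = h.find(key: 'Rottwlr');
   put 'after FIND: ' plc=;

   /* number of items now stored in the hash object */
   n = h.num_items;
   put 'items in hash object: ' n;
run;
```

When an update must not create new keys, one option is to call the CHECK method first and call REPLACE only if CHECK returns zero.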
Comparisons The REPLACE method replaces the data that is associated with the specified key with new data. The REPLACEDUP method replaces the data that is associated with the current key’s current data item with new data.
See Also Methods: “DEFINEDATA Method” on page 2033 “DEFINEKEY Method” on page 2036 “REPLACEDUP Method” on page 2075 “Replacing and Removing Data” in SAS Language Reference: Concepts
REPLACEDUP Method
Replaces the data that is associated with the current key’s current data item with new data.
Applies to: Hash object

Syntax
rc=object.REPLACEDUP(<DATA: datavalue-1,…, DATA: datavalue-n>);

Arguments
rc
   specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
   specifies the name of the hash object.
DATA: datavalue
   specifies the data value whose type must match the corresponding data variable that is specified in a DEFINEDATA method call. The number of “DATA: datavalue” pairs depends on the number of data variables that you define by using the DEFINEDATA method for the current key.
Details
You can use the REPLACEDUP method in one of two ways to replace data in a hash object. You can define the data item, and then use the REPLACEDUP method. Alternatively, you can use a shortcut and specify the data directly in the REPLACEDUP method call.

Note: If you call the REPLACEDUP method and the key is not found, then the key and data are added to the hash object.

Note: The REPLACEDUP method does not replace the value of the data variable with the value of the data item. It only replaces the value in the hash object.
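The two calling styles might look like the following sketch. The hash object, its single data variable, and the values 10, 15, and 999 are assumptions chosen for illustration. Because the order of multiple data items within a key is not guaranteed, the sketch does not state which stored value ends up with which replacement.

```sas
data _null_;
   length key data r 8;
   /* hypothetical hash object that allows multiple data items per key */
   dcl hash h(multidata: 'y');
   h.definekey('key');
   h.definedata('data');
   h.definedone();
   call missing (key, data);

   rc = h.add(key: 1, data: 10);
   rc = h.add(key: 1, data: 15);

   /* first form: set the data variable, then call REPLACEDUP */
   key = 1;
   rc = h.find();              /* current item: first item in the list for key 1 */
   data = data + 300;
   rc = h.replacedup();

   /* shortcut form: pass the new data value in the method call */
   rc = h.find_next();         /* current item: next item in the list for key 1 */
   if (rc = 0) then
      rc = h.replacedup(data: 999);

   /* list the items that are now stored for key 1 */
   rc = h.find();
   do while (rc = 0);
      put key= data=;
      h.has_next(result: r);
      if (r ne 0) then rc = h.find_next();
      else rc = 1;              /* no more items: leave the loop */
   end;
run;
```

In both forms the replacement applies to the current data item, so a FIND or FIND_NEXT call has to establish that item first.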
Comparisons The REPLACEDUP method replaces the data that is associated with the current key’s current data item with new data. The REPLACE method replaces the data that is associated with the specified key with new data.
Examples
This example creates a hash object where several keys have multiple data items. When a duplicate data item is found, 300 is added to the value of the data item.

data testdup;
   length key data 8;
   input key data;
   datalines;
1 10
2 11
1 15
3 20
2 16
2 9
3 100
5 5
1 5
4 6
5 99
;

data _null_;
   length r 8;
   dcl hash h(dataset:'testdup', multidata: 'y', ordered: 'y');
   h.definekey('key');
   h.definedata('key', 'data');
   h.definedone();
   call missing (key, data);
   do key = 1 to 5;
      rc = h.find();
      if (rc = 0) then do;
         put key= data=;
         h.has_next(result: r);
         do while(r ne 0);
            rc = h.find_next();
            put 'dup ' key= data;
            /* add 300 to the duplicate and store it back */
            data = data + 300;
            rc = h.replacedup();
            h.has_next(result: r);
         end;
      end;
   end;
   put 'iterating...';
   dcl hiter i('h');
   rc = i.first();
   do while (rc = 0);
      put key= data=;
      rc = i.next();
   end;
run;

The following lines are written to the SAS log.

Output 9.8 Output Showing Alteration of Duplicate Data Items

key=1 data=10
dup key=1 5
dup key=1 15
key=2 data=11
dup key=2 9
dup key=2 16
key=3 data=20
dup key=3 100
key=4 data=6
key=5 data=5
dup key=5 99
iterating...
key=1 data=10
key=1 data=305
key=1 data=315
key=2 data=11
key=2 data=309
key=2 data=316
key=3 data=20
key=3 data=400
key=4 data=6
key=5 data=5
key=5 data=399
See Also Methods: “REPLACE Method” on page 2072 “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
SETCUR Method
Specifies a starting key item for iteration.
Applies to: Hash iterator object

Syntax
rc=object.SETCUR(KEY: 'keyvalue-1');

Arguments
rc
   specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
   specifies the name of the hash iterator object.
KEY: 'keyvalue'
   specifies a key value as the starting key for the iteration.
Details The hash iterator enables you to start iteration on any item in the hash object. The SETCUR method sets the starting key for iteration. You use the KEY option to specify the starting item.
Examples
The following example creates a data set that contains astronomical data. You want to start iteration at RA= 18 31.6 instead of the first or last items. The data is loaded into a hash object and the SETCUR method is used to start the iteration. Because the ordered argument tag was set to YES, the output is sorted in ascending order.

data work.astro;
   input obj $1-4 ra $6-12 dec $14-19;
   datalines;
M31  00 42.7 +41 16
M71  19 53.8 +18 47
M51  13 29.9 +47 12
M98  12 13.8 +14 54
M13  16 41.7 +36 28
M39  21 32.2 +48 26
M81  09 55.6 +69 04
M100 12 22.9 +15 49
M41  06 46.0 -20 44
M44  08 40.1 +19 59
M10  16 57.1 -04 06
M57  18 53.6 +33 02
M3   13 42.2 +28 23
M22  18 36.4 -23 54
M23  17 56.8 -19 01
M49  12 29.8 +08 00
M68  12 39.5 -26 45
M17  18 20.8 -16 11
M14  17 37.6 -03 15
M29  20 23.9 +38 32
M34  02 42.0 +42 47
M82  09 55.8 +69 41
M59  12 42.0 +11 39
M74  01 36.7 +15 47
M25  18 31.6 -19 15
;

The following code sets the starting key for iteration to '18 31.6':

data _null_;
   length obj $10;
   length ra $10;
   length dec $10;
   declare hash myhash(hashexp: 4, dataset:"work.astro", ordered:"yes");
   declare hiter iter('myhash');
   myhash.defineKey('ra');
   myhash.defineData('obj', 'ra');
   myhash.defineDone();
   call missing (ra, obj, dec);
   /* position the iterator on the key '18 31.6', then move forward */
   rc = iter.setcur(key: '18 31.6');
   do while (rc = 0);
      put obj= ra=;
      rc = iter.next();
   end;
run;

The following lines are written to the SAS log.

Output 9.9 Output Showing Starting Key of 18 31.6

obj=M25 ra=18 31.6
obj=M22 ra=18 36.4
obj=M57 ra=18 53.6
obj=M71 ra=19 53.8
obj=M29 ra=20 23.9
obj=M39 ra=21 32.2
You can use the FIRST method or the LAST method to start iteration on the first item or the last item, respectively.
See Also Methods: “FIRST Method” on page 2046 “LAST Method” on page 2052 Operators: “_NEW_ Operator, Hash or Hash Iterator Object” on page 2053 Statements: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427 “Using the Hash Iterator Object” in SAS Language Reference: Concepts
SUM Method
Retrieves the summary value for a given key from the hash table and stores the value in a DATA step variable.
Applies to: Hash object

Syntax
rc=object.SUM(SUM: variable-name);

Arguments
rc
   specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, then an appropriate error message is written to the log.
object
   specifies the name of the hash object.
SUM: variable-name
   specifies a DATA step variable that stores the current summary value of a given key.
Details You use the SUM method to retrieve key summaries from the hash object. For more information, see “Maintaining Key Summaries” in SAS Language Reference: Concepts.
Comparisons The SUM method retrieves the summary value for a given key when only one data item exists per key. The SUMDUP method retrieves the summary value for the current data item of the current key when more than one data item exists for a key.
Examples
The following example uses the SUM method to retrieve the key summary for each given key, K=99 and K=100.

data _null_;
   /* Setup assumed: the original fragment does not show how h is declared.
      A hash object with key k and SUMINC: 'count' is assumed here. */
   length k count total 8;
   declare hash h(suminc: 'count');
   h.defineKey('k');
   h.defineDone();

   k = 99;
   count = 1;
   h.add();
   /* key=99 summary is now 1 */
   k = 100;
   h.add();
   /* key=100 summary is now 1 */
   k = 99;
   h.find();
   /* key=99 summary is now 2 */
   count = 2;
   h.find();
   /* key=99 summary is now 4 */
   k = 100;
   h.find();
   /* key=100 summary is now 3 */
   h.sum(sum: total);
   put 'total for key 100 = ' total;
   k = 99;
   h.sum(sum:total);
   put 'total for key 99 = ' total;
run;

The first PUT statement prints the summary for k=100:

total for key 100 = 3

The second PUT statement prints the summary for k=99:

total for key 99 = 4
See Also Methods: “ADD Method” on page 2027 “FIND Method” on page 2040 “CHECK Method” on page 2029 “REF Method” on page 2065 “SUMDUP Method” on page 2081 Operators: “_NEW_ Operator, Hash or Hash Iterator Object” on page 2053 Statements: “DECLARE Statement, Hash and Hash Iterator Objects” on page 1427
SUMDUP Method
Retrieves the summary value for the current data item of the current key and stores the value in a DATA step variable.
Applies to: Hash object

Syntax
rc=object.SUMDUP(SUM: variable-name);

Arguments
rc
   specifies whether the method succeeded or failed. A return code of zero indicates success; a nonzero value indicates failure. If you do not supply a return code variable for the method call and the method fails, an appropriate error message is printed to the log.
object
   specifies the name of the hash object.
SUM: variable-name
   specifies a DATA step variable that stores the summary value for the current data item of the current key.
Details You use the SUMDUP method to retrieve key summaries from the hash object when a key has multiple data items. For more information, see “Maintaining Key Summaries” in SAS Language Reference: Concepts.
Comparisons The SUMDUP method retrieves the summary value for the current data item of the current key when more than one data item exists for a key. The SUM method retrieves the summary value for a given key when only one data item exists per key.
Example
The following example uses the SUMDUP method to retrieve the summary value for the current data item. It also illustrates that it is possible to loop backward through the list by using the HAS_PREV and FIND_PREV methods. The FIND_PREV method works similarly to the FIND_NEXT method with respect to the current list item except that it moves backward through the multiple item list.

data dup;
   length key data 8;
   input key data;
   cards;
1 10
2 11
1 15
3 20
2 16
2 9
3 100
5 5
1 5
4 6
5 99
;

data _null_;
   length r i sum 8;
   i = 0;
   dcl hash h(dataset:'dup', multidata: 'y', suminc: 'i');
   h.definekey('key');
   h.definedata('key', 'data');
   h.definedone();
   call missing (key, data);
   /* first pass: each FIND, FIND_NEXT, and FIND_PREV adds i=1 to a summary */
   i = 1;
   do key = 1 to 5;
      rc = h.find();
      if (rc = 0) then do;
         h.has_next(result: r);
         do while(r ne 0);
            rc = h.find_next();
            rc = h.find_prev();
            rc = h.find_next();
            h.has_next(result: r);
         end;
      end;
   end;
   /* second pass: i=0, so the summaries are read without being changed */
   i = 0;
   do key = 1 to 5;
      rc = h.find();
      if (rc = 0) then do;
         h.sum(sum: sum);
         put key= data= sum=;
         h.has_next(result: r);
         do while(r ne 0);
            rc = h.find_next();
            h.sumdup(sum: sum);
            put 'dup ' key= data= sum=;
            h.has_next(result: r);
         end;
      end;
   end;
run;
The following lines are written to the SAS log.

Output 9.10 Key Summary

key=1 data=10 sum=2
dup key=1 data=5 sum=3
dup key=1 data=15 sum=2
key=2 data=11 sum=2
dup key=2 data=9 sum=3
dup key=2 data=16 sum=2
key=3 data=20 sum=2
dup key=3 data=100 sum=2
key=4 data=6 sum=1
key=5 data=5 sum=2
dup key=5 data=99 sum=2

To see how this works, consider the key 1, which has three data values: 10, 5, and 15 (which are stored in that order).

key=1 data=10 sum=2
dup key=1 data=5 sum=3
dup key=1 data=15 sum=2
When traveling through the data list in the loop, the key summary for 10 is set to 1 on the initial FIND method call. The first FIND_NEXT method call sets the key summary for 5 to 1. The next FIND_PREV method call moves back to the data value 10 and increments its key summary to 2. Finally, the last call to the FIND_NEXT method increments the key summary for 5 to 2. The next iteration through the loop sets the key summary for 15 to 1 and the key summary for 5 to 3 (because 5 is stored before 15 in the list). Finally, the key summary for 15 is incremented to 2. This processing results in the output for key 1 as shown in Output 9.10.

Note that you do not call the HAS_PREV method before calling the FIND_PREV method in this example because you already know that there is a previous entry in the list; otherwise, you would not have gotten into the loop.

This example illustrates that there is no guaranteed order for multiple data items for a given key because they all have the same key. SAS cannot sort on the key. The order in the list (10, 5, 15) does not match the order that the items were added. Also shown here is the necessity of having special methods for some duplicate operations (in this case, the SUMDUP method works similarly to the SUM method by retrieving the key summary for the current list item).
See Also Methods: “SUM Method” on page 2079 “Non-Unique Key and Data Pairs” in SAS Language Reference: Concepts
CHAPTER 10
Java Object Language Elements

Java Object Methods by Category 2085
Dictionary 2086
   CALLtypeMETHOD Method 2086
   CALLSTATICtypeMETHOD Method 2089
   DECLARE Statement, Java Object 2091
   DELETE Method 2091
   EXCEPTIONCHECK Method 2092
   EXCEPTIONCLEAR Method 2093
   EXCEPTIONDESCRIBE Method 2096
   FLUSHJAVAOUTPUT Method 2097
   GETtypeFIELD Method 2099
   GETSTATICtypeFIELD Method 2101
   SETtypeFIELD Method 2105
   SETSTATICtypeFIELD Method 2107
Java Object Methods by Category
There are five categories of Java object methods.

Table 10.1 Java Object Methods by Category

Category            Description
Deletion            enables you to delete a Java object.
Exception           enables you to gather information about and clear an exception.
Field reference     enables you to return or set the value of static and non-static instance fields of the Java object.
Method reference    enables you to access static and non-static Java methods.
Output              enables you to send the Java output to its destination immediately.
Table 10.2 on page 2086 provides brief descriptions of the Java object methods. For more detailed descriptions, see the dictionary entry for each method.
Table 10.2 Categories and Descriptions of Java Object Language Elements

Category | Java Object Language Elements | Description
Deletion | “DELETE Method” on page 2091 | Deletes the Java object.
Exception | “EXCEPTIONCHECK Method” on page 2092 | Determines whether an exception occurred during a method call.
Exception | “EXCEPTIONCLEAR Method” on page 2093 | Clears any exception that is currently being thrown.
Exception | “EXCEPTIONDESCRIBE Method” on page 2096 | Turns the exception debug logging on or off and prints exception information.
Field reference | “GETtypeFIELD Method” on page 2099 | Returns the value of a non-static field for a Java object.
Field reference | “GETSTATICtypeFIELD Method” on page 2101 | Returns the value of a static field for a Java object.
Field reference | “SETtypeFIELD Method” on page 2105 | Modifies the value of a non-static field for a Java object.
Field reference | “SETSTATICtypeFIELD Method” on page 2107 | Modifies the value of a static field for a Java object.
Method reference | “CALLtypeMETHOD Method” on page 2086 | Invokes an instance method on a Java object from a non-static Java method.
Method reference | “CALLSTATICtypeMETHOD Method” on page 2089 | Invokes an instance method on a Java object from a static Java method.
Output | “FLUSHJAVAOUTPUT Method” on page 2097 | Specifies that the Java output is sent to its destination.
Dictionary
CALLtypeMETHOD Method
Invokes an instance method on a Java object from a non-static Java method.
Category: Method reference
Applies to: Java object

Syntax
object.CALLtypeMETHOD ("method-name", <method-argument-1, …, method-argument-n>, <return-value>);

Arguments
object
   specifies the name of the Java object.
type
   specifies the result type for the non-static Java method. The type can be one of the following values:
   BOOLEAN
      specifies that the result type is BOOLEAN.
   BYTE
      specifies that the result type is BYTE.
   CHAR
      specifies that the result type is CHAR.
   DOUBLE
      specifies that the result type is DOUBLE.
   FLOAT
      specifies that the result type is FLOAT.
   INT
      specifies that the result type is INT.
   LONG
      specifies that the result type is LONG.
   SHORT
      specifies that the result type is SHORT.
   STRING
      specifies that the result type is STRING.
   VOID
      specifies that the result type is VOID.
   See Also: “Type Issues” in SAS Language Reference: Concepts
method-name
   specifies the name of the non-static Java method.
   Requirement: The method name must be enclosed in either single or double quotation marks.
method-argument
   specifies the parameters to pass to the method.
return-value
   specifies the return value if the method returns one.
Details
Once you instantiate a Java object, you can access any non-static Java method through method calls on the Java object by using the CALLtypeMETHOD method.

Note: The type argument represents a Java data type. For more information about how Java data types relate to SAS data types, see “Type Issues” in SAS Language Reference: Concepts.
Comparisons Use the CALLtypeMETHOD method for non-static Java methods. If the Java method is static, use the CALLSTATICtypeMETHOD method.
Example
The following example creates a simple class that contains three non-static fields. The Java object j is instantiated, and then the field values are set by using the SETtypeFIELD method and retrieved by using the CALLtypeMETHOD method.
/* Java code */
import java.util.*;
import java.lang.*;

public class ttest
{
   public int i;
   public double d;
   public String s;

   public int im()
   {
      return i;
   }

   public String sm()
   {
      return s;
   }

   public double dm()
   {
      return d;
   }
}
/* DATA step code */
data _null_;
   dcl javaobj j("ttest");
   length val 8;
   length str $20;
   /* set the three non-static fields */
   j.setIntField("i", 100);
   j.setDoubleField("d", 3.14159);
   j.setStringField("s", "abc");
   /* retrieve each field through its accessor method */
   j.callIntMethod("im", val);
   put val=;
   j.callDoubleMethod("dm", val);
   put val=;
   j.callStringMethod("sm", str);
   put str=;
run;
The following lines are written to the SAS log:
val=100 val=3.14159 str=abc
See Also Method: “CALLSTATICtypeMETHOD Method” on page 2089
CALLSTATICtypeMETHOD Method
Invokes an instance method on a Java object from a static Java method.
Category: Method reference
Applies to: Java object

Syntax
object.CALLSTATICtypeMETHOD ("method-name", <method-argument-1, …, method-argument-n>, <return-value>);

Arguments
object
   specifies the name of the Java object.
type
   specifies the result type for the static Java method. The type can be one of the following values:
   BOOLEAN
      specifies that the result type is BOOLEAN.
   BYTE
      specifies that the result type is BYTE.
   CHAR
      specifies that the result type is CHAR.
   DOUBLE
      specifies that the result type is DOUBLE.
   FLOAT
      specifies that the result type is FLOAT.
   INT
      specifies that the result type is INT.
   LONG
      specifies that the result type is LONG.
   SHORT
      specifies that the result type is SHORT.
   STRING
      specifies that the result type is STRING.
   VOID
      specifies that the result type is VOID.
   See Also: “Type Issues” in SAS Language Reference: Concepts
method-name
   specifies the name of the static Java method.
   Requirement: The method name must be enclosed in either single or double quotation marks.
method-argument
   specifies the parameters to pass to the method.
return-value
   specifies the return value if the method returns one.
Details
Once you instantiate a Java object, you can access any static Java method through method calls on the Java object by using the CALLSTATICtypeMETHOD method.

Note: The type argument represents a Java data type. For more information about how Java data types relate to SAS data types, see “Type Issues” in SAS Language Reference: Concepts.
Comparisons Use the CALLSTATICtypeMETHOD method for static Java methods. If the Java method is not static, use the CALLtypeMETHOD method.
Example
The following example creates a simple class that contains a static field and a static method. The Java object j is instantiated, and then the field value is set by using the SETSTATICtypeFIELD method and retrieved by using the CALLSTATICtypeMETHOD method.
/* Java code */
import java.util.*;
import java.lang.*;

public class ttestc
{
   public static double d;

   public static double dm()
   {
      return d;
   }
}

/* DATA step code */
data x;
   declare javaobj j("ttestc");
   length d 8;
   /* set the static field, then read it back through the static method */
   j.SetStaticDoubleField("d", 3.14159);
   j.callStaticDoubleMethod("dm", d);
   put d=;
run;
The following line is written to the SAS log:
d=3.14159
See Also Method: “CALLtypeMETHOD Method” on page 2086
DECLARE Statement, Java Object
Declares a Java object; creates an instance of and initializes data for a Java object.
Valid in: DATA step
Applies to: Java object
See: “DECLARE Statement, Java Object” on page 1434
DELETE Method Deletes the Java object. Category: Deletion Applies to:
Java object
Syntax object.DELETE( );
Arguments object
specifies the name of the Java object.
Details DATA step component objects are deleted automatically at the end of the DATA step. If you want to reuse the object reference variable in another Java object constructor, you should delete the Java object by using the DELETE method. If you attempt to use a Java object after you delete it, you will receive an error in the log.
EXCEPTIONCHECK Method
Determines whether an exception occurred during a method call.
Category: Exception
Applies to: Java object
Syntax object.EXCEPTIONCHECK(status);
Arguments object
specifies the name of the Java object. status
specifies the exception status that is returned. Tip: The status value that is returned by Java is of type DOUBLE, which corresponds to a SAS numeric data value.
Details Java exceptions are handled through the EXCEPTIONCHECK, EXCEPTIONCLEAR, and EXCEPTIONDESCRIBE methods. The EXCEPTIONCHECK method is used to determine whether an exception occurred during a method call. Ideally, the EXCEPTIONCHECK method should be called after every call to a Java method that can throw an exception.
Example
In the following example, the Java class contains a method that throws an exception. The DATA step calls the method and checks for an exception.

/* Java code */
public class a
{
   public void m() throws NullPointerException
   {
      throw new NullPointerException();
   }
}

/* DATA step code */
data _null_;
   length e 8;
   dcl javaobj j('a');
   rc = j.callvoidmethod('m');
   /* Check for exception. Value is returned in variable 'e' */
   rc = j.exceptioncheck(e);
   if (e) then
      put 'exception';
   else
      put 'no exception';
run;
The following line is written to the SAS log:
exception
See Also Method: “EXCEPTIONCLEAR Method” on page 2093 “EXCEPTIONDESCRIBE Method” on page 2096
EXCEPTIONCLEAR Method
Clears any exception that is currently being thrown.
Category: Exception
Applies to: Java object
Syntax object.EXCEPTIONCLEAR( );
Arguments object
specifies the name of the Java object.
Details Java exceptions are handled through the EXCEPTIONCHECK, EXCEPTIONCLEAR, and EXCEPTIONDESCRIBE methods. If you call a method that throws an exception, it is strongly recommended that you check for an exception after the call. If an exception was thrown, you should take appropriate action and then clear the exception by using the EXCEPTIONCLEAR method. If no exception is currently being thrown, this method has no effect.
Examples
Example 1: Checking and Clearing an Exception
In the following example, the Java class contains a method that throws an exception. The method is called in the DATA step, and the exception is cleared.

/* Java code */
public class a
{
   public void m() throws NullPointerException
   {
      throw new NullPointerException();
   }
}

/* DATA step code */
data _null_;
   length e 8;
   dcl javaobj j('a');
   rc = j.callvoidmethod('m');
   /* Check for exception. Value is returned in variable 'e' */
   rc = j.exceptioncheck(e);
   if (e) then
      put 'exception';
   else
      put 'no exception';
   /* Clear the exception and check it again */
   rc = j.exceptionclear( );
   rc = j.exceptioncheck(e);
   if (e) then
      put 'exception';
   else
      put 'no exception';
run;

The following lines are written to the SAS log:

exception
no exception
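
The check-then-clear sequence can also be pictured in plain Java. The following sketch is an analogy only and makes no claim about how SAS implements these methods: the class PendingException and its methods call, check, and clear are hypothetical names. The helper remembers an exception instead of propagating it, reports whether one is pending, and discards it on request.

```java
// Hypothetical helper that mimics the check/clear pattern in plain Java.
class PendingException {
    private Throwable pending;   // null means that no exception is pending

    // Run a call and remember any runtime exception instead of propagating it.
    void call(Runnable method) {
        try {
            method.run();
        } catch (RuntimeException ex) {
            pending = ex;
        }
    }

    // In the spirit of EXCEPTIONCHECK: report whether an exception is pending.
    boolean check() {
        return pending != null;
    }

    // In the spirit of EXCEPTIONCLEAR: discard the pending exception, if any.
    void clear() {
        pending = null;
    }

    public static void main(String[] args) {
        PendingException state = new PendingException();
        state.call(() -> { throw new NullPointerException(); });
        System.out.println(state.check() ? "exception" : "no exception");
        state.clear();
        System.out.println(state.check() ? "exception" : "no exception");
    }
}
```

As in the DATA step example, the first check reports an exception and the check after clearing does not. Clearing when nothing is pending leaves the state unchanged.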
Example 2: Checking for an Exception When Reading an External File
In this example, the Java IO classes are used to read an external file from the DATA step. The Java code creates a wrapper class for DataInputStream, which enables you to pass a FileInputStream to the constructor. The wrapper is necessary because the constructor actually takes an InputStream, which is the parent of FileInputStream, and the current method lookup is not robust enough to perform the superclass lookup.

/* Java code */
public class myDataInputStream extends java.io.DataInputStream
{
   myDataInputStream(java.io.FileInputStream fi)
   {
      super(fi);
   }
}

After you create the wrapper class, you can use it to create a DataInputStream for an external file and read the file until the end-of-file is reached. The EXCEPTIONCHECK method is used to determine when the readInt method throws an EOFException, which enables you to end the input loop.

/* DATA step code */
data _null_;
   length d e 8;
   dcl javaobj f("java/io/File", "c:\temp\binint.txt");
   dcl javaobj fi("java/io/FileInputStream", f);
   dcl javaobj di("myDataInputStream", fi);
   do while(1);
      di.callIntMethod("readInt", d);
      di.ExceptionCheck(e);
      if (e) then
         leave;
      else
         put d=;
   end;
run;
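
The same read-until-end-of-file loop can be written entirely in Java, which may help when you prepare or verify the input file. This is a minimal sketch under stated assumptions: the class ReadIntsUntilEof and its helper methods are hypothetical, and a temporary file stands in for c:\temp\binint.txt. The readInt method expects 4-byte binary integers, so the sketch writes the test data with writeInt.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadIntsUntilEof {
    // Create a temporary file that stands in for the external file.
    static File tempFile() {
        try {
            File file = File.createTempFile("binint", ".bin");
            file.deleteOnExit();
            return file;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Write the values as 4-byte binary integers, the format that readInt expects.
    static void writeInts(File file, int... values) {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
            for (int value : values) {
                out.writeInt(value);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Read integers until readInt signals end-of-file by throwing EOFException.
    static List<Integer> readInts(File file) {
        List<Integer> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            while (true) {
                try {
                    result.add(in.readInt());
                } catch (EOFException eof) {
                    break;   // same role as "if (e) then leave;" in the DATA step
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return result;
    }

    public static void main(String[] args) {
        File file = tempFile();
        writeInts(file, 1, 2, 3);
        System.out.println(readInts(file));
    }
}
```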
See Also Method: “EXCEPTIONCHECK Method” on page 2092 “EXCEPTIONDESCRIBE Method” on page 2096
EXCEPTIONDESCRIBE Method
Turns the exception debug logging on or off and prints exception information.
Category: Exception
Applies to: Java object
Syntax object.EXCEPTIONDESCRIBE(status);
Arguments object
specifies the name of the Java object. status
specifies whether exception debug logging is on or off. The status argument can be one of the following values:
   0   specifies that debug logging is off.
   1   specifies that debug logging is on.
Default: 0 (off)
Tip: The status value that is returned by Java is of type DOUBLE, which corresponds to a SAS numeric data value.
Details
The EXCEPTIONDESCRIBE method is used to turn exception debug logging on or off. If exception debug logging is on, exception information is printed to the JVM standard output.
Note: By default, JVM standard output is redirected to the SAS log.
Example
In the following example, exception information is printed to the standard output.

/* Java code */
public class a
{
   public void m() throws NullPointerException
   {
      throw new NullPointerException();
   }
}

/* DATA step code */
data _null_;
   length e 8;
   dcl javaobj j('a');
   j.exceptiondescribe(1);
   rc = j.callvoidmethod('m');
run;
The following lines are written to the SAS log:
java.lang.NullPointerException
        at a.m(a.java:5)
See Also Method: “EXCEPTIONCHECK Method” on page 2092 “EXCEPTIONCLEAR Method” on page 2093
FLUSHJAVAOUTPUT Method
Specifies that the Java output is sent to its destination.
Category: Output
Applies to: Java object
Syntax object.FLUSHJAVAOUTPUT( );
Arguments
object
specifies the name of the Java object.
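Details
Java output that is directed to the SAS log is flushed when the DATA step terminates. As a result, unless you use the FLUSHJAVAOUTPUT method, the Java output appears after any output that was issued while the DATA step was running. When you call the FLUSHJAVAOUTPUT method, the pending Java output is sent to its destination at that point, so that it appears in the order of execution.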
Example
In the following example, the “In Java class” lines are written after the DATA step is complete.

/* Java code */
public class p
{
   void p()
   {
      System.out.println("In Java class");
   }
}

/* DATA step code */
data _null_;
   dcl javaobj j('p');
   do i = 1 to 3;
      j.callVoidMethod('p');
      put 'In DATA Step';
   end;
run;
The following lines are written to the SAS log:
In DATA Step
In DATA Step
In DATA Step
In Java class
In Java class
In Java class
If you use the FLUSHJAVAOUTPUT method, the Java output is written to the SAS log in the order of execution.
/* DATA step code */
data _null_;
   dcl javaobj j('p');
   do i = 1 to 3;
      j.callVoidMethod('p');
      j.flushJavaOutput();
      put 'In DATA Step';
   end;
run;
The following lines are written to the SAS log:
In Java class
In DATA Step
In Java class
In DATA Step
In Java class
In DATA Step
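
The ordering effect in the two logs is typical of buffered output in general, and it can be reproduced in plain Java. The following sketch is a model only and does not describe how SAS routes Java output: the class FlushOrderDemo is hypothetical, a byte array stands in for the SAS log, an unbuffered stream stands in for PUT statements, and a buffered stream stands in for Java standard output.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class FlushOrderDemo {
    // Run three iterations and return what reached the shared "log", in order.
    static String run(boolean flushEachIteration) {
        ByteArrayOutputStream log = new ByteArrayOutputStream();   // stands in for the SAS log
        PrintStream stepOut = new PrintStream(log, true);           // reaches the log immediately
        PrintStream javaOut =
            new PrintStream(new BufferedOutputStream(log, 8192), false);  // held in a buffer

        for (int i = 1; i <= 3; i++) {
            javaOut.print("In Java class\n");
            if (flushEachIteration) {
                javaOut.flush();   // push the buffered text to the log now
            }
            stepOut.print("In DATA Step\n");
        }
        javaOut.flush();   // end of the step: anything still buffered is written
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.print(run(false));
        System.out.println("---");
        System.out.print(run(true));
    }
}
```

Without the per-iteration flush, the buffered lines arrive together at the end. With it, the two kinds of lines alternate in the order in which they were produced.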
See Also “Java Standard Output” in SAS Language Reference: Concepts
GETtypeFIELD Method
Returns the value of a non-static field for a Java object.
Category: Field reference
Applies to: Java object
Syntax object.GETtypeFIELD("field-name", value);
Arguments
object
specifies the name of a Java object. type
specifies the type for the Java field. The type can be one of the following values:
   BOOLEAN   specifies that the field type is BOOLEAN.
   BYTE      specifies that the field type is BYTE.
   CHAR      specifies that the field type is CHAR.
   DOUBLE    specifies that the field type is DOUBLE.
   FLOAT     specifies that the field type is FLOAT.
   INT       specifies that the field type is INT.
   LONG      specifies that the field type is LONG.
   SHORT     specifies that the field type is SHORT.
   STRING    specifies that the field type is STRING.
See Also: “Type Issues” in SAS Language Reference: Concepts
field-name
specifies the Java field name.
Requirement: The field name must be enclosed in either single or double quotation marks.
value
specifies the name of the variable that receives the returned field value.
Details
Once you instantiate a Java object, you can access and modify its public fields through method calls on the Java object. The GETtypeFIELD method enables you to access non-static fields.
Note: The type argument represents a Java data type. For more information about how Java data types relate to SAS data types, see “Type Issues” in SAS Language Reference: Concepts.

Comparisons
The GETtypeFIELD method returns the value of a non-static field for a Java object. To return the value of a static field, use the GETSTATICtypeFIELD method.
Example
The following example creates a simple class that contains three non-static fields. The Java object j is instantiated, the field values are set by using the SETtypeFIELD method, and then the values are retrieved by using the GETtypeFIELD method.

/* Java code */
import java.util.*;
import java.lang.*;
public class ttest
{
   public int i;
   public double d;
   public String s;
}

/* DATA step code */
data _null_;
   dcl javaobj j("ttest");
   length val 8;
   length str $20;
   j.setIntField("i", 100);
   j.setDoubleField("d", 3.14159);
   j.setStringField("s", "abc");
   j.getIntField("i", val);
   put val=;
   j.getDoubleField("d", val);
   put val=;
   j.getStringField("s", str);
   put str=;
run;
The following lines are written to the SAS log:
val=100
val=3.14159
str=abc
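
The idea of one accessor per Java type, with the field named by a quoted string, has a close counterpart in the standard Java reflection API. The following sketch is offered as an analogy and is not a description of the SAS implementation: the classes FieldHolder and TypedFieldAccess are hypothetical, and the accessor names merely echo the DATA step method names. The calls getField, getInt, getDouble, and get belong to java.lang.Class and java.lang.reflect.Field.

```java
import java.lang.reflect.Field;

// Hypothetical class with the same shape as ttest.
class FieldHolder {
    public int i;
    public double d;
    public String s;
}

class TypedFieldAccess {
    // Look up a public field by name and convert the checked exception.
    static Field lookup(Object target, String name) {
        try {
            return target.getClass().getField(name);
        } catch (NoSuchFieldException ex) {
            throw new IllegalArgumentException("no public field named " + name, ex);
        }
    }

    // One accessor per type, in the spirit of GETINTFIELD.
    static int getIntField(Object target, String name) {
        try {
            return lookup(target, name).getInt(target);
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // In the spirit of GETDOUBLEFIELD.
    static double getDoubleField(Object target, String name) {
        try {
            return lookup(target, name).getDouble(target);
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // In the spirit of GETSTRINGFIELD.
    static String getStringField(Object target, String name) {
        try {
            return (String) lookup(target, name).get(target);
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        FieldHolder j = new FieldHolder();
        j.i = 100;
        j.d = 3.14159;
        j.s = "abc";
        System.out.println("val=" + getIntField(j, "i"));
        System.out.println("val=" + getDoubleField(j, "d"));
        System.out.println("str=" + getStringField(j, "s"));
    }
}
```

Asking for a field through an accessor of the wrong type, for example getIntField on the field d, raises an IllegalArgumentException in this sketch, which is one reason to keep the accessor type in step with the declared field type.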
See Also Method: “GETSTATICtypeFIELD Method” on page 2101 “SETtypeFIELD Method” on page 2105
GETSTATICtypeFIELD Method
Returns the value of a static field for a Java object.
Category: Field reference
Applies to: Java object
Syntax object.GETSTATICtypeFIELD("field-name", value);
Arguments object
specifies the name of a Java object. type
specifies the type for the Java field. The type can be one of the following values:
   BOOLEAN   specifies that the field type is BOOLEAN.
   BYTE      specifies that the field type is BYTE.
   CHAR      specifies that the field type is CHAR.
   DOUBLE    specifies that the field type is DOUBLE.
   FLOAT     specifies that the field type is FLOAT.
   INT       specifies that the field type is INT.
   LONG      specifies that the field type is LONG.
   SHORT     specifies that the field type is SHORT.
   STRING    specifies that the field type is STRING.
See Also: “Type Issues” in SAS Language Reference: Concepts
field-name
specifies the Java field name.
Requirement: The field name must be enclosed in either single or double quotation marks.
value
specifies the name of the variable that receives the returned field value.
Details
Once you instantiate a Java object, you can access and modify its public fields through method calls on the Java object. The GETSTATICtypeFIELD method enables you to access static fields.
Note: The type argument represents a Java data type. For more information about how Java data types relate to SAS data types, see “Type Issues” in SAS Language Reference: Concepts.

Comparisons
The GETSTATICtypeFIELD method returns the value of a static field for a Java object. To return the value of a non-static field, use the GETtypeFIELD method.
Example
The following example creates a simple class that contains one static field and a static method that returns it. The Java object j is instantiated, the field value is set by using the SETSTATICtypeFIELD method, and then the value is retrieved by using the GETSTATICtypeFIELD method.

/* Java code */
import java.util.*;
import java.lang.*;
public class ttestc
{
   public static double d;
   public static double dm()
   {
      return d;
   }
}

/* DATA step code */
data x;
   declare javaobj j("ttestc");
   length d 8;
   j.setStaticDoubleField("d", 3.14159);
   j.getStaticDoubleField("d", d);
   put d=;
run;
The following line is written to the SAS log:
d=3.14159
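
The practical difference between a static and a non-static field is easiest to see with two objects of the same class. The following plain Java sketch uses a hypothetical class, SharedVsOwn, that is not part of the SAS example. It shows Java semantics only: a static field has a single value for the class, and a non-static field has a separate value in each object.

```java
// Hypothetical class with one static field and one non-static field.
class SharedVsOwn {
    static double shared;   // static: a single value that belongs to the class
    double own;             // non-static: every object has its own copy

    public static void main(String[] args) {
        SharedVsOwn first = new SharedVsOwn();
        SharedVsOwn second = new SharedVsOwn();

        SharedVsOwn.shared = 3.14159;   // set once, on the class
        first.own = 1.0;                // changes only the first object

        System.out.println(SharedVsOwn.shared);
        System.out.println(first.own);
        System.out.println(second.own);  // still the default value
    }
}
```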
See Also Method: “GETtypeFIELD Method” on page 2099 “SETSTATICtypeFIELD Method” on page 2107
_NEW_ Operator, Java Object
Creates an instance of a Java object.
Valid in: DATA step
Applies to: Java object
Syntax
object-reference = _NEW_ JAVAOBJ ("java-class", <argument-1, ... argument-n>);
Arguments
object-reference
specifies the object reference name for the Java object. java-class
specifies the name of the Java class to be instantiated.
Requirement: The Java class name must be enclosed in either single or double quotation marks.
argument
specifies the information that is used to create an instance of the Java object. Valid values for argument depend on the Java object.
See Also: “Details” on page 2104
Details
To use a DATA step component object in your SAS program, you must declare and create (instantiate) the object. The DATA step component interface provides a mechanism for accessing the predefined component objects from within the DATA step.
If you use the _NEW_ operator to instantiate the Java object, you must first use the DECLARE statement to declare the Java object. For example, in the following lines of code, the DECLARE statement tells SAS that the object reference J is a Java object. The _NEW_ operator creates the Java object and assigns it to the object reference J.

   declare javaobj j;
   j = _new_ javaobj("somejavaclass");

Note: You can use the DECLARE statement to declare and instantiate a Java object in one step.
A constructor is a method that is used to instantiate a component object and to initialize the component object data. For example, in the following lines of code, the _NEW_ operator instantiates a Java object and assigns it to the object reference J. Note that the only required argument for a Java object constructor is the name of the Java class to be instantiated. All other arguments are constructor arguments for the Java class itself. In the following example, the Java class name, testjavaclass, is the constructor, and the values 100 and .8 are constructor arguments.

   declare javaobj j;
   j = _new_ javaobj("testjavaclass", 100, .8);
For more information about the predefined DATA step component objects and constructors, see “Using DATA Step Component Objects” in SAS Language Reference: Concepts.
Comparisons You can use the DECLARE statement and the _NEW_ operator, or the DECLARE statement alone to declare and instantiate an instance of a Java object.
Examples
In the following example, a Java class is created for a hash table. The _NEW_ operator is used to create an instance of this class by specifying the capacity and load factor. In this example, a wrapper class, mhash, is necessary because the DATA step’s only numeric type is equivalent to the Java type DOUBLE.

/* Java code */
import java.util.*;
public class mhash extends Hashtable
{
   mhash (double size, double load)
   {
      super ((int)size, (float)load);
   }
}

/* DATA step code */
data _null_;
   declare javaobj h;
   h = _new_ javaobj("mhash", 100, .8);
run;
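
The narrowing that the wrapper performs can be checked on the Java side alone. The following is a minimal sketch with a hypothetical class name, MHashSketch, and with type parameters added so that it compiles without warnings. It takes both constructor arguments as double, as the DATA step supplies them, and casts them to the int and float that the Hashtable(int, float) constructor requires.

```java
import java.util.Hashtable;

// Hypothetical wrapper: accepts doubles and narrows them for Hashtable.
class MHashSketch extends Hashtable<Object, Object> {
    private static final long serialVersionUID = 1L;

    MHashSketch(double size, double load) {
        // Hashtable(int initialCapacity, float loadFactor)
        super((int) size, (float) load);
    }

    public static void main(String[] args) {
        MHashSketch h = new MHashSketch(100, .8);   // same values as the DATA step
        h.put("key", "value");
        System.out.println(h.get("key"));
    }
}
```

Note that the cast to int discards any fractional part of the first argument, so a value such as 100.9 would yield a capacity of 100 in this sketch.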
See Also Statements: “DECLARE Statement, Java Object” on page 1434 “Using DATA Step Component Objects” in SAS Language Reference: Concepts
SETtypeFIELD Method
Modifies the value of a non-static field for a Java object.
Category: Field reference
Applies to: Java object
Syntax object.SETtypeFIELD("field-name", value);
Arguments object
specifies the name of a Java object. type
specifies the type for the Java field. The type can be one of the following values:
   BOOLEAN   specifies that the field type is BOOLEAN.
   BYTE      specifies that the field type is BYTE.
   CHAR      specifies that the field type is CHAR.
   DOUBLE    specifies that the field type is DOUBLE.
   FLOAT     specifies that the field type is FLOAT.
   INT       specifies that the field type is INT.
   LONG      specifies that the field type is LONG.
   SHORT     specifies that the field type is SHORT.
   STRING    specifies that the field type is STRING.
See Also: “Type Issues” in SAS Language Reference: Concepts
field-name
specifies the Java field name.
Requirement: The field name must be enclosed in either single or double quotation marks.
value
specifies the value for the field.
Details
Once you instantiate a Java object, you can access and modify its public fields through method calls on the Java object. The SETtypeFIELD method enables you to modify non-static fields.
Note: The type argument represents a Java data type. For more information about how Java data types relate to SAS data types, see “Type Issues” in SAS Language Reference: Concepts.

Comparisons
The SETtypeFIELD method modifies the value of a non-static field for a Java object. To modify the value of a static field, use the SETSTATICtypeFIELD method.
Example
The following example creates a simple class that contains three non-static fields. The Java object j is instantiated, the field values are set using the SETtypeFIELD method, and then the field values are retrieved.

/* Java code */
import java.util.*;
import java.lang.*;
public class ttest
{
   public int i;
   public double d;
   public String s;
}

/* DATA step code */
data _null_;
   dcl javaobj j("ttest");
   length val 8;
   length str $20;
   j.setIntField("i", 100);
   j.setDoubleField("d", 3.14159);
   j.setStringField("s", "abc");
   j.getIntField("i", val);
   put val=;
   j.getDoubleField("d", val);
   put val=;
   j.getStringField("s", str);
   put str=;
run;
The following lines are written to the SAS log:
val=100
val=3.14159
str=abc
See Also Method: “GETtypeFIELD Method” on page 2099 “SETSTATICtypeFIELD Method” on page 2107
SETSTATICtypeFIELD Method
Modifies the value of a static field for a Java object.
Category: Field reference
Applies to: Java object

Syntax
object.SETSTATICtypeFIELD("field-name", value);

Arguments
object
specifies the name of a Java object.
type
specifies the type for the Java field. The type can be one of the following values:
   BOOLEAN   specifies that the field type is BOOLEAN.
   BYTE      specifies that the field type is BYTE.
   CHAR      specifies that the field type is CHAR.
   DOUBLE    specifies that the field type is DOUBLE.
   FLOAT     specifies that the field type is FLOAT.
   INT       specifies that the field type is INT.
   LONG      specifies that the field type is LONG.
   SHORT     specifies that the field type is SHORT.
   STRING    specifies that the field type is STRING.
See Also: “Type Issues” in SAS Language Reference: Concepts
field-name
specifies the Java field name.
Requirement: The field name must be enclosed in either single or double quotation marks.
value
specifies the value for the field.
Details
Once you instantiate a Java object, you can access and modify its public fields through method calls on the Java object. The SETSTATICtypeFIELD method enables you to modify static fields.
Note: The type argument represents a Java data type. For more information about how Java data types relate to SAS data types, see “Type Issues” in SAS Language Reference: Concepts.

Comparisons
The SETSTATICtypeFIELD method modifies the value of a static field for a Java object. To modify the value of a non-static field, use the SETtypeFIELD method.
Example
The following example creates a simple class that contains one static field and a static method that returns it. The Java object j is instantiated, the field value is set using the SETSTATICtypeFIELD method, and then the value is retrieved by calling the static method dm.

/* Java code */
import java.util.*;
import java.lang.*;
public class ttestc
{
   public static double d;
   public static double dm()
   {
      return d;
   }
}

/* DATA step code */
data x;
   declare javaobj j("ttestc");
   length d 8;
   j.setStaticDoubleField("d", 3.14159);
   j.callStaticDoubleMethod("dm", d);
   put d=;
run;
The following line is written to the SAS log:
d=3.14159
See Also Method: “GETSTATICtypeFIELD Method” on page 2101 “SETtypeFIELD Method” on page 2105
PART 3
Appendixes
   Appendix 1. DATA Step Debugger
   Appendix 2. Perl Regular Expression (PRX) Metacharacters
   Appendix 3. SAS Utility Macro
   Appendix 4. Recommended Reading
APPENDIX 1
DATA Step Debugger

Introduction
   Definition: What Is Debugging?
   Definition: The DATA Step Debugger
Basic Usage
   How a Debugger Session Works
   Using the Windows
   Entering Commands
   Working with Expressions
   Assigning Commands to Function Keys
Advanced Usage: Using the Macro Facility with the Debugger
   Using Macros as Debugging Tools
   Creating Customized Debugging Commands with Macros
   Debugging a DATA Step Generated by a Macro
Examples
   Example 1: Debugging a Simple DATA Step
      Discovering a Problem
      Using the DEBUG Option
      Examining Data Values after the First Iteration
      Examining Data Values after the Second Iteration
      Ending the Debugger
      Correcting the DATA Step
   Example 2: Working with Formats
   Example 3: Debugging DO Loops
   Example 4: Examining Formatted Values of Variables
Commands
   List of Debugger Commands
   Debugger Commands by Category
Dictionary
   BREAK
   CALCULATE
   DELETE
   DESCRIBE
   ENTER
   EXAMINE
   GO
   HELP
   JUMP
   LIST
   QUIT
   SET
   STEP
   SWAP
   TRACE
   WATCH
Introduction

Definition: What Is Debugging?
Debugging is the process of removing logic errors from a program. Unlike syntax errors, logic errors do not stop a program from running. Instead, they cause the program to produce unexpected results. For example, if you create a DATA step that keeps track of inventory, and your program shows that you are out of stock but your warehouse is full, you have a logic error in your program.
To debug a DATA step, you could do the following:
- copy a few lines of the step into another DATA step, execute it, and print the results of those statements
- insert PUT statements at selected places in the DATA step, submit the step, and examine the values that are displayed in the SAS log
- use the DATA step debugger.

While the SAS log can help you identify data errors, the DATA step debugger offers you an easier, interactive way to identify logic errors, and sometimes data errors, in DATA steps.
Definition: The DATA Step Debugger
The DATA step debugger is part of Base SAS software and consists of windows and a group of commands. By issuing commands, you can execute DATA step statements one by one and pause to display the resulting variable values in a window. By observing the results that are displayed, you can determine where the logic error lies. Because the debugger is interactive, you can repeat the process of issuing commands and observing the results as many times as needed in a single debugging session. To invoke the debugger, add the DEBUG option to the DATA statement and execute the program.
The DATA step debugger enables you to perform the following tasks:
- execute statements one by one or in groups
- bypass execution of one or more statements
- suspend execution at selected statements, either in each iteration of DATA step statements or on a condition you specify, and resume execution on command
- monitor the values of selected variables and suspend execution at the point a value changes
- display the values of variables and assign new values to them
- display the attributes of variables
- receive help for individual debugger commands
- assign debugger commands to function keys
- use the macro facility to generate customized debugger commands.
Basic Usage

How a Debugger Session Works
When you submit a DATA step with the DEBUG option, SAS compiles the step, displays the debugger windows, and pauses until you enter a debugger command to begin execution. If you begin execution with the GO command, for example, SAS executes each statement in the DATA step. To suspend execution at a particular line in the DATA step, use the BREAK command to set breakpoints at statements you select. Then issue the GO command. The GO command starts or resumes execution until the breakpoint is reached.
To execute the DATA step one statement at a time or a few statements at a time, use the STEP command. By default, the STEP command is mapped to the ENTER key.
In a debugging session, statements in a DATA step can iterate as many times as they would outside the debugging session. When the last iteration has finished, a message appears in the DEBUGGER LOG window.
You cannot restart DATA step execution in a debugging session after the DATA step finishes executing. You must resubmit the DATA step in your SAS session. However, you can examine the final values of variables after execution has ended.
You can debug only one DATA step at a time. You can use the debugger only with a DATA step, and not with a PROC step.
Using the Windows The DATA step debugger contains two primary windows, the DEBUGGER LOG and the DEBUGGER SOURCE windows. The windows appear when you execute a DATA step with the DEBUG option. The DEBUGGER LOG window records the debugger commands you issue and their results. The last line is the debugger command line, where you issue debugger commands. The debugger command line is marked with a greater than (>) prompt. The DEBUGGER SOURCE window contains the SAS statements that comprise the DATA step you are debugging. The window enables you to view your position in the DATA step as you debug your program. In the window, the SAS statements have the same line numbers as they do in the SAS log. You can enter windowing environment commands on the window command lines. You can also execute commands by using function keys.
Entering Commands
Enter DATA step debugger commands on the debugger command line. For a list of commands and their descriptions, refer to “Debugger Commands by Category” on page 2130. Follow these rules when you enter a command:
- A command can occupy only one line (except for a DO group).
- A DO group can extend over more than one line.
- To enter multiple commands, separate the commands with semicolons:

     examine _all_; set letter='bill'; examine letter
Working with Expressions All SAS operators that are described in “SAS Operators in Expressions” in SAS Language Reference: Concepts, are valid in debugger expressions. Debugger expressions cannot contain functions. A debugger expression must fit on one line. You cannot continue an expression on another line.
Assigning Commands to Function Keys
To assign debugger commands to function keys, open the Keys window. Position your cursor in the Definitions column of the function key you want to assign, and begin the command with the term DSD. To assign more than one command to a function key, enclose the commands (separated by semicolons) in quotation marks. Be sure to save your changes. These examples show commands assigned to function keys:
- dsd step3
- dsd 'examine cost saleprice; go 120;'
Advanced Usage: Using the Macro Facility with the Debugger You can use the SAS macro facility with the debugger to invoke macros from the DEBUGGER LOG command line. You can also define macros and use macro program statements, such as %LET, on the debugger command line.
Using Macros as Debugging Tools Macros are useful for storing a series of debugger commands. Executing the macro at the DEBUGGER LOG command line then generates the entire series of debugger commands. You can also use macros with parameters to build different series of debugger commands based on various conditions.
Creating Customized Debugging Commands with Macros
You can create a customized debugging command by defining a macro on the DEBUGGER LOG command line. Then invoke the macro from the command line. For example, to examine the variable COST, to execute five statements, and then to examine the variable DURATION, define the following macro (in this case the macro is called EC). Note that the example uses the alias for the EXAMINE command.

   %macro ec; ex cost; step 5; ex duration; %mend ec;

To issue the commands, invoke macro EC from the DEBUGGER LOG command line:

   %ec

The DEBUGGER LOG displays the value of COST, executes the next five statements, and then displays the value of DURATION.
Note: Defining a macro on the DEBUGGER LOG command line allows you to use the macro only during the current debugging session, because the macro is not permanently stored. To create a permanently stored macro, use the Program Editor.
Debugging a DATA Step Generated by a Macro
You can use a macro to generate a DATA step, but debugging a DATA step that is generated by a macro can be difficult. The SAS log displays a copy of the macro, but not the DATA step that the macro generated. If you use the DEBUG option at this point, the text that the macro generates appears as a continuous stream to the debugger. As a result, there are no line breaks where execution can pause.
To debug a DATA step that is generated by a macro, use the following steps:
1 Use the MPRINT and MFILE system options when you execute your program.
2 Assign the fileref MPRINT to an existing external file. MFILE routes the program output to the external file. Note that if you rerun your program, current output appends to the previous output in your file.
3 Invoke the macro from a SAS session.
4 In the Program Editor window, issue the INCLUDE command or use the File menu to open your external file.
5 Add the DEBUG option to the DATA statement and begin a debugging session.
6 When you locate the logic error, correct the portion of the macro that generated that statement or statements.
Examples
Example 1: Debugging a Simple DATA Step
This example shows how to debug a DATA step when output is missing.

Discovering a Problem
This program creates information about a travel tour group. The data files contain two types of records. One type contains the tour code, and the other type contains customer information. The program creates a report listing tour number, name, age, and sex for each customer.

/* first execution */
data tours (drop=type);
   input @1 type $ @;
   if type='H' then do;
      input @3 Tour $20.;
      return;
      end;
   else if type='P' then do;
      input @3 Name $10. Age 2. +1 Sex $1.;
      output;
      end;
datalines;
H Tour 101
P Mary E    21 F
P George S  45 M
P Susan K    3 F
H Tour 102
P Adelle S  79 M
P Walter P  55 M
P Fran I    63 F
;

proc print data=tours;
   title 'Tour List';
run;

                    Tour List                    1

   Obs    Tour    Name        Age    Sex

    1             Mary E       21     F
    2             George S     45     M
    3             Susan K       3     F
    4             Adelle S     79     M
    5             Walter P     55     M
    6             Fran I       63     F
The program executes without error, but the output is unexpected. The output does not contain values for the variable Tour. Viewing the SAS log will not help you debug the program because the data are valid and no errors appear in the log. To help identify the logic error, run the DATA step again using the DATA step debugger.
Using the DEBUG Option
To invoke the DATA step debugger, add the DEBUG option to the DATA statement and resubmit the DATA step:

   data tours (drop=type) / debug;
The following display shows the resulting two debugger windows.
The upper window is the DEBUGGER LOG window. Issue debugger commands in this window by typing commands on the debugger command line (the bottom line, marked by a >). The debugger displays the command and results in the upper part of the window. The lower window is the DEBUGGER SOURCE window. It displays the DATA step submitted with the DEBUG option. Each line in the DATA step is numbered with the same line number used in the SAS log. One line appears in reverse video (or other highlighting, depending on your monitor). DATA step execution pauses just before the execution of the highlighted statement. At the beginning of your debugging session, the first executable line after the DATA statement is highlighted. This means that SAS has compiled the step and will begin to execute the step at the top of the DATA step loop.
Examining Data Values after the First Iteration
To debug a DATA step, create a hypothesis about the logic error and test it by examining the values of variables at various points in the program. For example, issue the EXAMINE command from the debugger command line to display the values of all variables in the program data vector before execution begins:

examine _all_

Note: Most debugger commands have abbreviations, and you can assign commands to function keys. The examples in this section, however, show the full command name to help you find the commands in “Debugger Commands by Category” on page 2130.

When you press ENTER, the following display appears:

The values of all variables appear in the DEBUGGER LOG window. SAS has compiled, but not yet executed, the INPUT statement. Use the STEP command to execute the DATA step statements one at a time. By default, the STEP command is assigned to the ENTER key. Press ENTER repeatedly to step through the first iteration of the DATA step, and stop when the RETURN statement in the program is highlighted in the DEBUGGER SOURCE window. Because Tour information was missing in the program output, enter the EXAMINE command to view the value of the variable Tour for the first iteration of the DATA step.

examine tour
The following display shows the results:
The variable Tour contains the value Tour 101, showing you that Tour was read. The first iteration of the DATA step worked as intended. Press ENTER to reach the top of the DATA step.
Examining Data Values after the Second Iteration
You can use the BREAK command (also known as setting a breakpoint) to suspend DATA step execution at a particular line you designate. In this example, suspend execution before executing the ELSE statement by setting a breakpoint at line 7.

break 7

When you press ENTER, an exclamation point appears at line 7 in the DEBUGGER SOURCE window to mark the breakpoint:

Execute the GO command to continue DATA step execution until it reaches the breakpoint (in this case, line 7):

go

The following display shows the result:

SAS suspended execution just before the ELSE statement in line 7. Examine the values of all the variables to see their status at this point.

examine _all_
The following display shows the values:
You expect to see a value for Tour, but it does not appear. The program data vector gets reset to missing values at the beginning of each iteration and therefore does not retain the value of Tour. To solve the logic problem, you need to include a RETAIN statement in the SAS program.
Ending the Debugger
To end the debugging session, issue the QUIT command on the debugger command line:

quit
The debugging windows disappear, and the original SAS session resumes.
Correcting the DATA Step
Correct the original program by adding the RETAIN statement. Delete the DEBUG option from the DATA step, and resubmit the program:

/* corrected version */
data tours (drop=type);
   retain Tour;
   input @1 type $ @;
   if type='H' then do;
      input @3 Tour $20.;
      return;
   end;
   else if type='P' then do;
      input @3 Name $10. Age 2. +1 Sex $1.;
      output;
   end;
datalines;
H Tour 101
P Mary E    21 F
P George S  45 M
P Susan K    3 F
H Tour 102
P Adelle S  79 M
P Walter P  55 M
P Fran I    63 F
;
run;

proc print;
   title 'Tour List';
run;

The values for Tour now appear in the output:

                 Tour List                 1

Obs    Tour        Name        Age    Sex
 1     Tour 101    Mary E       21     F
 2     Tour 101    George S     45     M
 3     Tour 101    Susan K       3     F
 4     Tour 102    Adelle S     79     M
 5     Tour 102    Walter P     55     M
 6     Tour 102    Fran I       63     F
Example 2: Working with Formats
This example shows how to debug a program when you use format statements to format dates. The following program creates a report that lists travel tour dates for specific countries.

options yearcutoff=1920;
data tours;
   length Country $ 10;
   input Country $10. Start : mmddyy. End : mmddyy.;
   Duration=end-start;
datalines;
Italy     033000 041300
Brazil    021900 022800
Japan     052200 061500
Venezuela 110300 11800
Australia 122100 011501
;

proc print data=tours;
   format start end date9.;
   title 'Tour Duration';
run;

                    Tour Duration                    1

Obs    Country        Start          End      Duration
 1     Italy        30MAR2000    13APR2000        14
 2     Brazil       19FEB2000    28FEB2000         9
 3     Japan        22MAY2000    15JUN2000        24
 4     Venezuela    03NOV2000    18JAN2000      -290
 5     Australia    21DEC2000    15JAN2001        25
The value of Duration for the tour to Venezuela shows a negative number, -290 days. To help identify the error, run the DATA step again using the DATA step debugger. SAS displays the following debugger windows:
At the DEBUGGER LOG command line, issue the EXAMINE command to display the values of all variables in the program data vector before execution begins:

examine _all_

Initial values of all variables appear in the DEBUGGER LOG window. SAS has not yet executed the INPUT statement. Press ENTER to issue the STEP command. SAS executes the INPUT statement, and the assignment statement is now highlighted. Issue the EXAMINE command to display the current value of all variables:

examine _all_

The following display shows the results:

Because a problem exists with the Venezuela tour, suspend execution before the assignment statement when the value of Country equals Venezuela. Set a breakpoint to do this:

break 6 when country='Venezuela'

Execute the GO command to resume program execution:

go

SAS stops execution when the country name is Venezuela. You can examine Start and End tour dates for the Venezuela trip. Because the assignment statement is highlighted (indicating that SAS has not yet executed that statement), there will be no value for Duration. Execute the EXAMINE command to view the value of the variables after execution:

examine _all_
The following display shows the results:
To view formatted SAS dates, issue the EXAMINE command using the DATEw. format:

examine start date7. end date7.
The following display shows the results:
Because the tour ends on November 18, 2000, and not on January 18, 2000, there is an error in the variable End. Examine the source data in the program and notice that the value for End has a typographical error. By using the SET command, you can temporarily set the value of End to November 18 to see whether you get the anticipated result. Issue the SET command using the DDMMMYYw. format:

set end='18nov00'd

Press ENTER to issue the STEP command and execute the assignment statement. Issue the EXAMINE command to view the tour date and Duration fields:

examine start date7. end date7. duration
The following display shows the results:
The Start, End, and Duration fields contain correct data. End the debugging session by issuing the QUIT command on the DEBUGGER LOG command line. Correct the original data in the SAS program, delete the DEBUG option, and resubmit the program.

/* corrected version */
options yearcutoff=1920;
data tours;
   length Country $ 10;
   input Country $10. Start : mmddyy. End : mmddyy.;
   duration=end-start;
datalines;
Italy     033000 041300
Brazil    021900 022800
Japan     052200 061500
Venezuela 110300 111800
Australia 122100 011501
;

proc print data=tours;
   format start end date9.;
   title 'Tour Duration';
run;

                    Tour Duration                    1

Obs    Country        Start          End      duration
 1     Italy        30MAR2000    13APR2000        14
 2     Brazil       19FEB2000    28FEB2000         9
 3     Japan        22MAY2000    15JUN2000        24
 4     Venezuela    03NOV2000    18NOV2000        15
 5     Australia    21DEC2000    15JAN2001        25
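A data error like the one in this example can also be caught without the debugger. The following is only a minimal sketch, assuming the TOURS data set created above; the check itself and the message text are illustrative and are not part of the original example.

```sas
/* Sketch: report any tour whose End date precedes its Start date. */
/* Assumes the TOURS data set from this example already exists.    */
data _null_;
   set tours;
   if end < start then
      put 'Check tour dates: ' country= start= date9. end= date9.;
run;
```

A check of this kind only reports suspicious rows in the SAS log; the DATA step debugger remains the tool for inspecting how the values were produced.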
Example 3: Debugging DO Loops
An iterative DO, DO WHILE, or DO UNTIL statement can iterate many times during a single iteration of the DATA step. When you debug DO loops, you can examine several iterations of the loop by using the AFTER option in the BREAK command. The AFTER option requires a number that indicates how many times the loop will iterate before it reaches the breakpoint. The BREAK command then suspends program execution. For example, consider this DATA step:

data new / debug;
   set old;
   do i=1 to 20;
      newtest=oldtest+i;
      output;
   end;
run;

To set a breakpoint at the assignment statement (line 4 in this example) after every five iterations of the DO loop, issue this command:

break 4 after 5

When you issue the GO command repeatedly, the debugger suspends execution when I has the values of 5, 10, 15, and 20 in the DO loop iteration. In an iterative DO loop, select a value for the AFTER option that can be divided evenly into the number of iterations of the loop. For example, in this DATA step, 5 can be evenly divided into 20. When the DO loop runs again in the next iteration of the DATA step, I again has the values of 5, 10, 15, and 20. If you do not select a value that can be evenly divided (such as 3 in this example), the AFTER option causes the debugger to suspend execution when I has the values of 3, 6, 9, 12, 15, and 18. Because the count is not reset between DATA step iterations, when the DO loop runs the second time, I has the values of 1, 4, 7, 10, 13, 16, and 19.
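The continuous counting that the AFTER option uses can be tried outside the debugger. The following is a minimal sketch, not debugger output: the variables COUNT and PASS are hypothetical, and the outer loop merely stands in for two iterations of the DATA step. It writes to the log the values of I at which a breakpoint set with AFTER 3 would be honored.

```sas
/* Sketch: emulate the continuous AFTER count for "break 4 after 3". */
data _null_;
   count = 0;                 /* executions of the statement so far  */
   do pass = 1 to 2;          /* stands in for two DATA step passes  */
      do i = 1 to 20;
         count = count + 1;   /* the count is never reset per pass   */
         if mod(count, 3) = 0 then put pass= i=;
      end;
   end;
run;
```

Changing the divisor in the MOD function to 5 shows why a value that divides evenly into the number of loop iterations gives the same stopping points on every pass.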
Example 4: Examining Formatted Values of Variables
You can use a SAS format or a user-created format when you display a value with the EXAMINE command. For example, assume that the variable BEGIN contains a SAS date value. To display the day of the week and date, use the SAS WEEKDATEw. format with EXAMINE:

examine begin weekdate17.

When the value of BEGIN is 033001, the debugger displays the following:

Fri, Mar 30, 2001

As another example, you can create a format named SIZE:

proc format;
   value size 1-5='small'
              6-10='medium'
              11-high='large';
run;

To debug a DATA step that applies the format SIZE. to the variable STOCKNUM, use the format with EXAMINE:

examine stocknum size.

When the value of STOCKNUM is 7, for example, the debugger displays the following:

STOCKNUM = medium
Commands

List of Debugger Commands
BREAK        JUMP
CALCULATE    LIST
DELETE       QUIT
DESCRIBE     SET
ENTER        STEP
EXAMINE      SWAP
GO           TRACE
HELP         WATCH
Debugger Commands by Category

Table A1.1 Categories and Descriptions of Debugger Commands

Category                            Command                        Description
Controlling Program Execution       “GO” on page 2137              Starts or resumes execution of the DATA step
                                    “JUMP” on page 2139            Restarts execution of a suspended program
                                    “STEP” on page 2142            Executes statements one at a time in the active program
Controlling the Windows             “HELP” on page 2138            Displays information about debugger commands
                                    “SWAP” on page 2143            Switches control between the SOURCE window and the LOG window
Manipulating DATA Step Variables    “CALCULATE” on page 2133       Evaluates a debugger expression and displays the result
                                    “DESCRIBE” on page 2135        Displays the attributes of one or more variables
                                    “EXAMINE” on page 2136         Displays the value of one or more variables
                                    “SET” on page 2141             Assigns a new value to a specified variable
Manipulating Debugging Requests     “BREAK” on page 2130           Suspends program execution at an executable statement
                                    “DELETE” on page 2134          Deletes breakpoints or the watch status of variables in the DATA step
                                    “LIST” on page 2140            Displays all occurrences of the item that is listed in the argument
                                    “TRACE” on page 2143           Controls whether the debugger displays a continuous record of the DATA step execution
                                    “WATCH” on page 2144           Suspends execution when the value of a specified variable changes
Tailoring the Debugger              “ENTER” on page 2135           Assigns one or more debugger commands to the ENTER key
Terminating the Debugger            “QUIT” on page 2141            Terminates a debugger session
Dictionary

BREAK
Suspends program execution at an executable statement.
Category: Manipulating Debugging Requests
Alias: B

Syntax
BREAK location <AFTER count> <WHEN expression> <DO group>

Arguments
location
   specifies where to set a breakpoint. Location must be one of these:
   label
      a statement label. The breakpoint is set at the statement that follows the label.
   line-number
      the number of a program line at which to set a breakpoint.
   *
      the current line.
AFTER count
   honors the breakpoint each time the statement has been executed count times. The counting is continuous. That is, when the AFTER option applies to a statement inside a DO loop, the count continues from one iteration of the loop to the next. The debugger does not reset the count value to 1 at the beginning of each iteration. If a BREAK command contains both AFTER and WHEN, AFTER is evaluated first. If the AFTER count is satisfied, the WHEN expression is evaluated.
   Tip: The AFTER option is useful in debugging DO loops.
WHEN expression
   honors a breakpoint when the expression is true.
DO group
   is one or more debugger commands enclosed by a DO and an END statement. The syntax of the DO group is the following:
      DO; command-1 < ... ; command-n; > END;
   command
      specifies a debugger command. Separate multiple commands by semicolons.
   A DO group can span more than one line and can contain IF-THEN/ELSE statements, as shown:
      IF expression THEN command; <ELSE command;>
      IF expression THEN DO group; <ELSE DO group;>
   IF evaluates an expression. When the condition is true, the debugger command or DO group in the THEN clause executes. An optional ELSE command gives an alternative action if the condition is not true. You can use these arguments with IF:
   expression
      specifies a debugger expression. A non-zero, nonmissing result causes the expression to be true. A result of zero or missing causes the expression to be false.
   command
      specifies a single debugger command.
   DO group
      specifies a DO group.
Details
The BREAK command suspends execution of the DATA step at a specified statement. Executing the BREAK command is called setting a breakpoint.
When the debugger detects a breakpoint, it does the following:
- checks the AFTER count value, if present, and suspends execution if count breakpoint activations have been reached
- evaluates the WHEN expression, if present, and suspends execution if the condition that is evaluated is true
- suspends execution if neither an AFTER nor a WHEN clause is present
- displays the line number at which execution is suspended
- executes any commands that are present in a DO group
- returns control to the user with a > prompt.
If a breakpoint is set at a source line that contains more than one statement, the breakpoint applies to each statement on the source line. If a breakpoint is set at a line that contains a macro invocation, the debugger breaks at each statement generated by the macro.
Examples
- Set a breakpoint at line 5 in the current program:
  b 5
- Set a breakpoint at the statement after the statement label eoflabel:
  b eoflabel
- Set a breakpoint at line 45 that will be honored after every third execution of line 45:
  b 45 after 3
- Set a breakpoint at line 45 that will be honored after every third execution of that line only when the values of both DIVISOR and DIVIDEND are 0:
  b 45 after 3 when (divisor=0 and dividend=0)
- Set a breakpoint at line 45 of the program and examine the values of variables NAME and AGE:
  b 45 do; ex name age; end;
- Set a breakpoint at line 15 of the program. If the value of DIVISOR is greater than 3, execute STEP; otherwise, display the value of DIVIDEND.
  b 15 do; if divisor>3 then st; else ex dividend; end;
See Also Commands: “DELETE” on page 2134 “WATCH” on page 2144
CALCULATE
Evaluates a debugger expression and displays the result.
Category: Manipulating DATA Step Variables

Syntax
CALC expression

Arguments
expression
   specifies any debugger expression.
   Restriction: Debugger expressions cannot contain functions.

Details
The CALCULATE command evaluates debugger expressions and displays the result. The result must be numeric.

Examples
- Add 1.1, 1.2, 3.4 and multiply the result by 0.5:
  calc (1.1+1.2+3.4)*0.5
- Calculate the sum of STARTAGE and DURATION:
  calc startage+duration
- Calculate the values of the variable SALE minus the variable DOWNPAY and then multiply the result by the value of the variable RATE. Divide that value by 12 and add 50:
  calc (((sale-downpay)*rate)/12)+50
See Also “Working with Expressions” on page 2116 for information on debugger expressions
DELETE
Deletes breakpoints or the watch status of variables in the DATA step.
Category: Manipulating Debugging Requests
Alias: D

Syntax
DELETE BREAK location
DELETE WATCH variable(s) | _ALL_

Arguments
BREAK
   deletes breakpoints.
   Alias: B
location
   specifies a breakpoint location to be deleted. Location can have one of these values:
   _ALL_
      all current breakpoints in the DATA step.
   label
      the statement after a statement label.
   line-number
      the number of a program line.
   *
      the breakpoint from the current line.
WATCH
   deletes watched status of variables.
   Alias: W
variable(s)
   names one or more watched variables for which the watch status is deleted.
_ALL_
   specifies that the watch status is deleted for all watched variables.

Examples
- Delete the breakpoint at the statement label eoflabel:
  d b eoflabel
- Delete the watch status from the variable ABC in the current DATA step:
  d w abc
See Also Commands: “BREAK” on page 2130 “WATCH” on page 2144
DESCRIBE
Displays the attributes of one or more variables.
Category: Manipulating DATA Step Variables
Alias: DESC

Syntax
DESCRIBE variable(s) | _ALL_

Arguments
variable(s)
   identifies one or more DATA step variables.
_ALL_
   indicates all variables that are defined in the DATA step.

Details
The DESCRIBE command displays the attributes of one or more specified variables. DESCRIBE reports the name, type, and length of the variable, and, if present, the informat, format, or variable label.

Examples
- Display the attributes of variable ADDRESS:
  desc address
- Display the attributes of array element ARR{i + j}:
  desc arr{i+j}
ENTER
Assigns one or more debugger commands to the ENTER key.
Category: Tailoring the Debugger

Syntax
ENTER <command-1 <... ; command-n>>

Arguments
command
   specifies a debugger command.
   Default: STEP 1

Details
The ENTER command assigns one or more debugger commands to the ENTER key. Assigning a new command to the ENTER key replaces the existing command assignment. If you assign more than one command, separate the commands with semicolons.

Examples
- Assign the command STEP 5 to the ENTER key:
  enter st 5
- Assign the commands EXAMINE and DESCRIBE, both for the variable CITY, to the ENTER key:
  enter ex city; desc city
EXAMINE
Displays the value of one or more variables.
Category: Manipulating DATA Step Variables
Alias: E

Syntax
EXAMINE variable-1 <format-1> <...variable-n <format-n>>
EXAMINE _ALL_ <format>

Arguments
variable
   identifies a DATA step variable.
format
   identifies a SAS format or a user-created format.
_ALL_
   identifies all variables that are defined in the current DATA step.
Details The EXAMINE command displays the value of one or more specified variables. The debugger displays the value using the format currently associated with the variable, unless you specify a different format.
Examples
- Display the values of variables N and STR:
  ex n str
- Display the element i of the array TESTARR:
  ex testarr{i}
- Display the elements i+1, j*2, and k-3 of the array CRR:
  ex crr{i+1}; ex crr{j*2}; ex crr{k-3}
- Display the SAS date variable T_DATE with the DATE7. format:
  ex t_date date7.
- Display the values of all elements in array NEWARR:
  ex newarr{*}
See Also Command: “DESCRIBE” on page 2135
GO
Starts or resumes execution of the DATA step.
Category: Controlling Program Execution
Alias: G

Syntax
GO <line-number | label>

Without Arguments
If you omit arguments, GO resumes execution of the DATA step and executes its statements continuously until a breakpoint is encountered, until the value of a watched variable changes, or until the DATA step completes execution.

Arguments
line-number
   gives the number of a program line at which execution is to be suspended next.
label
   is a statement label. Execution is suspended at the statement following the statement label.

Details
The GO command starts or resumes execution of the DATA step. Execution continues until all observations have been read, a breakpoint specified in the GO command is reached, or a breakpoint set earlier with a BREAK command is reached.

Examples
- Resume executing the program and execute its statements continuously:
  g
- Resume program execution and then suspend execution at the statement in line 104:
  g 104
See Also Commands: “JUMP” on page 2139 “STEP” on page 2142
HELP
Displays information about debugger commands.
Category: Controlling the Windows

Syntax
HELP

Without Arguments
The HELP command displays a directory of the debugger commands. Select a command name to view information about the syntax and usage of that command. You must enter the HELP command from a window command line, from a menu, or with a function key.
JUMP
Restarts execution of a suspended program.
Category: Controlling Program Execution
Alias: J

Syntax
JUMP line-number | label

Arguments
line-number
   indicates the number of a program line at which to restart the suspended program.
label
   is a statement label. Execution resumes at the statement following the label.

Details
The JUMP command moves program execution to the specified location without executing intervening statements. After executing JUMP, you must restart execution with GO or STEP. You can jump to any executable statement in the DATA step.

CAUTION: Do not use the JUMP command to jump to a statement inside a DO loop or to a label that is the target of a LINK-RETURN group. In such cases you bypass the controls set up at the beginning of the loop or in the LINK statement, and unexpected results can appear.

JUMP is useful in two situations:
- when you want to bypass a section of code that is causing problems in order to concentrate on another section. In this case, use the JUMP command to move to a point in the DATA step after the problematic section.
- when you want to re-execute a series of statements that have caused problems. In this case, use JUMP to move to a point in the DATA step before the problematic statements and use the SET command to reset values of the relevant variables to the values they had at that point. Then re-execute those statements with STEP or GO.

Examples
- Jump to line 5:
  j 5
See Also Commands: “GO” on page 2137 “STEP” on page 2142
LIST
Displays all occurrences of the item that is listed in the argument.
Category: Manipulating Debugging Requests
Alias: L

Syntax
LIST _ALL_ | BREAK | DATASETS | FILES | INFILES | WATCH

Arguments
_ALL_
   displays the values of all items.
BREAK
   displays breakpoints.
   Alias: B
DATASETS
   displays all SAS data sets used by the current DATA step.
FILES
   displays all external files to which the current DATA step writes.
INFILES
   displays all external files from which the current DATA step reads.
WATCH
   displays watched variables.
   Alias: W

Examples
- List all breakpoints, SAS data sets, external files, and watched variables for the current DATA step:
  l _all_
- List all breakpoints in the current DATA step:
  l b
See Also Commands: “BREAK” on page 2130 “DELETE” on page 2134 “WATCH” on page 2144
QUIT
Terminates a debugger session.
Category: Terminating the Debugger
Alias: Q

Syntax
QUIT
Without Arguments The QUIT command terminates a debugger session and returns control to the SAS session.
Details SAS creates data sets built by the DATA step that you are debugging. However, when you use QUIT to exit the debugger, SAS does not add the current observation to the data set. You can use the QUIT command at any time during a debugger session. After you end the debugger session, you must resubmit the DATA step with the DEBUG option to begin a new debugging session; you cannot resume a session after you have ended it.
SET
Assigns a new value to a specified variable.
Category: Manipulating DATA Step Variables
Alias: None

Syntax
SET variable=expression

Arguments
variable
   specifies the name of a DATA step variable or an array reference.
expression
   is any debugger expression.
   Tip: Expression can contain the variable name that is used on the left side of the equal sign. When a variable appears on both sides of the equal sign, the debugger uses the original value on the right side to evaluate the expression and stores the result in the variable on the left.

Details
The SET command assigns a value to a specified variable. When you detect an error during program execution, you can use this command to assign new values to variables. This enables you to continue the debugging session.

Examples
- Set the variable A to the value of 3:
  set a=3
- Assign to the variable B the value 12345 concatenated with the previous value of B:
  set b='12345' || b
- Set array element ARR{1} to the result of the expression a+3:
  set arr{1}=a+3
- Set array element CRR{1,2,3} to the result of the expression crr{1,1,2} + crr{1,1,3}:
  set crr{1,2,3} = crr{1,1,2} + crr{1,1,3}
- Set variable A to the result of the expression a+c*3:
  set a=a+c*3
STEP
Executes statements one at a time in the active program.
Category: Controlling Program Execution
Alias: ST

Syntax
STEP <n>

Without Arguments
STEP executes one statement.

Arguments
n
   specifies the number of statements to execute.

Details
The STEP command executes statements in the DATA step, starting with the statement at which execution was suspended. When you issue a STEP command, the debugger:
- executes the number of statements that you specify
- displays the line number
- returns control to the user and displays the > prompt.

Note: By default, you can execute the STEP command by pressing the ENTER key.
See Also Commands: “GO” on page 2137 “JUMP” on page 2139
SWAP
Switches control between the SOURCE window and the LOG window.
Category: Controlling the Windows
Alias: None

Syntax
SWAP
Without Arguments The SWAP command switches control between the LOG window and the SOURCE window when the debugger is running. When you begin a debugging session, the LOG window becomes active by default. While the DATA step is still being executed, the SWAP command enables you to switch control between the SOURCE and LOG window so that you can scroll and view the text of the program and also continue monitoring the program execution. You must enter the SWAP command from a window command line, from a menu, or with a function key.
TRACE
Controls whether the debugger displays a continuous record of the DATA step execution.
Category: Manipulating Debugging Requests
Alias: T
Default: OFF

Syntax
TRACE <ON | OFF>

Without Arguments
TRACE displays the current status of the TRACE command.

Arguments
ON
   prepares for the debugger to display a continuous record of DATA step execution. The next statement that resumes DATA step execution (such as GO) records all actions taken during DATA step execution in the DEBUGGER LOG window.
OFF
   stops the display.

Examples
- Determine whether TRACE is ON or OFF:
  trace
- Prepare to display a record of debugger execution:
  trace on
WATCH
Suspends execution when the value of a specified variable changes.
Category: Manipulating Debugging Requests
Alias: W

Syntax
WATCH variable(s)

Arguments
variable(s)
   specifies one or more DATA step variables.

Details
The WATCH command specifies a variable to monitor and suspends program execution when its value changes. Each time the value of a watched variable changes, the debugger does the following:
- suspends execution
- displays the line number where execution has been suspended
- displays the variable's old value
- displays the variable's new value
- returns control to the user and displays the > prompt.

Examples
- Monitor the variable DIVISOR for value changes:
  w divisor
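To show where WATCH fits in a session, here is a small sketch. The data set RATIOS, its variables, and its data lines are hypothetical and serve only as something to debug; the debugger commands appear in a comment because they are typed at the debugger command line, not submitted with the program.

```sas
/* Hypothetical DATA step to debug; submit it in an interactive */
/* SAS windowing session so that the debugger windows open.     */
data ratios / debug;
   input dividend divisor;
   if divisor ne 0 then ratio = dividend / divisor;
   datalines;
10 2
7 0
9 3
;
run;

/* One possible sequence at the debugger command line (>):      */
/*    watch divisor      monitor DIVISOR for value changes      */
/*    go                 run until a watched value changes      */
/*    examine _all_      inspect the program data vector        */
/*    delete watch divisor                                      */
/*    quit                                                      */
```

Each GO in such a session runs until DIVISOR next changes, so issuing GO repeatedly walks through the input records where the divisor differs from the previous value.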
APPENDIX 2
Perl Regular Expression (PRX) Metacharacters

Tables of Perl Regular Expression (PRX) Metacharacters 2145
   General Constructs 2145
   Basic Perl Metacharacters 2145
   Metacharacters and Replacement Strings 2147
   Other Quantifiers 2147
   Greedy and Lazy Repetition Factors 2148
   Class Groupings 2149
   Look-Ahead and Look-Behind Behavior 2150
   Comments and Inline Modifiers 2151
   Selecting the Best Condition by Using Combining Operators 2151

Tables of Perl Regular Expression (PRX) Metacharacters

General Constructs

Table A2.1 General Constructs
Metacharacter
Description
()
indicates grouping.
non-metacharacter
matches a character.
{ } [ ]( )^ $ . | * + ? \
to match these characters, override (escape) with \.
\
overrides the next metacharacter.
\n
matches capture buffer n.
(?:...)
specifies a non-capturing group.
Basic Perl Metacharacters
The following table lists the metacharacters that you can use to match patterns in Perl regular expressions.

Table A2.2 Basic Perl Metacharacters and Their Descriptions
Metacharacter
Description
\a
matches an alarm (bell) character.
\A
matches a character only at the beginning of a string.
\b
matches a word boundary (the position between a word and a space):
- "er\b" matches the "er" in "never"
- "er\b" does not match the "er" in "verb"
\B
matches a non-word boundary:
- "er\B" matches the "er" in "verb"
- "er\B" does not match the "er" in "never"
\cA-\cZ
matches a control character. For example, \cX matches the control character control-X.
\C
matches a single byte.
\d
matches a digit character that is equivalent to [0-9].
\D
matches a non-digit character that is equivalent to [^0-9].
\e
matches an escape character.
\E
specifies the end of case modification.
\f
matches a form feed character.
\l
specifies that the next character is lowercase.
\L
specifies that the next string of characters, up to the \E metacharacter, is lowercase.
\n
matches a newline character.
\num $num
matches capture buffer num, where num is a positive integer. Perl variable syntax ($num) is valid when referring to capture buffers, but not in other cases.
\Q
escapes (places a backslash before) all non-word characters.
\r
matches a return character.
\s
matches any white space character, including space, tab, form feed, and so on, and is equivalent to [ \f\n\r\t\v].
\S
matches any character that is not a white space character and is equivalent to [^ \f\n\r\t\v].
\t
matches a tab character.
\u
specifies that the next character is uppercase.
\U
specifies that the next string of characters, up to the \E metacharacter, is uppercase.
\w
matches any word character or alphanumeric character, including the underscore.
\W
matches any non-word character or nonalphanumeric character, and excludes the underscore.
\ddd
matches the octal character ddd.
\xdd
matches the hexadecimal character dd.
\z
matches a character only at the end of a string.
\Z
matches a character only at the end of a string or before newline at the end of a string.
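The word-boundary rows of the table can be checked with the PRXMATCH function, which returns the position of the first match or 0 when there is no match. This is a minimal sketch; the variable names WORD, POS_B, and POS_NB are illustrative.

```sas
/* Sketch: compare \b and \B on the two words used in the table. */
data _null_;
   length word $ 8;
   do word = 'never', 'verb';
      pos_b  = prxmatch('/er\b/', trim(word));  /* "er" at a word boundary   */
      pos_nb = prxmatch('/er\B/', trim(word));  /* "er" not at a boundary    */
      put word= pos_b= pos_nb=;
   end;
run;
```

A nonzero value means that the pattern matched; the TRIM function removes the trailing blanks that a fixed-length character variable would otherwise carry into the match.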
Metacharacters and Replacement Strings
You can use the following metacharacters in both a regular expression and in replacement text, when you use a substitution regular expression:

\l \u \L \E \U \Q

These metacharacters are useful in replacement text for controlling the case of capture buffers that are used within replacement text. For an example of how these metacharacters can be used, see “Replacing Text: Example 3” on page 324. For a description of these metacharacters, see Table A2.2 on page 2146.
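As an illustration of case control in replacement text, the following sketch uses the PRXCHANGE function with \u and with \U ... \E. The input value and the variable names are made up for this example, and it is a sketch rather than the example that the cross-reference points to.

```sas
/* Sketch: change the case of capture buffer 1 in replacement text. */
data _null_;
   name = 'mary e';
   /* \u makes the next character of the replacement uppercase      */
   proper = prxchange('s/(\w+)/\u$1/', -1, name);
   /* \U ... \E makes everything between them uppercase; the        */
   /* second argument of 1 limits the change to the first match     */
   upper1 = prxchange('s/(\w+)/\U$1\E/', 1, name);
   put proper= upper1=;
run;
```

The second argument of PRXCHANGE is the number of times to search and replace; -1 means that replacement continues until the end of the source string.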
Other Quantifiers
The following table lists other quantifiers that you can use in Perl regular expressions. The descriptions of the metacharacters in the table include examples of how the metacharacters can be used.
Table A2.3
Other Quantifiers
Metacharacter
Description
\
marks the next character as either a special character, a literal, a back reference, or an octal escape:
- "\n" matches a newline character
- "\\" matches "\"
- "\(" matches "("
|
specifies the or condition when you compare alphanumeric strings. For example, the construct x|y matches either x or y:
- "z|food" matches either "z" or "food"
- "(z|f)ood" matches "zood" or "food"
^
matches the position at the beginning of the input string.
$
matches the position at the end of the input string.
period (.)
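matches any single character except newline. To match any character including newline, use a pattern such as "(.|\n)"; inside a character class such as "[.\n]", the period is a literal period and does not match every character.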
(pattern)
specifies grouping. Matches a pattern and creates a capture buffer for the match. To retrieve the position and length of the match that is captured, use CALL PRXPOSN. To retrieve the value of the capture buffer, use the PRXPOSN function. To match parentheses characters, use "\(" or "\)".
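The capture-buffer functions named in this row can be combined as follows. This is a minimal sketch; the text value and the variable names RE, START, LEN, and TOURNO are illustrative.

```sas
/* Sketch: capture the digits after "Tour" and retrieve them. */
data _null_;
   text = 'Tour 101 departs on 03NOV2000';
   re = prxparse('/Tour (\d+)/');           /* one capture buffer        */
   if prxmatch(re, text) then do;
      call prxposn(re, 1, start, len);      /* position and length       */
      tourno = prxposn(re, 1, text);        /* value of capture buffer 1 */
      put start= len= tourno=;
   end;
run;
```

CALL PRXPOSN and the PRXPOSN function both rely on the most recent successful match for the pattern identifier, which is why they follow the PRXMATCH call.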
Greedy and Lazy Repetition Factors Perl regular expressions support “greedy” repetition factors and “lazy” repetition factors. A repetition factor is considered greedy when the repetition factor matches a string as many times as it can using a specific starting location. A repetition factor is considered lazy when it matches a string the minimum number of times that is needed to satisfy the match. To designate a repetition factor as lazy, add a ? to the end of the repetition factor. By default, repetition factors are considered greedy. The following table lists the greedy repetition factors. The descriptions of the repetition factors in the table include examples of how they can be used.
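The difference between a greedy and a lazy repetition factor is easiest to see on the same input. In this sketch the sample text and the variable names are made up for illustration; the first pattern uses the greedy + and the second uses the lazy +?.

```sas
/* Sketch: greedy versus lazy matching with PRXCHANGE. */
data _null_;
   text = '<b>bold</b> and <i>italic</i>';
   /* greedy: .+ extends to the last ">" that still allows a match */
   greedy = prxchange('s/<.+>/X/', 1, text);
   /* lazy: .+? stops at the first ">" that completes the match    */
   lazy   = prxchange('s/<.+?>/X/', 1, text);
   put greedy= / lazy=;
run;
```

With the greedy form, a single replacement consumes everything from the first "<" to the final ">"; with the lazy form, only the first tag is replaced.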
Table A2.4
Greedy Repetition Factors
Metacharacter
Description
*
matches the preceding subexpression zero or more times:
- "zo*" matches "z" and "zoo"
- * is equivalent to {0,}
+
matches the preceding subexpression one or more times:
- "zo+" matches "zo" and "zoo"
- "zo+" does not match "z"
- + is equivalent to {1,}
?
matches the preceding subexpression zero or one time:
- "do(es)?" matches the "do" in "do" or "does"
- ? is equivalent to {0,1}
{n}
matches exactly n times.
{n,}
matches a pattern at least n times.
{n,m}
m and n are non-negative integers, where n