Programming Utilities & Libraries

Part Number: 800-3847-10
Revision A of 27 March, 1990

Trademarks

SunOS™, Sun Workstation®, as well as the word “Sun” followed by a numerical suffix, are trademarks of Sun Microsystems, Incorporated. UNIX® and UNIX System V® are trademarks of Bell Laboratories. PDP-11® is a trademark of Digital Equipment Corporation. All other products or services mentioned in this document are identified by the trademarks or service marks of their respective companies or organizations.

Copyright © 1990 Sun Microsystems, Inc. - Printed in U.S.A.

All rights reserved. No part of this work covered by copyright hereon may be reproduced in any form or by any means - graphic, electronic, or mechanical - including photocopying, recording, taping, or storage in an information retrieval system, without the prior written permission of the copyright owner.

Restricted rights legend: use, duplication, or disclosure by the U.S. government is subject to restrictions set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 52.227-7013 and in similar clauses in the FAR and NASA FAR Supplement.

The Sun Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees.

This product is protected by one or more of the following U.S. patents: 4,777,485 4,688,190 4,527,232 4,745,407 4,679,014 4,435,792 4,719,569 4,550,368 in addition to foreign patents and applications pending.

This software and documentation is based in part on the Fourth Berkeley Software Distribution under license from the Regents of the University of California.
We acknowledge the following individuals and institutions for their role in its development: The Regents of the University of California, the Electrical Engineering and Computer Sciences Department at the Berkeley Campus of the University of California, and Other Contributors.

Contents

Chapter 1 Shared Libraries
    1.1. Definitions
        Shared Object
        Shared Library
        Static vs. Dynamic Link Editing
        Position Independent Code (PIC)
        Static and Dynamic Link Editors
    1.2. Using Shared Libraries
        Building a Program to Use Shared Libraries
        Binding Mode Options
        -Bstatic and -Bdynamic
        -N and -n Options for ld
        Binding of PIC with Non-PIC
        -dc and -dp Options
        Use of Assertions
        The -assert Option
        Run-Time Use of Shared Libraries
        SunOS Shared Libraries
        Dynamic vs. Static Binding Semantics
        Debuggers
        Performance Issues
        Dependencies on Other Files
        Setuid Programs
    1.3. Version Control
        Version Numbers of .so’s
        Version Management Issues
    1.4. Shared Library Mechanisms
        Memory Sharing
        The C Compiler
        The Assembler
        crt0()
        Link Editors: ld and ld.so
        ld.so
        Binding and Unbinding Routines: dlopen(), dlsym(), dlclose(), dlerror()
    1.5. Building a Shared Library
        Building the .so File
        The .sa File
        Building the .sa File
    1.6. Building a Better Library
        Sizing Down the Data Segment
        Using xstr to Extract String Definitions
        Better Ordering of Objects
        crt0.o Dependency
        The ldconfig Command
    1.7. Shared Library Problems
        ld.so Is Deleted
        Wrong Library Is Used
        Error Messages

Chapter 2 Lightweight Processes
    2.1. Introduction
        Definition
        Functionality
        Tutorial Goals
    2.2. Threads
        Stack Issues
        Stack Size
        Protecting Against Stack Overflow
        Coroutines
        Custom Schedulers
        Special Context Switching
    2.3. Messages
        Messages vs. Monitors
        Rendezvous Semantics
        Messages and Threads
        Intelligent Servers
    2.4. Agents
        System Calls
        Non-blocking I/O Library
        Using the Non-Blocking I/O Library
        Examples of Agents
    2.5. Monitors and Conditions
        Monitors vs. Interrupt Masking
        Programming with Monitors
        Monitors and Events
        Condition Variables
        Enforcing the Monitor Discipline
        Nested Monitors
        Reentrant Monitors
        Monitor Program Examples
    2.6. Exceptions
        Synchronous Traps
        Implementation
        Example of Exception Handling
    2.7. Big Example

Chapter 3 System V Interprocess Communication Facilities
    3.1. IPC Facilities in the SunOS Operating System
        File I/O and Pipes
        State Files and File Locking
        Named Pipes
        Networking Facilities
    3.2. System V IPC Facilities in Release 4.1
        Configuring System V IPC Facilities
        System V IPC Permissions
        IPC System Calls, Key Arguments, and Creation Flags
        System V IPC Configuration Options
    3.3. Messages
        Structure of a Message Queue
        Initializing a Message Queue with msgget()
        Controlling Message Queues with msgctl()
        Sending and Receiving Messages with msgsnd() and msgrcv()
    3.4. Semaphores
        Structure of a Semaphore Set
        Initializing a Semaphore Set with semget()
        Controlling Semaphores with semctl()
        Performing Semaphore Operations with semop()
    3.5. Shared Memory
        Structure of a Shared Memory Segment
        Using shmget() to Get Access to a Shared Memory Segment
        Controlling a Shared Memory Segment with shmctl()
        Attaching and Detaching a Shared Memory Segment with shmat() and shmdt()

Chapter 4 SCCS — Source Code Control System
    4.1. Introduction
        The sccs Command
        Initializing the SCCS History File: sccs create
        Basic sccs Subcommands
        Deltas and Versions
        SIDs
        ID Keywords
    4.2. sccs Subcommands
        Checking Files In and Out
        Checking Out a File for Editing: sccs edit
        Checking in a New Version: sccs delta
        Retrieving a Version: sccs get
        Reviewing Pending Changes: sccs diffs
        Deleting Pending Changes: sccs unedit
        Combining delta and get: sccs delget
        Combining delta and edit: sccs deledit
        Retrieving a Version by SID: sccs get -r
        Retrieving a Version by Date and Time: sccs get -c
        Repairing a Writable Copy: sccs get -k -G
        Incorporating Version-Dependent Information: ID Keywords
        Making Inquiries
        Seeing Which Version Has Been Retrieved: The what Command
        Determining the Most Recent Version: sccs get -g
        Determining Who Has a File Checked Out: sccs info
        Displaying Delta Comments: sccs prt
        Updating a Delta Comment: sccs cdc
        Comparing Checked-In Versions: sccs sccsdiff
        Displaying the Entire History: sccs get -m -p
        Creating Reports: sccs prs -d
        Deleting Committed Changes
        Replacing a Delta: sccs fix
        Removing a Delta: sccs rmdel
        Reverting to an Earlier Version
        Excluding Deltas from a Retrieved Version
        Combining Versions: sccs comb
    4.3. Version Control for Binary Files
    4.4. Maintaining Source Directories
        Duplicate Source Directories
        SCCS and make
        Keeping SIDs Consistent Across Files
        Starting a New Release
        Temporary Files used by SCCS
    4.5. Branches
        Using Branches
        Creating a Branch Delta
        Retrieving Versions From Branch Deltas
    4.6. Administering SCCS Files
        Interpreting Error Messages: sccs help
        Altering History File Defaults: sccs admin
        Validating the History File
        Restoring the History File
    4.7. Reference Tables

Chapter 5 make User’s Guide
    5.1. Overview
        Dependency Checking: make vs. Shell Scripts
        Writing a Simple Makefile
        Basic Use of Implicit Rules
        Processing Dependencies
        Null Rules
        Unknown Targets
        Running Commands Silently
        Ignoring a Command’s Exit Status
        Automatic Retrieval of SCCS Files
        Suppressing SCCS Retrieval
        Passing Parameters: Simple make Macros
        Command Dependency Checking and .KEEP_STATE
        Suppressing or Forcing Command Dependency Checking for Selected Lines
        The State File
        Hidden Dependencies and .KEEP_STATE
        Hidden Dependencies and .INIT
        Displaying Information About a make Run
    5.2. Compiling Programs with make
        Compilation Strategies
        A Simple Makefile
        Using make’s Predefined Macros
        Using Implicit Rules to Simplify a Makefile: Suffix Rules
        When to Use Explicit Target Entries vs. Implicit Rules
        Implicit Rules and Dynamic Macros
        Dynamic Macro Modifiers
        Dynamic Macros and the Dependency List: Delayed Macro References
        Dependency List Read Twice
        Rules Evaluated Once
        No Transitive Closure for Suffix Rules
        Adding Suffix Rules
        Pattern-Matching Rules: an Alternative to Suffix Rules
        make’s Default Suffix Rules and Predefined Macros
    5.3. Building Object Libraries
        Libraries, Members and Symbols
        Library Members and Dependency Checking
        Library Member Name-Length Limit
        .PRECIOUS: Preserving Libraries Against Removal Due to Interrupts
        Libraries and the $% Dynamic Macro
    5.4. Maintaining Programs and Libraries With make
        More about Macros
        Embedded Macro References
        Suffix Replacement in Macro References
        Using lint with make
        Linking With System-Supplied Libraries
        Compiling Programs for Debugging and Profiling
        Conditional Macro Definitions
        Compiling Debugging and Profiling Variants
        Maintaining Separate Program and Library Variants
        Pattern-Replacement Macro References
        Makefile for a Program with Separate Variants
        Makefile for a Library with Separate Variants
        Maintaining a Directory of Header Files
        Compiling and Linking With Your Own Libraries
        Nested make Commands
        Forcing A Nested make Command to Run
        The MAKEFLAGS Macro
        Macro Definitions and Environment Variables: Passing Parameters to Nested make Commands
        Compiling Other Source Files
        Compiling and Linking a C Program with Assembly Language Routines
        Compiling lex and yacc Sources
        Specifying Target Groups With the + Sign
        Maintaining Shell Scripts with make and SCCS
        Running Tests with make
        Escaped References to a Shell Variable
        Shell Command Substitutions
        Command Replacement Macro References
        Command Replacement Macro Assignment
    5.5. Maintaining Software Projects
        Organizing A Project for Ease of Maintenance
        Using include Makefiles
        Installing Finished Programs and Libraries
        Building the Entire Project
        Maintaining Directory Hierarchies With Recursive Makefiles
        Recursive Targets
        Recursive install Targets
        Maintaining A Large Library as a Hierarchy of Subsidiaries
    5.6. Closing Remarks about make

Chapter 6 lint — a Program Verifier for C
    6.1. Using lint
    6.2. A Word About Philosophy
    6.3. Unused Variables and Functions
    6.4. Set/Used Information
    6.5. Flow of Control
    6.6. Function Values
    6.7. Type Checking
    6.8. Type Casts
    6.9. Nonportable Character Use
    6.10. Assignments of Longs to Ints
    6.11. Strange Constructions
    6.12. Pointer Alignment
    6.13. Multiple Uses and Side Effects
    6.14. Implementation
    6.15. Portability
    6.16. Shutting lint Up
    6.17. Library Declaration Files
    6.18. Considerations When Using lint
    6.19. lint Options

Chapter 7 Performance Analysis
    7.1. time — Display Time Used by a Program
    7.2. prof — Generate Profile of a Program
    7.3. gprof — Generate a Call Graph Profile
    7.4. tcov — Statement-Level Analysis

Chapter 8 m4 — a Macro Processor
    8.1. Using the m4 Command
    8.2. Defining Macros
    8.3. Quoting and Comments
    8.4. Macros with Arguments
    8.5. Arithmetic Built-ins
    8.6. File Manipulation
    8.7. Running SunOS Commands
    8.8. Conditionals
    8.9. String Manipulation
    8.10. Printing
    8.11. Summary of Built-In m4 Macros

Chapter 9 lex — a Lexical Analyzer Generator
    9.1. lex Source
    9.2. lex Regular Expressions
    9.3. lex Actions
    9.4. Ambiguous Source Rules
    9.5. lex Source Definitions
    9.6. Using lex
    9.7. lex and yacc
    9.8. Examples
    9.9. Left Context-Sensitivity
    9.10. Character Set
    9.11. Summary of Source Format
    9.12. Caveats and Bugs

Chapter 10 yacc — Yet Another Compiler-Compiler
    10.1. Basic Specifications
    10.2. Actions
    10.3. Lexical Analysis
    10.4. How the Parser Works
    10.5. Ambiguity and Conflicts
    10.6. Precedence
    10.7. Error Handling
    10.8. The yacc Environment
    10.9. Hints for Preparing Specifications
        Input Style
        Left Recursion
        Lexical Tie-ins
        Reserved Words
    10.10. Advanced Topics
        Simulating Error and Accept in Actions
        Accessing Values in Enclosing Rules
        Support for Arbitrary Value Types
    10.11. A Simple Example
    10.12. yacc Input Syntax
    10.13. An Advanced Example
    10.14. Old Features Supported but not Encouraged

Chapter 11 The curses Library: Screen-Oriented Cursor Motions
    Overview
    Terminology
    Cursor Addressing Conventions
    Compiling Things
    Screen Updating
    Naming Conventions
    11.1. Variables
    11.2. Programming Curses
        Starting Up
        The Nitty-Gritty
        Output
        Input
        Miscellaneous
        Finishing Up
    11.3. Cursor Motion Optimization: Standing Alone
        Terminal Information
        Movement Optimizations, or, Getting Over Yonder
    11.4. Curses Functions
        Output Functions
        addch() and waddch() — Add Character to Window
        addstr() and waddstr() — Add String to Window
        box() — Draw Box Around Window
        clear() and wclear() — Reset Window
        clearok() — Set Clear Flag
        clrtobot() and wclrtobot() — Clear to Bottom
        clrtoeol() and wclrtoeol() — Clear to End of Line
        delch() and wdelch() — Delete Character
        deleteln() and wdeleteln() — Delete Current Line
        erase and werase() — Erase Window
        flushok — Control Flushing of stdout
        idlok — Control Use of Insert/Delete Line
        insch() and winsch() — Insert Character
        insertln() and winsertln() — Insert Line
        move and wmove() — Move
        overlay() — Overlay Windows
        overwrite() — Overwrite Windows
        printw() and wprintw() — Print to Window
        refresh() and wrefresh() — Synchronize
        standout() and wstandout() — Put Characters in Standout Mode
        Input Functions
        crbreak and nocrbreak — Set or Unset from Cbreak mode
        echo() and noecho() — Turn Echo On or Off
        getch() and wgetch() — Get Character from Terminal
        getstr() and wgetstr() — Get String from Terminal
        raw() and noraw() — Turn Raw Mode On or Off
        scanw() and wscanw() — Read String from Terminal
        Miscellaneous Functions
        baudrate — Get the Baudrate
        delwin() — Delete a Window
        endwin() — Finish up Window Routines
        erasechar — Get Erase Character
        getcap() — Get Termcap Capability
        getyx() — Get Current Coordinates
        inch() and winch() — Get Character at Current Coordinates
        initscr() — Initialize Screen Routines
        killchar — Get Kill Character
        leaveok() — Set Leave Cursor Flag
        longname() — Get Full Name of Terminal
        mvwin — Move Home Position of Window
        newwin() — Create a New Window
        nl() and nonl() — Turn Newline Mode On or Off
        scrollok — Set Scroll Flag for Window
        subwin() — Create a Subwindow
        touchline — Indicate Line Has Been Changed
        touchoverlap — Indicate Overlapping Regions Have Been Changed
        touchwin() — Indicate Window Has Been Changed
        unctrl() — Return Representation of Character
        Details
        gettmode() — Get tty Statistics
        mvcur() — Move Cursor
        scroll() — Scroll Window
        savetty() and resetty() — Save and Reset tty Flags
        setterm() — Set Terminal Characteristics
        tstp
        _putchar()
    11.5. Capabilities from termcap
        Overview
        Variables Set By setterm()
        Variables Set By gettmode()
    11.6. The WINDOW structure
    11.7. Example

Chapter 12 System V curses and terminfo
    12.1. Overview
        What is curses?
        What is terminfo?
        How curses and terminfo Work Together
        Other Components of the Terminal Information Utilities Package
    12.2. Working with curses Routines
        What Every curses Program Needs
        The Header File
        The Routines initscr(), refresh(), and endwin()
        Compiling a curses Program
        More about initscr() and Lines and Columns
        More about refresh() and Windows
        Simple Output and Input
        Output
        addch() — Write a single character to stdscr
        addstr() — write a string of characters to stdscr
        printw() — formatted printing on stdscr
        move() — position the cursor for stdscr
        mvaddch — move and print a character
        mvaddstr — move and print a string
        mvprintw — move and print a formatted string
        clear() and erase() — clear the screen
        clrtoeol() and clrtobot() — partial screen clears
        Input
        getch() — read a single character from the current terminal
        getstr() — read character string into a buffer
        scanw() — formatted input conversion
        Controlling Output and Input
        Output Attributes
        Bit Masks
        attron(), attrset(), and attroff() — set or modify attributes
        standout() and standend() — highlight with preferred attribute
        Bells, Whistles, and Flashing Lights
        beep() and flash() — ring bell or flash screen
        Input Options
        echo() and noecho() — turn echoing on and off
        cbreak() and nocbreak() — turn “break for each character” on or off
        Building Windows and Pads
        Window Output and Input
        The Routines wnoutrefresh() and doupdate()
        New Windows
        newwin() — open and return a pointer to new window
        subwin()
        Using Advanced curses Features
        Routines for Drawing Lines and Other Graphics
        Routines for Using Soft Labels
        Working with More than One Terminal
    12.3. Working with terminfo Routines
        What Every terminfo Program Needs
        Compiling and Running a terminfo Program
        An Example terminfo Program
    12.4. Working with the terminfo Database
        Writing Terminal Descriptions
        Naming the Terminal
        Learning About the Capabilities
        Specifying Capabilities
        Basic Capabilities
        Screen-Oriented Capabilities
        Keyboard-Entered Capabilities
        Parameter String Capabilities
        Compiling the Description
        Testing the Description
        Comparing or Printing terminfo Descriptions
        Converting a termcap Description to a terminfo Description
    12.5. curses Program Examples
        The editor Program
        editor — a Sample Program Listing
        The highlight Program
        The scatter Program
        The show Program
        The two Program
        The window Program

Appendix A make Enhancements Summary
    A.1. New Features
        Default Makefile
        The State File .make.state
        Hidden Dependency Checking
        Command Dependency Checking
        Automatic Retrieval of SCCS Files
        Tilde Rules Superseded
        SCCS History Files
        Pattern-Matching Rules: More Convenient than Suffix Rules
        Pattern Replacement Macro References
        New Options
        Support for C++ and Modula-2
        Naming Scheme for Predefined Macros
        New Special-Purpose Targets
        New Implicit Rule for lint
        Macro Processing Changes
        Macros: Definition, Substitution, and Suffix Replacement
        Patterns in Conditional Macros
        Shell Command Output in Macros
        Improved ar Library Support
        Lists of Members
        Handling of ar’s Name Length Limitation
        Target Groups
    A.2. Incompatibilities with Previous Versions of make
        New Meaning for -d Option
        Dynamic Macros
        Tilde Rules not Supported
        Target Names Beginning with ./ Treated as Local Filenames

Index

Tables

Table 4-1 SCCS ID Keywords
Table 4-2 SCCS Utility Commands
Table 4-3 Data Keywords for prs -d
Table 5-1 make’s Standard Suffix Rules
Table 5-2 make’s Predefined and Dynamic Macros
Table 5-3 Summary of Macro Assignment Order
Table 7-1 Control Key Letters for the time Command
Table 7-2 Default Timing Summary Chart
Table 8-1 Operators for the eval Built-In in m4
Table 8-2 Summary of Built-In m4 Macros
Table 9-1 Changing Internal Array Sizes in lex
Table 9-2 Regular Expression Operators in lex
Table 11-1 Description of Terms
Table 11-2 Variables to Describe the Terminal Environment
Table 11-3 Variables Set by setterm()
Table 11-4 Variables Set By gettmode()

Figures

Figure 3-1 IPC Permissions Data Structure
Figure 3-2 IPC Permission Modes
Figure 3-3 Structure of a Message Queue
Figure 3-4 Message Queue Control Structure
Figure 3-5 Message Header Structure
Figure 3-6 Synopsis of msgget()
Figure 3-7 Sample Program to Illustrate msgget()
Figure 3-8 Synopsis of msgctl()
Figure 3-9 Sample Program to Illustrate msgctl()
Figure 3-10 Synopses of msgsnd() and msgrcv()
Figure 3-11 Sample Program to Illustrate msgsnd() and msgrcv()
Figure 3-12 Structure of a Semaphore
Figure 3-13 Synopsis of semget()
Figure 3-14 Sample Program to Illustrate semget()
Figure 3-15 Synopsis of semctl()
Figure 3-16 Sample Program to Illustrate semctl()
Figure 3-17 Synopsis of semop()
Figure 3-18 Sample Program to Illustrate semop()
Figure 3-19 Structure of a Shared Memory Segment
Figure 3-20 Synopsis of shmget()
Figure 3-21 Sample Program to Illustrate shmget()
Figure 3-22 Synopsis of shmctl()
Figure 3-23 Sample Program to Illustrate shmctl()
Figure 3-24 Synopses of shmat() and shmdt()
Figure 3-25 Sample Program to Illustrate shmat() and shmdt()
Figure 4-1 Evolution of an SCCS File
Figure 4-2 Tree Structure with Branch Deltas
Figure 4-3 Extending the Branching Concept
Figure 5-1 Makefile Target Entry Format
Figure 5-2 A Trivial Makefile
Figure 5-3 Simple Makefile for Compiling C Sources: Everything Explicit
Figure 5-4 Makefile for Compiling C Sources Using Predefined Macros
Figure 5-5 Makefile for Compiling C Sources Using Suffix Rules
Figure 5-6 The Standard Suffixes List
Figure 5-7 Makefile for a C Program With System-Supplied Libraries
Figure 5-8 Makefile for a C Program with Alternate Debugging and Profiling Variants
Figure 5-9 Makefile for a C Library with Alternate Variants
Figure 5-10 Makefile for Separate Debugging and Profiling Program Variants
Figure 5-11 Makefile for Separate Debugging and Profiling Library Variants
Figure 5-12 Target Entry for a Nested make Command
Figure 5-13 Makefile for C Program With User-Supplied Libraries
Figure 9-1 An overview of lex
Figure 9-2 lex with yacc
Figure 9-3 Sample character table
Figure 12-1 A Simple curses Program
Figure 12-2 A Shell Script Using terminfo Routines
Figure 12-3 initscr(), refresh(), and endwin() in a Program
Figure 12-4 Multiple Windows and Pads Mapped to a Terminal Screen
Figure 12-5 Input Option Settings for curses Programs
Figure 12-6 Sending a Message to Several Terminals
Figure 12-7 Typical Framework of a terminfo Program

Preface

The following chapters describe a number of system facilities, utility commands, and libraries of primary interest to application developers.

□ Chapter 1: Shared Libraries
  This chapter describes Sun’s approach to shared library support, along with techniques for using and creating shared libraries.
□ Chapter 2: Lightweight Process Library
  This chapter describes Sun’s implementation of lightweight processes.

□ Chapter 3: System V Interprocess Communication Facilities
  This chapter describes facilities that support standard System V IPC.

□ Chapter 4: SCCS — Source Code Control System
  SCCS is a version control utility for source files.

□ Chapter 5: make User’s Guide
  make is a utility that provides consistent generation of programs and systems.

□ Chapter 6: lint — a Program Verifier for C
  lint is a utility that you can use to check your C programs for internal consistency and portability.

□ Chapter 7: Performance Analysis
  This chapter describes system utilities for timing, profiling and coverage analysis of programs.

□ Chapter 8: m4 — a Macro Processor
  m4 is a parametric macro-language (pre)processor.

□ Chapter 9: lex — a Lexical Analyzer Generator
  lex is a program generator that produces scanning routines in C.

□ Chapter 10: yacc — Yet Another Compiler-Compiler
  yacc is a program generator that produces parsing routines in C.

□ Chapter 11: The curses Library
  This chapter describes the curses screen-cursor motion library package derived from BSD.

□ Chapter 12: System V curses and terminfo
  This chapter describes the standard System V curses terminal-display library routines and support facilities.

□ Appendix A
  This appendix summarizes the enhancements made to Sun’s version of the make utility.

For detailed information about system utilities, library functions, file- and device-level facilities, and other details about specific features of the operating system, refer to the SunOS Reference Manual.

Bibliography and Acknowledgements

This manual has been derived in large part from sources that include technical papers distributed with U.C. Berkeley’s BSD release, System V Release 3 documentation, and others. In particular, Sun Microsystems wishes to acknowledge the following sources:

1. Aho, A. V., and Corasick, M. J., Efficient String Matching: An Aid to Bibliographic Search, Comm. ACM 18, 333-340 (1975).

2. Allman, Eric, Source Code Control System, University of California at Berkeley.

3. Arnold, K. C. R. C., Curses — Screen Updating and Cursor Movement Optimization: A Library Package, Bell Laboratories, Murray Hill, New Jersey.

   Author’s Acknowledgements: This package would not exist without the work of Bill Joy, who, in writing his editor, created the capability to generally describe terminals, wrote the routines which read this database [and] implement optimal cursor movement . . . Doug Merritt and Kurt Shoens also were extremely important, as were . . . Ken Abrams, Alan Char, Mark Horton and Joe Kalash.

   Editor’s Note: The curses library was implemented by Ken Arnold, based on the screen-updating and optimizing routines originally written by Bill Joy for the vi editor.

4. Bonanni, L. E., and Salemi, C. A., Source Code Control System User’s Guide, Bell Laboratories, Piscataway, New Jersey.

5. Feldman, S. I., Make — A Program for Maintaining Computer Programs, Bell Laboratories, Murray Hill, New Jersey.

6. Graham, S. L., Kessler, P. B., and McKusick, M. K., Gprof — A Call Graph Execution Profiler, Computer Science Division, Electrical Engineering and Computer Science Department, University of California at Berkeley.

   Editor’s Note: This paper is for the scholar interested in the theory behind call-graph profiling.

7. Johnson, S. C., ‘A Portable Compiler: Theory and Practice’, Proc. 5th ACM Symp. on Principles of Programming Languages, (January 1978).

8. Johnson, S. C., Lint, a C Program Verifier, Bell Laboratories, Murray Hill, New Jersey.

9. Johnson, S. C., Yacc — Yet Another Compiler-Compiler, Bell Laboratories Computing Science Technical Report #32, July 1978.

10. Johnson, S. C., and Ritchie, D. M., ‘UNIX Time-Sharing System: Portability of C Programs and the UNIX System’, Bell System Technical Journal 57(6) pp. 2021-2048 (1978).

11. Kernighan, B. W., and Plauger, P. J., Software Tools, Addison-Wesley, Inc., 1976.

12. Kernighan, B. W., and Ritchie, D. M., The C Programming Language, Prentice-Hall, N. J. (1978).

13. Kernighan, B. W., and Ritchie, D. M., The M4 Macro Processor, Bell Laboratories, Murray Hill, New Jersey.

    Author’s Acknowledgements: We are indebted to Rick Becker, John Chambers, Doug McIlroy, and especially Jim Weythman, whose pioneering use of m4 has led to several valuable improvements. We are also deeply grateful to Weythman for several substantial contributions to the code. The m4 macro processor is an extension of a macro processor called M3 which was written by D. M. Ritchie for the AP-3 minicomputer.

14. Kernighan, B. W., UNIX for Beginners — Second Edition, Bell Laboratories, 1978.

15. Kernighan, B. W., and Ritchie, D. M., UNIX Programming, Ritchie, Bell Laboratories, Murray Hill, New Jersey.

16. Lesk, M. E., Lex — A Lexical Analyzer Generator, Computing Science Technical Report #39, October 1975.

    Author’s Acknowledgements: [The] outside of lex is patterned on yacc and the inside on Aho’s string matching routines. Therefore, both S. C. Johnson and A. V. Aho are really originators of much of lex, as well as debuggers of it. Many thanks are due to both. The current version of lex was designed, written, and debugged by Eric Schmidt.

Chapter 1 Shared Libraries

Operating systems like SunOS have long achieved more efficient use of memory by sharing a single physical copy of a program’s text (code) among the processes executing it. But while the text of a program may be shared among its concurrent invocations, a significant portion of that text, consisting of library routines, may be duplicated as part of other running programs. For example, widely-used library functions such as printf() may be replicated any number of times throughout memory, and again in various executables throughout the file system.
This suggests that still-greater efficiencies can be had by sharing text at the library level whenever possible. The SunOS shared library mechanism improves resource utilization in a way that is both straightforward and flexible:

□ No specialized kernel support is required; it uses the standard memory-mapping and copy-on-write features provided by the mmap(2) system call and the kernel memory management facilities.

□ It is designed to minimize the burdens placed on users of existing code. In particular:

  • Shared libraries are transparent to the programs that use them, as well as the build procedures for those programs.

  • They are largely transparent to standard system utilities, including debuggers.

  • Shared libraries are transparent to library source code written in C. However, some special procedures are necessary when building the shared libraries themselves.

  • The allocation of address space for shared library routines is handled automatically.

  • Unlike statically-linked executables, programs that rely on shared libraries need not be rebuilt if an underlying library changes (so long as that library’s calling interface remains compatible).

  • The use of shared libraries is not required. You can specify the static version of a SunOS shared library as desired.

  • Shared libraries may be bound and unbound dynamically, during the course of program execution.

In addition, shared libraries enhance the development environment by making it easier to modify and test compatible updates to library functions.

1.1. Definitions

Shared Object

A shared object, or .so file, is an a.out(5) format file produced by ld(1). A shared object differs from a runnable program in that it lacks an initial entry point. At run-time, such an object may be linked to a number of executing programs, all of which share access to a single copy of that object.
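The copy-on-write behavior that lets several programs share a single copy of an object can be observed directly with mmap(2). The following is a minimal sketch in C, assuming a POSIX system; the function name private_write_stays_local and the scratch-file path passed to it are hypothetical and are not part of SunOS or of ld.so. It maps a file privately, writes to the mapping, and checks that the file itself is unchanged.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a SunOS routine): create a scratch file at
 * `path`, map it MAP_PRIVATE, modify the mapping, and compare the file
 * contents with what was first written.
 * Returns 1 if the write stayed private to this process,
 *         0 if it reached the file, -1 on any error.
 */
int private_write_stays_local(const char *path)
{
    static const char original[] = "shared text";
    char on_disk[sizeof original];
    char *map;
    int fd;
    int result = -1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, original, sizeof original) == (ssize_t)sizeof original) {
        map = mmap(NULL, sizeof original, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);
        if (map != MAP_FAILED) {
            /* The first store makes the kernel copy the page for us. */
            map[0] = 'S';

            /* Re-read the file: it should still hold the original bytes. */
            if (lseek(fd, 0, SEEK_SET) == 0 &&
                read(fd, on_disk, sizeof on_disk) == (ssize_t)sizeof on_disk)
                result = map[0] == 'S' &&
                         memcmp(on_disk, original, sizeof original) == 0;

            munmap(map, sizeof original);
        }
    }
    close(fd);
    unlink(path);
    return result;
}
```

Calling private_write_stays_local() with a writable scratch path returns 1 when the modification is visible only through the private mapping. Pages that are never written stay shared, which is the property the shared library mechanism depends on.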
Shared Library A shared library is a shared object file that is used as a library. In cases where the shared library exports initialized data, the shared object ( . so) may be paired with an optional data interface description ( . sa) file. (See Building a Shared Library, below, for details.) Static vs. Dynamic Link Editing Link editing is the set of operations necessary to build an executable program from one or more object files. Static linking indicates that the results of these operations are saved to a file. Dynamic linking refers to these same link-edit operations when performed at mn-time; the executable that results from dynamic linking appears in the running process, but is not saved to a file. Position Independent Code (PIC) Position-Independent code (PIC) requires link editing only to relocate references to objects that are external to the current object module. Position-independent code is readily shared. Static and Dynamic Link Editors The link-editing facilities of Id have been made available for use at mn-time as well as at compile-time. At compile time, the static link editor, Id, can build an executable file in which some symbols remain unresolved. An executable (a . out) file that contains unresolved symbols is said to be incomplete. Incom- plete executables require dynamic link editing at run-time. The dynamic link editor, /usr / lib/ Id . so, uses the system’s memory management facilities to map in and bind the shared object files that are required at mn-time, and performs the link editing operations that were deferred by Id. As long as the text bound-in at mn-time is not subsequently modified (say, by a link-edit operation or an update to initialized external data), it remains shared among the various (disparate) programs that use it. However, if the text of a shared routine should need to be modified by a process during the course of exe- cution, local (exclusive) copies of the affected pages are created and maintained. 1.2. 
Using Shared Libraries

For the application developer, the decision to use shared libraries is made at the static linking phase, when running ld. By default, if a shared version of a library is available, ld constructs an executable that uses the shared version.

Building a Program to Use Shared Libraries

ld combines a variety of object files to produce an executable (a.out) file. Exactly what code gets produced, and how complete the a.out is, depends on the options and input files supplied on the command line. ld simply defers the resolution of any symbols that remain after it has run out of definitions, and assumes that the program will be fully linked by ld.so at run-time. ld accepts as input:

□ Simple object files. ld simply concatenates (and links) .o files in the order that they are encountered.

□ ar(1) libraries. Each .a file is searched exactly once as it is encountered, and only those definitions that match an unresolved external symbol are extracted, concatenated to the text (or data), and linked.

□ Shared objects. Any .so encountered is searched for symbol definitions and references, but does not normally contribute to the concatenated text (see Binding of PIC with Non-PIC for exceptions having to do with ld’s -dc option). However, the occurrence of each shared object is noted in the resulting a.out file; this information is used by ld.so to perform dynamic link editing at run-time.

ld’s output can be one of two basic types:

□ An “executable” (a.out) file. This file is either a program, if it has an entry point, or a shared object (.so), if it does not.

□ Another “simple object” (.o) file. When given the -r flag, ld combines the input object files to form a single, larger one. (This is a special use for ld which is of little relevance to shared libraries.)
You can indicate which libraries are to be used by supplying a -lname option on the ld command line for each. ld searches each library in the order specified. The name string is an abbreviated version of the library’s filename; the full name is of the form ‘libname.a’ if in archive format, or ‘libname.so.version’ if it is in shared object form (see Version Control, below, for a detailed discussion of the version suffix). At ld-time, this version information is noted; it must be matched properly for successful binding at run-time by ld.so.

The location of the library specified by a -l option is determined by the library search path, an ordered list of directories in which to search. This search path is specified as follows. At compile time, directories specified by the -L options are searched first, followed by those specified in the LD_LIBRARY_PATH environment variable (a colon-separated list of pathnames), and then the default directories, /usr/lib, /usr/5lib and /usr/local/lib. At run-time, directories in the LD_LIBRARY_PATH environment variable are searched first, followed by directories specified with -L, and finally, the default directories.

Each directory supplied with -L is recorded for use when the program is executed, as are the default directories. Directory search information obtained from LD_LIBRARY_PATH is not recorded in this manner. However, the search path that LD_LIBRARY_PATH contains at run-time is searched at that time; this allows an alternate set of libraries to be used.

At ld-time, the library search is satisfied by the first occurrence of either form of the library (.so, or .a if no .so is found), but if both versions are found in the same directory, the .so form is used by default. However, the choice of whether a .so or .a version is used by ld can be controlled by the binding mode options described in the next section.
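The two search orders described above differ only in where the LD_LIBRARY_PATH directories fall. The following is a minimal sketch of that ordering in C; the names build_search_path, struct search_path, and the fixed limits are hypothetical, chosen for illustration, and do not reflect how ld or ld.so are actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum phase { LD_TIME, RUN_TIME };

#define MAX_DIRS    32      /* illustrative limits only */
#define MAX_DIR_LEN 128

struct search_path {
    char   dirs[MAX_DIRS][MAX_DIR_LEN];
    size_t count;
};

/* Append one directory, skipping empty, oversized, or surplus entries. */
static void add_dir(struct search_path *sp, const char *dir, size_t len)
{
    if (len == 0 || len >= MAX_DIR_LEN || sp->count >= MAX_DIRS)
        return;
    memcpy(sp->dirs[sp->count], dir, len);
    sp->dirs[sp->count][len] = '\0';
    sp->count++;
}

/* Append each element of a colon-separated list (may be NULL). */
static void add_colon_list(struct search_path *sp, const char *list)
{
    const char *start = list;

    if (list == NULL)
        return;
    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        add_dir(sp, start, len);
        if (end == NULL)
            break;
        start = end + 1;
    }
}

static void add_array(struct search_path *sp, const char *const *dirs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        add_dir(sp, dirs[i], strlen(dirs[i]));
}

/* Build the ordered directory list for the given phase. */
void build_search_path(struct search_path *sp, enum phase when,
                       const char *const *L_dirs, size_t nL,
                       const char *ld_library_path)
{
    static const char *const defaults[] = {
        "/usr/lib", "/usr/5lib", "/usr/local/lib"
    };

    assert(sp != NULL);
    sp->count = 0;
    if (when == LD_TIME) {              /* -L, then LD_LIBRARY_PATH */
        add_array(sp, L_dirs, nL);
        add_colon_list(sp, ld_library_path);
    } else {                            /* LD_LIBRARY_PATH, then -L */
        add_colon_list(sp, ld_library_path);
        add_array(sp, L_dirs, nL);
    }
    add_array(sp, defaults, 3);         /* defaults always come last */
}
```

With one -L directory and two LD_LIBRARY_PATH entries, the -L directory is first in the list at ld-time and third at run-time; the default directories close the list in both cases.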
Binding Mode Options

-Bstatic and -Bdynamic

You can specify the binding mode by supplying one of the -B keyword options on the command line:

-Bdynamic
    Allow dynamic binding, do not resolve symbolic references, and allow creation of execution-time symbol and relocation information. This is the default setting. Note that ld records the name of the .so file with the highest version number in the executable.

-Bstatic
    Force static binding. This mode is also implied by options that generate non-sharable executable formats.

-Bdynamic and -Bstatic may both be specified a number of times to toggle the binding mode for specific libraries. Like -l, their influence is dependent upon their location in the command line. Libraries that appear after a -Bstatic are linked statically. Libraries that appear after a -Bdynamic are treated as shared (when a shared version is available).

NOTE Since -Bdynamic is the default setting, the use of shared libraries in the construction of a program thus “falls out” from installing the .so in ld’s library search path.

If -Bstatic is in effect, ld refuses to use the .so form of a library; it continues searching for an equivalent library with the .a suffix, and an explicit request to load a .so file is treated as an error.

The following example shows how -Bstatic and -Bdynamic can be used to select shared and static libraries. This cc command:

    cc -o test test.c -Bstatic -lsuntool -lsunwindow -Bdynamic -lsunwindow -lpixrect

generates the ld command:

    /bin/ld -dc -dp -e start -X -o test /usr/lib/crt0.o test.o -Bstatic -lsuntool \
        -Bdynamic -lsunwindow -lpixrect -lc

Since -Bstatic turns off the use of shared libraries, ld finds the static (.a) suntool library and uses it for link editing immediately. The subsequent -Bdynamic option tells ld to use shared versions of the sunwindow, pixrect and C libraries, if available.
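Because the -B options act as positional toggles, the binding mode of each library depends only on the most recent -Bstatic or -Bdynamic to its left. The following is a minimal sketch of that scan in C. The names scan_binding_modes and struct lib_request are hypothetical, and the sketch ignores every other ld option; it is not a description of ld’s own argument parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum binding { BIND_DYNAMIC, BIND_STATIC };

struct lib_request {
    const char  *name;      /* the text that follows -l */
    enum binding mode;      /* mode in effect when the -l was seen */
};

/*
 * Walk the arguments left to right, tracking the current binding mode,
 * and record the mode that applies to each -lname option.  Returns the
 * number of libraries recorded (at most max_out).
 */
size_t scan_binding_modes(int argc, const char *const argv[],
                          struct lib_request *out, size_t max_out)
{
    enum binding mode = BIND_DYNAMIC;   /* -Bdynamic is the default */
    size_t n = 0;
    int i;

    assert(out != NULL || max_out == 0);
    for (i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-Bstatic") == 0) {
            mode = BIND_STATIC;
        } else if (strcmp(argv[i], "-Bdynamic") == 0) {
            mode = BIND_DYNAMIC;
        } else if (strncmp(argv[i], "-l", 2) == 0 && argv[i][2] != '\0'
                   && n < max_out) {
            out[n].name = argv[i] + 2;
            out[n].mode = mode;
            n++;
        }
    }
    return n;
}
```

Applied to the library arguments of the ld command shown above, the scan marks suntool as static and sunwindow, pixrect, and c as dynamic.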
-N and -n Options for ld

The ld options -N and -n instruct ld to build a non-pageable executable. Their use implies a -Bstatic option.

Binding of PIC with Non-PIC

-dc and -dp Options

As noted in the above example, the cc command generates an ld command with the -dp and -dc options. These options are included to facilitate binding of non-PIC code (generated by default) with the PIC shared libraries that a program might use. The bindings of interest are to:

□ commons (externs): allocated after the program is completely assembled (-dc);

□ initialized data: imported from the shared libraries (-dc); and

□ entry points: supplied by the shared libraries (-dp).

Without special handling, references to these objects would require execution-time link editing, resulting in unsharable code. To improve the degree of sharing for such programs, -dc and -dp force the allocation of commons and the creation of aliases for library entry points, respectively. These allocations and aliases are created as part of the non-PIC executable, and result in programs that are considered to be “pure-text” non-PIC programs, even though they may require dynamic link editing.

NOTE While it is possible to invoke the ld command directly, it is generally better practice to rely on the compiler-driver (such as cc) to generate the appropriate ld command, so as to remain insulated from any future changes in the compilation environment. Compiler commands such as cc accept and pass on options to ld.

Use of Assertions

The -assert Option

To help detect any potential sharability or correctness problems, ld can validate certain assertions about an executable that it builds. This assertion checking is invoked by the “-assert keyword” option, where keyword is one of:

definitions
    If the resulting program were run now, there would be no run-time undefined symbol diagnostics.
    This assertion is set by default, and is sufficient for validating applications that make use of shared libraries.

pure-text
    The resulting executable requires no further relocations to its text. The code of a shared library should be validated using this assertion.

Run-Time Use of Shared Libraries

At run-time, ld.so finishes the job started by ld. That is, it performs the link-editing operations needed to resolve a program’s remaining references using shared-library code and data. ld.so’s first task is to find and map in the required libraries. It uses slightly different search rules than ld. ld.so looks first in the directories specified by the current value of LD_LIBRARY_PATH, and then in the directories in the search path recorded by ld (the default directories and those specified by -L). In addition, ld.so attempts to find the “best” version of a shared library, that is, the version with the highest minor number (as described under Version Control, below).

SunOS Shared Libraries

The shared libraries provided in SunOS are:

□ The C library (both BSD and System V variants)
□ Window libraries (suntool and sunwindow)
□ pixrect
□ kernel virtual memory access (kvm)
□ The optional FORTRAN library (purchased and installed separately).

Static (.a) versions of these libraries are also provided.

Dynamic vs. Static Binding Semantics

There are some semantic differences between dynamic and static binding. These are not expected to cause a problem with programs that avoid questionable practices with regard to library search order. However, there is a potential for problems when programs are built from some components that have become dynamically loadable, while others remain static. Given the case where:

    hermes% ld -o x ... dc sc

the executable x is composed of several objects, including a dynamic component, dc, and a static component, sc.
dc was, prior to the introduction of shared libraries, an unordered archive file, and both dc and sc contain definitions for the symbol getsym. Suppose that dc contains a reference to getsym. If, in dc’s archive version, the definition for getsym preceded its reference, ld might have resolved that reference using the definition from sc. But in dc’s current (dynamic) form, its own definition is used instead. This is a result of the fact that at run-time, ld.so searches for a symbol definition starting with the main program, and then all .so’s in load order. Even though it allows for an inconsistency of this sort, this behavior preserves the ability to interpose definitions on library entry points.

Debuggers

The SunOS debuggers have been modified to deal with the dynamic linking environment provided by the new ld. In particular, they understand that symbol definitions may appear after a program starts executing. However, debugger users must be aware that library symbols will not be resolved until main() has been called, as the next example shows.

    hermes% cc -g -o test test.c
    hermes% dbx test
    Reading symbolic information...
    Read 40 symbols
    (dbx) stop in printf
    no module, procedure or file named 'printf'
    (dbx) stop in main
    (1) stop in main
    (dbx) run
    Running: test
    stopped in main at line 4 in file "test.c"
        4       printf("%d\n", errno);
    (dbx) stop in printf
    (3) stop in printf
    (dbx) cont
    stopped in printf at 0xed76954
    0xed76954:      moveml  #, sp@
    Current function is main
        4       printf("%d\n", errno);

Users of debugging tools also need to be aware that core files have incomplete information on the state of shared code. Core files contain only the stack and data regions of a process image. The text, and more importantly, the static data regions of dynamically loaded objects, do not appear.
Thus, modifications made to initialized data are not reflected in the core file.

Performance Issues

Shared libraries represent a classic space vs. time trade-off. The work of incorporating the library code into an address space is deferred in order to save both primary and secondary storage. Therefore, one can expect to pay a slight CPU time penalty with programs that use shared libraries. This penalty can be attributed to the added cost of:

□ dynamically loading the libraries,
□ performing the link editing operations, and
□ the execution of the library PIC code.

However, these costs can be offset by the savings in I/O access time when library code is already mapped in by another program, since the (real) I/O time required to bring in a program and begin execution will be greatly reduced. As long as the CPU time required to merge the program and its libraries does not exceed the I/O time saved, the apparent performance of the program will be the same or better. However, if sharing does not occur, or if the system’s CPU is already saturated, such savings may not be achieved.

Dependencies on Other Files

A dynamically bound program consists not only of the executable file that is the output of ld, but also of the files referred to during execution. Moving a dynamically bound program may also involve moving a number of other files as well. Moving (or deleting) a file on which a dynamically bound program depends may prevent that program from functioning.

Setuid Programs

For those programs that execute with an effective UID (user ID) or GID (group ID) different than the real UID or GID, ld.so ignores libraries in directories other than /usr/lib, /usr/5lib and /usr/local/lib in the search path.

1.3. Version Control

A version numbering mechanism has been provided for shared libraries. This allows newer compatible versions of a library to be bound at run-time.
It also allows the link editors to distinguish between compatible and incompatible versions of a library.

Version Numbers of .so’s

The version number is composed of two parts, a major version number and a minor version number. This version-control suffix can be extended to an arbitrary string of numbers in Dewey-decimal format, although only the first two components are significant to the link editors at this time.

As noted earlier, ld records the version number of the shared library in the executable it builds. When ld.so searches for the library at run-time, it uses this number to decide which of the (possibly multiple) versions of a given library is “best,” or whether any of the available versions are acceptable. The rules it follows are:

□ Major Versions Identical: the major version used at execution time must exactly match the version found at ld-time. Failure to find an instance of the library with a matching major version will cause a diagnostic to be issued and the program’s execution terminated.

□ Highest Minor Version: in the presence of multiple instances of libraries that match the desired major version, ld.so will use the highest minor version it finds. However, if the highest minor version found at execution time is lower than the version noted at ld-time, a warning diagnostic is issued.

Major version numbers should be changed whenever there is an incompatible change to the library’s interface.

NOTE As always, the detection of incompatibilities between library versions remains the responsibility of the library’s developer.

Version Management Issues

Whenever there is an incompatible change to the library’s calling interface, the major number of that library should be changed.
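The two run-time rules listed under Version Numbers of .so’s can be expressed as a small selection routine. The following is a minimal sketch in C; the names pick_version, struct version, and enum match are hypothetical, and the sketch assumes the candidate version numbers have already been parsed from the library filenames. It illustrates the rules only and is not ld.so’s implementation.

```c
#include <assert.h>
#include <stddef.h>

struct version { int major; int minor; };

enum match {
    NO_MATCH,           /* no library with the recorded major version */
    MATCH_OK,           /* chosen minor is at least the recorded minor */
    MATCH_OLDER_MINOR   /* usable, but warrants a warning diagnostic */
};

/*
 * Choose the "best" available version for the version recorded at
 * ld-time: the major number must match exactly, and among matching
 * candidates the highest minor number wins.
 */
enum match pick_version(struct version wanted,
                        const struct version *avail, size_t n,
                        struct version *chosen)
{
    int found = 0;
    size_t i;

    assert(chosen != NULL);
    for (i = 0; i < n; i++) {
        if (avail[i].major != wanted.major)
            continue;                   /* incompatible interface */
        if (!found || avail[i].minor > chosen->minor) {
            *chosen = avail[i];
            found = 1;
        }
    }
    if (!found)
        return NO_MATCH;
    return chosen->minor < wanted.minor ? MATCH_OLDER_MINOR : MATCH_OK;
}
```

For example, with versions 1.1, 1.3, and 2.0 installed, a program built against 1.2 is bound to 1.3 without comment, a program built against 1.5 is bound to 1.3 with a warning, and a program built against 3.0 cannot be bound at all.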
A library’s interface is defined by:

□ the names and types of exported functions and their parameters; and

□ the names and types of exported data (initialized or not).

Incompatible changes would include the deletion of an exported procedure, deletion of exported data, changes to a procedure’s parameter list, and changes to data structures declared in a .h file normally included by both the library and the applications that use it. Changes to internal library procedures and data do not constitute an interface change.

Minor versions should be changed to reflect compatible updates to libraries. An example of a compatible update would be changing a procedure’s algorithm without changing its parameter list. Although adding a new library routine constitutes an interface change, it can be considered a compatible change.

Note that link-editors silently select the highest compatible version they can obtain. If the minor version used at ld-time is higher than the highest one found at run-time, then although the interfaces should remain compatible, it is possible that certain bug fixes or compatible enhancements on which the application depends might be missing: hence the warning message mentioned above.

1.4. Shared Library Mechanisms

There is no single mechanism in SunOS that implements shared libraries. Instead, the ability to construct a shared library comes as a consequence of enhancements to various existing facilities. The system components and their features that are instrumental in supporting shared libraries are:

□ Virtual memory support for file mapping and “copy-on-write” sharing
□ PIC generation by the compiler and assembler
□ Link editor support for dynamic linking and loading

Memory Sharing

Memory sharing is provided by the kernel’s virtual memory (VM) system. The mechanisms of interest for shared libraries are:

□ File mapping by way of mmap().
□ Sharing at the granularity of a file page.

□ A per-page copy-on-write facility that allows run-time modification of a shared file, without affecting other users of that same file.

The VM system uses these features internally, so that an exec() of a program is reduced to establishing a copy-on-write mapping of the file containing the program. A shared library is added to the address space in exactly the same way, using this general file-mapping mechanism.

The C Compiler

The C compiler’s -pic option generates position-independent code. When -pic is specified, references to objects that are external to the body of the code are made by way of linkage tables. These indirect references can degrade execution performance slightly, depending on the number of dynamic references to global objects. The code sequences generated often assume that the linkage tables are no larger than a limit that is convenient for the specific machine (64K bytes for an MC68000, or 8K for a SPARC, for instance). In the (presumably rare) event the tables require a larger size, the compiler can be coerced, with the -PIC option, into generating code sequences that permit larger linkage-table entries.

Shared library code should be generated as PIC using either -pic or -PIC as appropriate. The use of PIC in shared libraries results in code that does not require relocation in order to be used, and is thus inherently sharable by any program that uses it. The same copy of PIC code can be shared among multiple programs, even if that code is placed at different addresses in each program. Any dependence on actual addresses is isolated to the linkage tables, which are modified on a per-program basis to match the actual addresses selected.
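The way a linkage table isolates address dependence can be modeled in ordinary C. In the following sketch, which is an illustration only, the routine bump stands in for shared PIC code and struct linkage_table stands in for a per-program table; all of these names are hypothetical. Real PIC locates its table relative to the program counter, whereas this model simply passes the table as an argument.

```c
#include <assert.h>

/* Slot numbers are fixed when the "library" is built. */
enum { SLOT_COUNTER, SLOT_LIMIT, NSLOTS };

/* One table per program; each entry holds an actual address. */
struct linkage_table {
    int *slot[NSLOTS];
};

/*
 * Stand-in for a shared routine.  It names its data by slot number and
 * never by absolute address, so the same code serves every program.
 */
int bump(const struct linkage_table *lt)
{
    int *counter = lt->slot[SLOT_COUNTER];
    int *limit   = lt->slot[SLOT_LIMIT];

    assert(counter != 0 && limit != 0);
    if (*counter < *limit)
        ++*counter;
    return *counter;
}
```

Two programs can each fill in their own table with the addresses of their own counter and limit variables. The routine itself is never modified, which is the property that keeps the text sharable.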
The linkage tables are actually divided into two portions: a Global Offset Table (GOT) that provides indirections to data objects referenced by the PIC code, and a Procedure Linkage Table (PLT) that provides indirections to procedures referenced by the PIC code. The principal difference between the two types of indirections is that PLT entries are evaluated during dynamic linking, as each procedure is first called, whereas GOT entries are evaluated at the start of execution.

The Assembler

Code generated by the -pic option requires support from the assembler. This support is enabled by the -k assembler flag, which cc supplies automatically when invoking the assembler for a compilation performed with the -pic or the -PIC option. User-written assembly code for use in a shared object must also be PIC. Refer to the appropriate Sun-3 Assembly Language Reference for your Sun system for details.

crt0()

Every main program produced by the standard languages is linked with a program prologue module, crt0(). This module contains the program’s entry point, and performs various initializations of the environment prior to calling the program’s main() function. crt0() refers to the symbol __DYNAMIC. As described above, when ld builds an executable requiring execution-time link editing, it defines this symbol as the address of a data structure containing information needed for execution-time link editing operations. If the structure is not needed, any reference to the symbol __DYNAMIC is relocated to zero.

At program start-up, crt0() tests to see whether or not the program being executed requires further link editing. If not, crt0() simply proceeds with the execution of the program as it always has - no further processing is involved. However, if __DYNAMIC is defined, crt0() opens the file /usr/lib/ld.so and requests the system to map it into the program’s address space via the mmap() system call. It then calls ld.so, passing as an argument the address of its program’s __DYNAMIC structure.
crt0() assumes that ld.so’s entry point is the first location in its text. When the call to ld.so returns, the link editing operations required to begin the program’s execution have been completed.

Link Editors: ld and ld.so

After ld has processed all of its input files, it attempts to resolve each symbolic reference to a relative offset within the executable being built. ld is able to complete this symbolic reduction at ld-time only if:

□ all information relating to the program has been given and no .so will be added at execution time, or

□ the program has an entry point and symbolic reduction can be made for those symbols defined in the program.

After performing all the reductions it can, if there are no further symbols to resolve, the output is a fully linked (static) executable. However, if any unresolved symbols remain, then the executable will require further link editing at run-time. In this case, ld deposits the information (including version number) needed to obtain any needed .so files in the data space of the incomplete executable.

It should be noted that uninitialized “common” areas (essentially all uninitialized C globals) are allocated by the link editor after it has collected all references. In particular, this allocation cannot occur in a program that still requires the addition of information contained in a .so file, as the missing information may affect the allocation process. Initialized “commons,” however, are allocated in the executable in which their definition appears.

After ld has performed all the symbolic reductions it can, it attempts to transform all relative references to absolute addresses. ld is able to do this relative reduction only if it has been provided some absolute address.

ld.so

At run-time, after receiving control from crt0(), ld.so executes a short bootstrap routine that performs any relocations ld.so itself requires. It then processes the information contained in the __DYNAMIC structure of the program that called it. ld.so examines the list of required dynamic objects. Each element of the list contains an offset relative to the __DYNAMIC structure of an array of link_object structures and has information to identify a .so that must be incorporated. The identification is the name specified on the ld command line used to build the program, and includes a bit indicating whether the object was named explicitly or via a -l option. Some version control information is also recorded for each entry in the ld_need array. ld.so looks up the indicated file, and maps it into the process’s address space.

After all modules comprising the program have been placed in the address space, ld.so attempts to resolve the remaining symbols. After performing allocations for all uninitialized commons, ld.so attempts to resolve all unbound references that occur outside of procedure linkage tables.

Unresolved procedural references in the linkage tables are not processed during program startup. Instead, such references are initialized such that the initial call results in a transfer of control to ld.so. When called in this way, ld.so first resolves the reference to an absolute address, and then modifies the linkage table entry to use that address. Deferring the binding of procedural entry-points until the first call eliminates unnecessary bindings to entry points that the program may not use.
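The deferred binding of procedure linkage table entries can be modeled in C with a table of function pointers. This is a simplified sketch and not the actual mechanism: in it, an empty (null) table entry stands in for an entry that transfers control to ld.so, and the names call_through_plt, bind_entry, plt, and bind_count are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*proc_t)(int);

/* Stand-ins for routines supplied by a shared library. */
static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

static const struct { const char *name; proc_t addr; } library[] = {
    { "twice", twice }, { "negate", negate }
};

enum { PLT_TWICE, PLT_NEGATE, PLT_SIZE };
static const char *const plt_name[PLT_SIZE] = { "twice", "negate" };

static proc_t plt[PLT_SIZE];    /* null entry means "not bound yet" */
int bind_count;                 /* number of times the binder has run */

/* Model of the run-time linker: look the name up and patch the entry. */
static proc_t bind_entry(int slot)
{
    size_t i;

    for (i = 0; i < sizeof library / sizeof library[0]; i++) {
        if (strcmp(library[i].name, plt_name[slot]) == 0) {
            plt[slot] = library[i].addr;    /* later calls go direct */
            bind_count++;
            return plt[slot];
        }
    }
    return NULL;
}

/* Every call to a library procedure goes through its table entry. */
int call_through_plt(int slot, int arg)
{
    proc_t target;

    assert(slot >= 0 && slot < PLT_SIZE);
    target = plt[slot] ? plt[slot] : bind_entry(slot);
    assert(target != NULL);
    return target(arg);
}
```

In this model the binder runs once for each procedure that is actually called. A second call to the same procedure finds the patched entry, and an entry that is never called is never bound.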
Binding and Unbinding Routines: dlopen(), dlsym(), dlclose(), dlerror()

SunOS provides a programmatic interface to the run-time linker, which you can use to bind or unbind shared libraries during the course of program execution. dlopen() allows you to get access to a shared library, which it binds to the process’s address space (if it isn’t bound already). dlsym() returns the address of a given symbol within a (bound) shared library. dlclose() deletes a reference to a shared object. When the last reference is deleted, the shared object is removed from the process’s address space. dlerror() can be used to obtain information about the last error occurring as the result of dlopen(), dlsym(), or dlclose(). Refer to ld(3) for details.

1.5. Building a Shared Library

In the simplest of cases, the commands needed to build a shared library might be:

    hermes% cc -pic -c *.c
    hermes% ld -o libx.so.1.1 -assert pure-text *.o

But note that this assumes that the library exports no initialized data. And it makes no guarantee that the library text makes the most efficient possible use of space, or allows for a minimal amount of paging.

As noted earlier, a shared library should be structured to avoid undue modification in the course of dynamic linking and execution. Otherwise, it is possible that some or all of the shared text may be rendered unsharable when run. Although this lack of sharing would not affect the correct execution of library routines, it will impact system performance. If only a few programs use the library, this impact is small. But for a widely-used library, the impact on system performance could be significant. Thus, shared library objects should be PIC, they should be validated using the pure-text assertion, and those libraries that export initialized data should be accompanied by a data interface description (.sa) file.

Building the .so File

To build the .so portion of a shared library, simply invoke ld with the list of object files that will comprise it. The version number is not automatically generated by ld (which creates a file named a.out by default), but you can specify the full name of the library, including the version number, with ld’s -o option. It is strongly suggested that you use the -assert pure-text assertion to uncover any instances of non-PIC code.

The .sa File

The .sa file is used to support ld’s -dc option, which provides a space/time efficient implementation of the interface between non-position-independent code and dynamically linked objects. The .sa file is an ar-format file (archive library) that contains the exported initialized data used by a shared library. When present, the .sa file is statically linked at ld-time to ensure correct allocation.

A data item is exported from a library if a program that uses the library refers to the data item by name. The contents of the data item are included if they are specified by value in the declaration. For instance, with a definition of the form:

    char *strlist[] = { "string 1", "string 2" };

the data itself must be included in the .sa file, whereas with:

    struct *strlist[] = { ptr1, ptr2 };

definitions for the objects named ptr1 and ptr2 would not necessarily have to be included. Note that if ptr1 were itself defined as an initialized global in the library source, say:

    extern char *ptr1 = NULL

then this definition would also have to go into the .sa file.

Uninitialized data (exported or not) is handled automatically, and need not be included in the .sa file. If the library does not export any data, then a .sa would be unnecessary. The full name of a .sa also includes a version number that must match the version string of the .so it accompanies.

CAUTION If a shared object exports initialized data, it is very important that a .sa file be created that contains such data.
Failure to do so can degrade the performance of applications or, if the library is used heavily, the system as a whole. Further, in the event that such data is located within the text segment of the shared object, it is possible for ld to confuse the data with procedures defined by the library and to incorrectly link applications that reference such data. Initialized data can appear in the text segment of a shared object if it is part of a source file that is compiled with the -R (make initialized data read-only) option.

Building the .sa File

To build a .sa file:

1. Segregate the declarations of exported initialized data from the sources for each object, and place them in a separate source file. Make sure that an up-to-date object is compiled from each of those data-description sources, and include each of those data-description objects in both the static and shared versions of the library.

2. Create a separate (static) archive library composed of only the data-description objects, and give it a name of the form ‘libname.sa.version’. This archive constitutes the .sa file. Be sure that the .sa has the same version number as the .so it is to accompany.

3. Use ranlib(1) to incorporate a symbol table within the .sa archive.

As an example, consider the system’s C library. It contains a number of data structures that are initialized at program startup and which are exported to applications. Examples of these include the global variable errno, and the array of error messages sys_errlist. The C library source has been constructed such that the variable errno appears in its own source file (errno.c). This accomplishes step 1 of the procedure outlined above. The relevant portion of this source file consists of the line:

    int errno = 0;    /* global error return value, initially

This source file is compiled -pic, and the resulting object file, errno.o, is archived into the C library’s .sa file. Since everything placed in a .sa file must also appear in the .so file, errno.o is also linked into the C library’s .so file when it is built. Once all such files have been placed in the .sa file, it is processed with ranlib to add a symbol table.

1.6. Building a Better Library

Library code that maximizes sharing is considered “better” because it makes more efficient use of the system’s memory resources. Building the library components PIC is an important and easy first step, but there are other tuning strategies to consider as well.

Sizing Down the Data Segment

One way to maximize sharing is to minimize a .so’s data segment (containing initialized data), and its bss segment (containing uninitialized data). Often a .so’s data requirements are large because a significant portion of that data is functionally read-only. There are several problems with this mix of read-only and modifiable data:

□ data that could be shared is not,
□ an unnecessary amount of swap space is reserved, and
□ read-only data fragments the read-write storage, spreading it over more pages.

One approach is to move initialized read-only data into the text segment. This is done by compiling with the -R option. However, caution needs to be exercised, since initialized data structures that contain pointers require relocation at run-time. For instance, in declarations that initialize a structure with &x and test, the references to &x and test are instances of pointers embedded in an initialized structure. The actual addresses to which those pointers are resolved will not be determined until the program starts executing, and the shared object is placed in the address space. If this data structure is placed in the text segment of the shared object through the use of the -R option, then the relocation will cause that portion of the text segment to become unshared. Such data structures should not be contained in modules compiled with the -R option.
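Declarations of the kind described above might take the following shape. This is a hypothetical reconstruction for illustration: only the names x and test come from the text, while struct entry, table, and entry_is_consistent are assumptions. The initializer of table stores two addresses, and each of those addresses needs a relocation once the object is placed in an address space.

```c
#include <assert.h>

int x = 7;

int test(void)
{
    return x;
}

/* A structure whose initial value consists of addresses. */
struct entry {
    int *datap;             /* will hold the address of x */
    int (*funcp)(void);     /* will hold the address of test */
};

/*
 * Both initializers below are addresses.  Their final values are not
 * known until the containing object is placed in an address space, so
 * each one requires a relocation at that time.
 */
struct entry table = { &x, test };

/* Check that the relocated pointers refer to the intended objects. */
int entry_is_consistent(const struct entry *e)
{
    assert(e != 0);
    return e->datap == &x && e->funcp == test && e->funcp() == *e->datap;
}
```

If a module containing such a definition were compiled with -R, the structure would be placed in the text segment, and the relocation of its two pointers would force a private copy of the affected page in every process that uses the library.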
You can check whether such relocations are occurring within a shared object by specifying the '-assert pure-text' option when building the shared object.

Using xstr to Extract String Definitions

Another common example of initialized data containing pointers is an array of strings:

    char *errlist[] = {"err1", "err2"};

The xstr(1) utility can be used to make code containing initialized strings more sharable. It segregates the literal string data from its relocatable references, which allows the literal data to be merged safely into the text segment. However, files containing references to the string data should not be compiled with the -R option.

If there are several related pieces of data, another strategy is to coalesce the smaller items into a larger structure and allocate the space from the heap.

Better Ordering of Objects

The order of the objects in the executable can be important in minimizing memory requirements. Since objects are concatenated together, linking in the wrong order may result in an unnecessarily large memory requirement. Two approaches that encourage better utilization of memory resources are:

□ Routines that are frequently called should be packaged together, and isolated from startup or rarely-called code.
□ A set of routines that represent a common sequence should also be packaged together.

For example, given modules A, B, C, D, and E, where A and B fit on one VM page, C and D fit on another, and E fits on a partial page, if A always calls into E and never calls into B, the memory requirements may be reduced by a page if E follows A.

crt0.o Dependency

Sometimes a program will define its own crt0() initial routine. If it is intended that the program use shared libraries, then the programmer needs to provide a hook for the run-time linker. Further discussion of this can be found under link(5) in the SunOS Reference Manual.

The ldconfig Command

ldconfig(8) is a program used to construct a run-time linking cache for use by ld.so. The cache has a default list of directories (/usr/lib, /usr/5lib, /usr/lib/fsoft, /usr/lib/f68881, /usr/lib/ffpa, and /usr/lib/fswitch) and will accept as input a list of additional directories to augment this list. ldconfig records the pathname of the highest compatible version of each shared library in the specified search path. At run time, ld.so first queries the cache to determine which is the best version of a library in a particular directory. If the cache is unable to satisfy the request, ld.so enumerates the directory entries for the best version.

1.7. Shared Library Problems

ld.so Is Deleted

Since many system utilities are built to use shared libraries, and thus rely on dynamic link-editing, the potential exists for chaos if an important shared library (such as the C library) or /usr/lib/ld.so should be deleted. If the latter has been deleted, you will see the following message:

    crt0: no /usr/lib/ld.so

To deal with the chaos resulting from either the shared C library or ld.so being deleted, a number of commands and utilities have been statically linked. These include rcp(1), init(8), getty(8), sh(1), csh(1), mv(1), ln(1), tar(1), and restore(8). Since most system utilities may be rendered unusable by this condition, it may be necessary to boot the system single-user in order to restore either /usr/lib/ld.so or the C library. Refer to System and Network Administration for procedures to restore these files.

Wrong Library Is Used

ld.so will not detect a library that is newly installed in the cache unless the cache is rebuilt using ldconfig. Thus, a program that depends on the newly installed library may not be able to find it. You can use the ldd(1) command to identify the libraries on which a program depends.

Error Messages

    ld.so: libname.so.major not found

ld.so failed to find a library with the appropriate major version number.

    ld.so: open error for library
    ld.so: can't read struct exec for library
    ld.so: library is not for this machine type

Either the shared object has been corrupted, has incorrect access permissions, or was built to execute on another processor architecture.

    ld.so: call to undefined procedure symbol from address
    ld.so: Undefined symbol symbol

These messages generally indicate that the execution path attempts to refer to an undefined symbol. This is usually the result of a programming error.

    ld.so.cache corrupted

The file /etc/ld.so.cache has become damaged. To correct it, remove the existing file and reboot the system. The file will be rebuilt.

    ld.so: warning library has older version than expected

The version of the shared library that is currently being used has a minor version number that is lower than the version that was present at the time the application was compiled.

Chapter 2
Lightweight Processes

2.1. Introduction

This tutorial provides some examples of how to use the lightweight process library.

Definition

Although the term “lightweight processes” is often used, it is really a misnomer, since the fundamental property of lightweight processes is not that they are somehow “lighter” than ordinary processes, but that a lightweight process represents a thread of control not bound to an address space. If threads appear to operate more efficiently than ordinary SunOS processes, it is because threads communicate via shared memory instead of a filesystem. Because threads can share a common address space, the cost of creating tasks and of inter-task communication is substantially less than the cost of using more “heavyweight” primitives.
The availability of lightweight processes provides an abstraction well suited to writing programs, such as servers, that react to asynchronous events. In addition, lightweight processes are useful for simulation programs which model concurrent situations.

The idea is to provide a process abstraction: a thread is a data type representing a flow of control. A number of operations are available to manipulate threads, including ways to control their scheduling and communication. Lightweight processes exist independently of virtual memory, I/O, resource allocation, and other operating system-supported objects, but are able to work smoothly with these objects.

The lightweight process abstraction for managing asynchrony is superior to the UNIX system signal abstraction. Under the UNIX system, a signal causes a sort of context switch (to a new instruction and, optionally, to a new location on the stack) but the thread is the same: for example, you can longjmp() to the main program (the signal handler and main program can’t run in parallel). Critical sections are implemented by disabling interrupts. With lightweight processes, the only way to manage an asynchronous activity is via a thread. There are no asynchronous exceptions in a thread. Critical sections are implemented with monitors. There is no need to lock out interrupts, with the concomitant possibility of losing information while in the critical section.

Functionality

The Sun lightweight process library provides primitives for manipulating threads, as well as for controlling all events (interrupts and traps) on a processor. The present library is supported for user-level processes only. This means that the time slice given to a process by the operating system is shared by all the threads within that process. Further, LWP objects are not accessible outside of the containing process. Briefly, the primitives supported by the library include:

□ Thread creation, destruction, status gathering, scheduling manipulation, suspend and resume
□ Multiplexing the clock (any number of threads can sleep concurrently)
□ Individualized context switching (e.g., it is possible to specify that a given set of threads will touch floating point registers and only those threads will context switch these registers)
□ Monitors and condition variables to synchronize threads
□ Extended rendezvous (message send-receive-reply) between threads
□ An exception handling facility that provides both notify and escape exceptions
□ A way to map interrupts into extended rendezvous
□ A way to map traps into exceptions
□ Utilities to allocate red-zone-protected stacks, and to provide some stack integrity checking for environments that lack sophisticated memory management

Scheduling is, by default, priority-based and non-preemptive within a priority. However, sufficient primitives are available that it is possible to write your own scheduler. For example, to provide a round-robin time-sliced scheduler, a high-priority thread may periodically reshuffle the queue of time-sliced threads which are at a lower priority. Although pure coroutine scheduling is possible, it is not required, and purely preemptive scheduling may be used. Threads currently lack kernel support, so system calls still serialize thread activity, although the non-blocking I/O library (libnbio.a) mitigates this problem somewhat. When a set of threads are running, it is assumed that they all share memory.

Tutorial Goals

This tutorial provides some practical examples of how to program using lightweight processes. Also included is some discussion of the rationale for the lightweight process primitives. Syntax details of the lightweight process primitives are not supplied in this tutorial, though they can be found in the SunOS Reference Manual.

2.2. Threads

The lightweight process mechanism allows several threads of control to share the same address space. Each lightweight process is represented by a procedure which will be converted into a thread by the lwp_create() primitive. Once created, a thread is an independent entity, with its own stack as supplied by its creator. lwp_create() performs a number of actions: a thread context is allocated, the stack is initialized, and the thread is made eligible to run. A collection of threads runs within a single ordinary process. This collection is sometimes called a pod.

Lightweight processes (LWP’s or threads) are scheduled by priority. It is always the case that the highest priority non-blocked thread is executing. Threads may block on certain occurrences, such as the arrival of a message or the procurement of a monitor lock. Within a priority, threads execute on a first-come, first-served basis. Thus, if two threads are created at the same priority, they will execute in the order of creation.

Here is an example of how to do something simple with lightweight processes. The program below creates a thread which prints out the “hello world” message and then terminates (by “falling through” the procedure). main() becomes a lightweight process as soon as an LWP primitive (here, pod_setmaxpri()) is called. Note that main() is created with a priority of MAXPRIO so that it may set things up as it wishes before allowing other threads to run.

The command to compile this program (call it foo.c) is:

    example% cc -o foo foo.c -llwp

Let’s go through this program line by line. We begin by printing a message “main here” at line 1. Then, pod_setmaxpri() turns main() into a lightweight process (as it’s the first LWP primitive to be called). pod_setmaxpri() also specifies the maximum scheduling priority: in this case, 10. The range of scheduling priorities 1..10 is now available to the client.
If we didn’t use pod_setmaxpri(), the only available priority would be MINPRIO. Now, main() is a thread running at a priority of 10, the maximum priority. In other words, main() will execute until it explicitly blocks or otherwise yields control to another thread.

lwp_setstkcache() initializes a cache of stacks that can be used by subsequent lwp_newstk() calls. lwp_newstk() will return a stack of at least the size specified in the lwp_setstkcache() call (here, 1000 bytes), and this stack is red-zone protected. The second argument to lwp_setstkcache() specifies how big the cache should be initially (how many stacks it should contain). Larger numbers will require more memory, but will make cache faults less likely. On a fault, an additional cache of the same size will be allocated. A stack allocated from the stack cache will automatically be freed when the thread that uses it dies. Allocation from this cache is almost as efficient as using statically allocated stacks.

At line 4, we create a new thread. This thread will begin execution at task(), have a scheduling priority of 10, use the stack cache for a stack, and take no arguments initially. Even though it runs at the same priority as main(), task() will not run until main() relinquishes control, because threads at the same priority are scheduled first-come, first-served (FCFS). (It is not a good programming practice to rely on the ordering of threads within a priority, since this assumption may not hold on a multiprocessor or in the presence of external scheduling.) The identity of the new thread is returned in tid. This identity may be used in subsequent LWP primitives.

When the main() thread “falls through”, it terminates. At this point, task() will run, print its message, and terminate.
The LWP library will notice that no more threads remain, and the program will terminate.

Be careful not to confuse threads with ordinary heavyweight processes. For example, there are no inheritance rules about lightweight processes, and lightweight processes do not have their own set of descriptors.

Stack Issues

Stack Size

A major problem is to determine how big to make the thread stacks. Once this determination is made, you can decide how, or whether, you need protection against exceeding this limit. UNIX presents the same problem to the user, but it rarely causes trouble because the maximum stack length is very big. Allocating large stacks is not a big performance drain because pages are only allocated if actually used. Hence, you can allocate very large stacks fairly casually.

Protecting Against Stack Overflow

lwp_newstk() automatically allocates red-zone protected stacks (references beyond the stack limit will generate a SIGSEGV event). There are two ways to ensure stack integrity when not using lwp_newstk(). One way is to use the CHECK() macro at the beginning of each procedure (before any locals are assigned), in conjunction with the lwp_checkstkset() primitive. If the procedure exceeds the thread stack limit, the procedure will return and set a global variable. Another way is to use the lwp_stkcswset() primitive. This enables stack checking on context switching. Although this is transparent to the client programs, it may not detect errors until after the stack limit has been exceeded. Thus, with lwp_stkcswset(), an error is considered fatal. CHECK() detects errors before any damage is done, so error recovery is possible.

It is possible to assign a statically allocated stack to a thread. Thus, in the program above, we could declare a stack as follows, using the macros defined in stackdep.h to declare the stack portably.
MINSTACKSZ is added to include any stack room needed by the LWP library to execute the LWP primitives.

    #include <lwp/lwp.h>
    #include <lwp/stackdep.h>
    #define MINSTACKSZ 1024
    #define MAXPRIO 10

    stkalign_t stack[1000+MINSTACKSZ];

    main()
    {
        int task();
        thread_t tid;

        (void) pod_setmaxpri(MAXPRIO);
        lwp_create(&tid, task, MAXPRIO, 0, STKTOP(stack), 0);
    }

    task()
    {
        printf("task: hello world\n");
    }

Coroutines

It is possible to use threads as pure coroutines in which one thread explicitly yields control to another. lwp_yield() allows a thread to yield either to a specific thread at the same priority, or to the next thread in line at the same priority. Here is an example of three coroutines: main(), coroutine(), and other(). The result should be the numbers 1 through 7 printed in sequence. In the case where a generic yield is done (lwp_yield(THREADNULL)), the current thread goes to the end of its scheduling queue. When a specific yield is done, the specified thread butts in front of the current one at the front of the scheduling queue. Since we are just using coroutines, a single priority (MINPRIO) is sufficient and we do not increase the number of available priorities with pod_setmaxpri().
    #include <lwp/lwp.h>
    #include <lwp/stackdep.h>

    thread_t co1;    /* main's tid */
    thread_t co2;    /* coroutine's tid */
    thread_t co3;    /* other's tid */

    main(argc, argv)
        int argc;
        char **argv;
    {
        int coroutine(), other();

        lwp_self(&co1);
        lwp_setstkcache(1000, 3);
        lwp_create(&co2, coroutine, MINPRIO, 0, lwp_newstk(), 0);
        lwp_create(&co3, other, MINPRIO, 0, lwp_newstk(), 0);
        printf("1\n");
        lwp_yield(THREADNULL);    /* yield to coroutine */
        printf("4\n");
        lwp_yield(co3);           /* yield to other */
        printf("6\n");
        exit(0);
    }

    coroutine()
    {
        printf("2\n");
        if (lwp_yield(THREADNULL) < 0) {
            lwp_perror("bad yield");
            return;
        }
        printf("7\n");
    }

    other()
    {
        printf("3\n");
        lwp_yield(THREADNULL);
        printf("5\n");
    }

Custom Schedulers

There are three ways to provide scheduling control of threads to the client. One way is to do nothing and simply provide the client a pointer to a thread context which can be scheduled at will. This method suffers from the fact that most clients don’t want to be bothered with constructing their own scheduler from scratch. Another way is to provide a single scheduling policy, with very little client control over what runs next. The UNIX system provides such a policy. While this is the simplest way to go from the point of view of the client, it makes it difficult to implement policies that take into account the differing response time needs of client threads. We chose to take a middle ground in an effort to avoid these problems. There is a default scheduling policy, but enough primitives are provided that it is possible to construct a wide variety of scheduling policies based on it.

It is possible to custom-build your own scheduler by using the primitives lwp_suspend(), lwp_yield(), lwp_resume(), lwp_setpri(), and lwp_resched(). lwp_suspend() may also be used in debugging, to ensure that a thread is stopped before inspecting it.
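When reasoning about a custom scheduler, it can help to picture what the yield primitives do to the run queue at one priority. The following is only a toy model in plain C, not the library's implementation: it applies the two rules stated under Coroutines (a generic yield sends the current thread to the end of the queue; a specific yield moves the named thread in front of the current one). The type runq, the integer thread ids, and the functions yield_any() and yield_to() are hypothetical names introduced for this sketch.

```c
#include <stddef.h>

#define QMAX 8

/* Toy model of one priority level's run queue; slot 0 is the running thread. */
struct runq {
    int    tid[QMAX];   /* hypothetical integer thread ids */
    size_t len;         /* number of queued threads */
};

/* Generic yield: the running thread moves to the end of the queue. */
void yield_any(struct runq *q)
{
    size_t i;
    int cur;

    if (q->len < 2)
        return;
    cur = q->tid[0];
    for (i = 1; i < q->len; i++)
        q->tid[i - 1] = q->tid[i];
    q->tid[q->len - 1] = cur;
}

/*
 * Specific yield: the named thread moves to the front, ahead of the
 * running thread. Returns 0 on success, -1 if the thread is not queued.
 */
int yield_to(struct runq *q, int target)
{
    size_t i, pos = q->len;

    for (i = 0; i < q->len; i++) {
        if (q->tid[i] == target) {
            pos = i;
            break;
        }
    }
    if (pos == q->len)
        return -1;
    for (i = pos; i > 0; i--)
        q->tid[i] = q->tid[i - 1];
    q->tid[0] = target;
    return 0;
}
```

With a queue holding threads 1, 2, 3 (thread 1 running), yield_any() leaves the order 2, 3, 1, while yield_to() with target 3 on the original queue leaves 3, 1, 2.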
Here, we give an example of how to build a round-robin time-sliced scheduler. The idea is to have a high priority thread act as a scheduler, with the other threads at a lower priority. This scheduler thread simply sleeps for the desired quantum. When the quantum expires, the scheduler issues a lwp_resched() command for the priority of the scheduled threads. This causes a reshuffling of the run queue at that priority.

    #include <lwp/lwp.h>
    #include <lwp/stackdep.h>
    #define MAXPRIO 10

    main(argc, argv)
        int argc;
        char **argv;
    {
        int scheduler(), task(), i;

        (void) pod_setmaxpri(MAXPRIO);
        lwp_setstkcache(1000, 5);
        (void) lwp_create((thread_t *)0, scheduler, MAXPRIO, 0, lwp_newstk(), 0);
        for (i = 0; i < 3; i++)
            (void) lwp_create((thread_t *)0, task, MINPRIO, 0, lwp_newstk(), 1, i);
        exit(0);
    }

    scheduler()
    {
        struct timeval quantum;

        quantum.tv_sec = 0;
        quantum.tv_usec = 10000;
        for (;;) {
            lwp_sleep(&quantum);
            lwp_resched(MINPRIO);
        }
    }

    /* these tasks are scheduled round-robin, preemptive */
    task(arg)
        int arg;
    {
        for (;;)
            printf("task %d\n", arg);
    }

Special Context Switching

A thread can pretend to be the only activity executing on its machine even though many threads are running. The LWP library is the entity that provides this illusion. As such, the LWP library provides for context switches between threads which cause volatile machine resources to be multiplexed so that each thread operates with its own set of machine resources. In many cases, a context switch requires only that machine registers and the stack be multiplexed. In other cases, floating point state, memory management registers, and even software state may be multiplexed as well. The LWP library allows threads to have differing amounts of switchable state so that processes with different resource needs can coexist efficiently.

In addition to switchable state, a thread will possess state that is updated by other primitives.
This per-thread state includes such information as messages sent to a thread and monitor locks it holds. The only per-thread state maintained by the library is that used to support the LWP primitives, whereas heavyweight processes entail a considerable amount of per-process state. With threads, this amount of state is much smaller, with the intent that only those threads which need to should maintain additional state. Thus, operating-system-specific information such as signal state, accounting information, and file descriptors is not found in the thread context. It is up to the clients to provide as much “weight” as is required.

The reason that special contexts are not directly incorporated into the context of a thread is that not all threads will use these contexts, and there is no reason to make a thread pay for something it won’t use. The LWP library will allocate a new context buffer for each special context a thread is initialized with, and pass a pointer to this context to the save and restore routines defined for this context. The ids of the previous and new threads to use the context are also passed in, in case the save and restore routines maintain per-thread information about a special context. This information could be used, for example, by a memory-management special context to avoid doing work if the previous and current threads access the exact same memory management registers.

To use the special context mechanism, you first define a special context with the lwp_ctxset() primitive. This requires that you figure out how to save and restore the state required by your context and provide procedures to do this. In the example below, which context-switches the C-library global errno, the routines libc_save() and libc_restore() are provided, and the context they will save into and restore from is of type libc_ctxt_t.
The routine libcenable() is used to define the context, and the global LibcCtx remembers the cookie that defines the context.

Once a special context is defined, you may initialize any thread to use the resource multiplexed by the special context by using lwp_ctxinit(). The initialization of a given thread to use a special context can be done directly or, if the resource permits, by catching a trap when the resource is first used by a thread. In the example below, we expect that each thread accessing errno will be initialized via libcset() to use the special libc context. Threads protected with this special context can read errno without fear that another thread can change errno (e.g., via a system call) from underneath them. Because this errno multiplexing is quite useful, it is available in the routine lwp_libcset(), which does all of the work for you.

    #include <lwp/lwp.h>
    #define TRUE 1

    typedef struct libc_ctxt_t {
        int libc_errno;
    } libc_ctxt_t;

    static int LibcCtx;

    /* enable libc special contexts */
    libcenable()
    {
        extern void libc_save();
        extern void libc_restore();

        LibcCtx = lwp_ctxset(libc_save, libc_restore, sizeof (libc_ctxt_t), TRUE);
    }

    /* set a thread to have libc context */
    lwp_libcset(tid)
        thread_t tid;
    {
        (void) lwp_ctxinit(tid, LibcCtx);
    }

    /* routines for saving/restoring global library data */
    void
    libc_save(cntxt, old, new)
        caddr_t cntxt;
        thread_t old;
        thread_t new;
    {
        extern int errno;

    #ifdef lint
        old = old;
        new = new;
    #endif /* lint */
        ((libc_ctxt_t *)cntxt)->libc_errno = errno;
    }

    void
    libc_restore(cntxt, old, new)
        caddr_t cntxt;
        thread_t old;
        thread_t new;
    {
        extern int errno;

    #ifdef lint
        old = old;
        new = new;
    #endif /* lint */
        errno = ((libc_ctxt_t *)cntxt)->libc_errno;
    }

2.3. Messages

Messages vs. Monitors

There are two predominant types of process synchronization in use today: the rendezvous paradigm and the monitor paradigm.
The lightweight process package provides both, in part to avoid denying a large number of people their favorite primitives, and in part because each has compelling advantages.

Rendezvous has the advantages that it maps cleanly to Sun interprocess-communications facilities (Sun RPC), can potentially support communication across different address spaces, is higher-level than monitors because both data transmission and synchronization are combined into a single concept, and is a natural way to map asynchronous events into higher-level abstractions since messages are reliable and conditions are not.

The big advantages of monitors are their familiarity to UNIX system programmers (via similarity to sleep() and wakeup() in the kernel), and the efficiency win when protected data is accessed: with rendezvous, a context switch is always required; with monitors, a context switch is only necessary if the monitor lock is busy at the time of access.

Rendezvous Semantics

To use messages, one thread issues a msg_send() and another thread issues a msg_recv(). Whichever thread gets to the corresponding primitive first waits for the other, hence the term rendezvous. When the rendezvous takes place, the sender remains blocked until the receiver decides to issue a msg_reply(). Immediately after msg_reply() returns, both threads are unblocked.

It is the responsibility of the sender to provide the buffer space both for a message to be sent to the receiver, and for a reply message from the receiver. Either of these messages may be empty. While the sender is blocked, the receiver has access to the buffers provided by the sender. When the receiver replies, she is undertaking not to use these buffers any more: the transaction is complete. If memory management were used to share address spaces, the sender’s buffers would be mapped into the receiver’s address space only for the duration of the rendezvous. Because both send and receive buffers are provided by the sender, there is no need for further synchronization to tell the receiver that her reply was accepted by the sender.

Sometimes it is desired to perform a non-blocking send in which the sender does not block on a send request. We did not provide this as a primitive because it is easily implemented by using an additional thread to do the send.

Messages and Threads

Messages are sent to threads, and each thread has exactly one queue associated with it on which to receive messages. We could have provided message queues (ports) as objects not bound to processes. This would give more flexibility, but would require a more complex selection primitive to really justify the extra functionality. In addition, it would complicate the implementation, because we desire to terminate a rendezvous on behalf of the remaining thread should one of the rendezvousing threads be destroyed.

To receive a rendezvous request, a process specifies the identity of the sending thread it wishes to rendezvous with. Optionally, a receiver may specify that any sender will do. There is no other form of selection available. If more power is needed, the client can build server processes which act as intelligent ports capable of applying complex selection criteria. Note that the id of the sending thread or agent is supplied to the receiver by the LWP library, so that it is not possible to forge the identity of the sender.

Here is an example of basic message passing. main() creates two threads, sender() and receiver(). Because it has a higher priority, the receiver starts first and blocks, awaiting a rendezvous. Then, the sender runs and prepares a message. However, the sender sleeps for 2 seconds before sending it.
In this time, the receiver has given up waiting and tried again, now waiting with infinite patience. The sender wakes up a second later and attempts to rendezvous with the receiver. This rendezvous immediately succeeds: the receiver reads the message, prepares a reply, and replies. At this point, the rendezvous is complete and both sender and receiver are runnable processes. Because the receiver has a higher priority, the message “done receiving” is printed ahead of the “got reply” message. Note that the receiver should not touch any of the data mentioned in the send once the reply has been made.

    #include <lwp/lwp.h>
    #include <lwp/stackdep.h>
    #include <lwp/lwperror.h>
    #define MAXPRIO 10

    thread_t c1, c2;

    main(argc, argv)
        int argc;
        char **argv;
    {
        int sender(), receiver();

        (void) pod_setmaxpri(MAXPRIO);
        lwp_setstkcache(1000, 3);
        lwp_create(&c1, sender, MINPRIO, 0, lwp_newstk(), 0);
        lwp_create(&c2, receiver, MINPRIO+1, 0, lwp_newstk(), 0);
        exit(0);
    }

    sender()
    {
        char out[20];
        char in[30];
        int i;
        struct timeval wait;

        wait.tv_sec = 2;
        wait.tv_usec = 0;
        for (i = 0; i < 19; i++)
            out[i] = (int)'A' + i;
        out[19] = '\0';
        lwp_sleep(&wait);
        if (msg_send(c2, out, 20, in, 26) == -1) {
            lwp_perror("msg_send");
            return;
        }
        printf("got reply %s\n", in);
    }

    receiver()
    {
        int i;
        struct timeval wait;
        char *arg, *res;
        int asz, rsz;
        thread_t sender;

        wait.tv_sec = 1;
        wait.tv_usec = 0;      /* try one second */
        sender = THREADNULL;   /* take message from anyone */
        if (msg_recv(&sender, &arg, &asz, &res, &rsz, &wait) == -1) {
            if (lwp_geterr() != LE_TIMEOUT) {
                lwp_perror("msg_recv");
                return;
            }
            /* wait forever or until message arrives from sender */
            if (msg_recv(&sender, &arg, &asz, &res, &rsz, INFINITY) == -1) {
                lwp_perror("msg_recv");
                return;
            }
        }
        printf("got message %s\n", arg);
        for (i = 0; i < rsz - 1; i++)
            res[i] = (int)'B' + i;
        res[rsz - 1] = '\0';
        msg_reply(sender);
        printf("done receiving\n");
    }

Intelligent Servers

Because the reply can be done at any time, a receiver can receive a number of messages before replying to them. This makes it possible to implement complex servers. In the following example, processes send requests in a random order to a server thread. The server serializes the requests and processes them in the order associated with the request.

    #include <lwp/lwp.h>
    #include <lwp/stackdep.h>

    thread_t pt;

    typedef struct port_msg {
        int order;
        char *msg;
    } port_msg;

    #define MAXPRIO 10

    main(argc, argv)
        int argc;
        char **argv;
    {
        int process();
        int port();

        (void) pod_setmaxpri(MAXPRIO);
        lwp_setstkcache(1000, 3);

        /* argument to new thread is order # */
        lwp_create((thread_t *)0, process, MINPRIO, 0, lwp_newstk(), 1, 3);
        lwp_create((thread_t *)0, process, MINPRIO, 0,

2.4. Agents

Some environments will present asynchronous interrupts to the client. For example, on a bare machine, a character typed at a tty can cause an interrupt to randomly steal control away from the executing program. Similarly, a signal can interrupt the current thread. Because of the random nature of interrupts, it is hard to understand programs that deal with them. The lightweight process library provides a simple way to transform asynchronous events into synchronous ones.

A message paradigm (as opposed to a monitor paradigm) was chosen for mapping interrupts because an interrupt cannot wait for a monitor lock if it is held by a client. Even if condition variables are used outside of a monitor, it is still necessary to add memory to the condition variable to prevent races (just before the client decides to sleep, an interrupt comes in, causing a condition to be notified, which is missed by the client, who then sleeps, resulting in deadlock).
Adding a flag to a condition to prevent this is analogous to converting the condition into a 1-bit message.

With asynchronous interrupts, an event causes a sort of context switch within the same thread. With LWPs, a thread must synchronously rendezvous with an interrupt. Thus, to have an event do something asynchronously, it is necessary to use a separate thread to handle it. To simulate typical UNIX signal handling, you would create two threads: one thread to represent the main program, and another thread at a higher priority to represent the signal handler. The latter thread would have an agent set up to receive signals.

The agent mechanism is provided to map asynchronous events into messages to a lightweight process. A message from an agent looks exactly like a message from another thread. When you create an agent, you also provide a portion of the pod's address space for the agent to store its message. You cannot receive the next message from an agent until you reply to the current one. Because the LWP scheduler is preemptive, when a signal is mapped into a message, it will cause the highest priority thread blocked on the agent to run next. Client threads which have agents can use all of the LWP library facilities (monitors, condition variables, messages) to synchronize with other threads.

The agent mechanism does its best to process signals as rapidly as possible. Nonetheless, it is possible that events will be missed because the kernel does not remember more than one signal occurring while a signal is being processed. Furthermore, signals are not delivered for each occurrence of I/O. Therefore, a thread which wakes up from a SIGIO agent, for example, should not sleep again until read() on the descriptor fails, indicating that another SIGIO will be delivered when more I/O is available.

When an interrupt arrives, the LWP library saves only volatile information about the interrupt, and wakes up any threads waiting on the agent. On a bare machine, volatile information would include, for example, the character typed in from a tty. Under SunOS, volatile information includes the state normally delivered to a signal handler as well as the identity of the thread running at the time of the event. This volatile information is passed as a message to the client thread.

System Calls

A set of heavyweight processes can execute concurrently in the kernel. For example, three heavyweight processes can concurrently initiate writes to the same device. This is not the case for lightweight threads. Some relief can be provided by marking descriptors asynchronous with fcntl(2). This allows threads to block on SIGIO agents and only block on a system call when it is likely to be immediately productive (i.e., without blocking indefinitely). Similarly, a thread can block on a SIGCHLD agent instead of blocking on a wait(2) system call. However, there is no general solution to the problem of having several threads execute system calls concurrently until the LWP primitives are made available as true system calls operating on a shared set of descriptors.

Non-blocking I/O Library

The use of the non-blocking I/O library can help by automatically blocking a thread attempting any I/O until such I/O is likely to succeed immediately. The blocked thread will try the system call again automatically when a SIGIO event occurs.

Here is an example of how to use the non-blocking I/O library. We have a procedure compute_pi that runs at low priority, and a procedure reader that runs at high priority. If we link this program without the non-blocking I/O library, the reader will prevent the compute-bound thread from running since the read() system call blocks. However, if we link in the non-blocking I/O library, the compute-bound procedure will execute until some I/O is made available (in this case, by the user typing something at the terminal).
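The idea behind the library, attempting I/O only when it will not block, can be sketched with plain POSIX calls. This is an illustration under stated assumptions, not the LWP library's implementation: set_nonblocking() and try_read() are hypothetical helper names, and O_NONBLOCK stands in for the FNDELAY flag used elsewhere in this chapter.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Mark fd non-blocking; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Try one read.  Returns the byte count (0 at end of file) or -1 on
 * failure.  *would_block is set to 1 when the failure only means
 * "no data yet" (EAGAIN/EWOULDBLOCK), the case in which a thread
 * should go back to waiting for notification instead of giving up.
 */
ssize_t try_read(int fd, char *buf, size_t len, int *would_block)
{
    ssize_t n;

    assert(buf != NULL && would_block != NULL);
    *would_block = 0;
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);    /* retry interrupted calls */
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        *would_block = 1;
    return n;
}
```

A caller that sees *would_block set would return to waiting for notification (in LWP terms, block on the SIGIO agent) and let lower-priority threads run in the meantime, rather than spinning on read().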
Using the Non-Blocking I/O Library

Here is another example of how to use the non-blocking I/O library. The first program is a server which accepts requests over the wire. When a request arrives, a thread is created to handle the request so that accepting and processing the requests can proceed in parallel. Processing a request consists of sleeping for the amount of time specified in the request message. Note that if the non-blocking I/O library is not linked in, the main program loop prevents any (lower priority) request-processing threads from executing. lwp_datastk() is used to put the message on the stack of the newly-created thread. Thus, there is no need to keep the message in main.

    /*
     * sleep server program.
     */
    #include
    #include
    #include
    #include
    #include
    #include
    #define MYPORT 8889
    #define MAXPRIO 10
    #define BUFSIZE 10

    struct message {
        int timeout;
        int msgsize;
        char buf[BUFSIZE];
    } message;
    extern int errno;

    main()
    {
        int s;
        struct sockaddr_in addr;
        int len = sizeof (struct sockaddr_in);
        int fromlen;
        int rlen;
        void compute();
        stkalign_t sp;
        caddr_t loc;

        if (pod_setmaxpri(MAXPRIO) < 0) {
            lwp_perror("pod_setmaxpri");
            _exit(1);
        }
        if (lwp_setstkcache(5000, 5) < 0) {
            lwp_perror("lwp_setstkcache");
            _exit(1);
        }
        if ((s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) < 0) {
            perror("can't get socket");
            _exit(1);
        }
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_family = AF_INET;
        addr.sin_port = MYPORT;
        if (bind(s, (struct sockaddr *)&addr, len) < 0) {
            perror("bind");
            close(s);
            _exit(1);
        }
        if (getsockname(s, (caddr_t)&addr, &len) != 0) {
            perror("can't get name");
            close(s);
            _exit(1);
        }
        for (;;) {
            do {
                fromlen = len;
                rlen = recvfrom(s, (caddr_t)&message,
                    sizeof (struct message), 0, &addr, &fromlen);
            } while ((rlen == -1) && (errno == EINTR));
            if (rlen == -1) {
                perror("recvfrom");
                _exit(1);
            }
            sp = lwp_datastk(message.buf, message.msgsize, &loc);
            lwp_create((thread_t *)0, compute, MINPRIO, 0, sp, 2,
                message.timeout, loc);
        }
        exit(0);
    }

    void
    compute(timeout, msg)
        int timeout;
        char *msg;
    {
        struct timeval time;

        time.tv_sec = timeout;
        time.tv_usec = 0;
        printf("%s\n", msg);
        lwp_sleep(&time);
        printf("%s slept %d secs\n", msg, timeout);
    }

    /*
     * program to send a message to the sleep-server.
     * usage: sip
     */
    #include
    #include
    #include
    #include
    #include
    #define MYPORT 8889
    #define BUFSIZE 10

    struct message {
        int timeout;
        int msgsize;
        char buf[BUFSIZE];
    } message;
    extern int errno;

    main(argc, argv)
        int argc;
        char **argv;
    {
        int s;
        struct sockaddr_in addr;
        int len = sizeof (struct sockaddr_in);
        int err;
        struct hostent *hp;
        char *server;

        if (argc != 4) {
            printf("usage: %s server seconds message\n", argv[0]);
            exit(2);
        }
        server = argv[1];
        message.timeout = atoi(argv[2]);
        message.msgsize = strlen(argv[3]) + 1;
        bcopy(argv[3], message.buf, message.msgsize);
        if ((hp = gethostbyname(server)) == 0) {
            printf("can't get host name\n");
            exit(1);
        }
        bcopy(hp->h_addr, &addr.sin_addr, hp->h_length);
        addr.sin_family = AF_INET;
        addr.sin_port = MYPORT;
        if ((s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) < 0) {
            perror("can't get socket");
            exit(1);
        }
        do {
            err = sendto(s, (caddr_t)&message, sizeof (message), 0,
                &addr, len);
        } while ((err == -1) && (errno == EINTR));
        if (err == -1) {
            perror("sendto");
            exit(1);
        }
    }

A final example of the non-blocking I/O library illustrates how the wait(2) system call can be used. Here, the parent process forks two children. The children do something (in this case, they just sleep) and terminate with an exit status. The parent would like to reap the children, but does not want to block in the process. The solution is to link in the non-blocking I/O library, which lets the parent block without stopping other threads. Behind the scenes, a SIGCHLD agent thread is watching for terminating processes. If the non-blocking I/O library is not linked in, the wait will succeed, but the otherwork thread will not get a chance to run. Note that threads using system calls remapped by the non-blocking I/O library automatically receive the C-library special context, so errno is not lost across context switches.

    #include
    #include
    #include

    main()
    {
        int child;
        union wait stat;
        void otherwork();

        (void) pod_setmaxpri(10);
        (void) lwp_setstkcache(1000, 2);
        (void) lwp_create((thread_t *)0, otherwork, MINPRIO, 0,
            lwp_newstk(), 0);
        if (fork() == 0) {
            sleep(5);
            _exit(7);
        } else if (fork() == 0) {
            sleep(3);
            _exit(5);
        }
        for (;;) {      /* reap children */
            child = wait(&stat);
            printf("%d got %d\n", child, stat.w_retcode);
            if (child == -1) {
                perror("wait");
                break;
            }
        }
        exit(0);
    }

    void
    otherwork()
    {
        struct timeval time;

        time.tv_sec = 2;

Examples of Agents

We present two examples of agent use below. The first example shows how a traditional UNIX signal handler can be emulated.
Note the use of monitors to protect access to shared state. The second example shows the use of a SIGIO agent.

    /* Example of the UNIX system style of signal handling */
    #include
    #include
    #include
    #define MAXPRIO 10

    mon_t mid;
    int shared_state;

    main(argc, argv)
        int argc;
        char **argv;
    {
        int sigint_catch();
        int task();
        int task1();

        (void) pod_setmaxpri(MAXPRIO);
        lwp_setstkcache(3000, 3);
        mon_create(&mid);
        (void) lwp_create((thread_t *)0, sigint_catch, MAXPRIO, 0,
            lwp_newstk(), 0);
        /*
         * the signal handler will preempt the main program
         * so we give it the higher priority
         */
        lwp_setpri(SELF, MINPRIO);
        for (;;) {
            /* do other work */;
            mon_enter(mid);
            /* access shared_state */
            mon_exit(mid);
        }
        exit(0);
    }

    sigint_catch()
    {
        eventinfo_t sigmem;
        char *arg;
        int asz;
        thread_t sender;

        agt_create(&sender, SIGINT, (char *)&sigmem);
        for (;;) {
            (void) msg_recv(&sender, &arg, &asz, 0, 0, INFINITY);
            (void) msg_reply(sender);
            printf("got ^C\n");
            mon_enter(mid);
            /* access shared_state */
            mon_exit(mid);
        }
    }

    /* Example showing how to process SIGIO */
    /*
     * Some points about this code:
     * 1. because the system call could be interrupted, we
     *    check for EINTR. In order that errno is accurate, we
     *    make sigio_catch a libc thread (else, it may be lost
     *    on a context switch).
     *
     * 2. We reset stdin before returning so the shell won't
     *    get confused. (It would otherwise get EWOULDBLOCK
     *    trying to read stdin, and bomb out with an error).
     */
    #include
    #include
    #include
    #include
    #include
    #define TRUE 1
    #define MAXPRIO 10

    main(argc, argv)
        int argc;
        char **argv;
    {
        int sigio_catch();
        thread_t tid;

        (void) pod_setmaxpri(MAXPRIO);
        lwp_setstkcache(3000, 3);
        lwp_create(&tid, sigio_catch, MAXPRIO, 0, lwp_newstk(), 0);
        lwp_libcset(tid);
        lwp_setpri(SELF, MINPRIO);
        /* do main's work */
    }

    sigio_catch()
    {
        int cnt;
        char buf[256];
        int fd = 0;     /* stdin */
        extern int errno;
        int emask, rmask, wmask;
        eventinfo_t agtmemory;
        thread_t sender;
        char *arg;
        int asz;
        int inputbits = 01 << fd;

        /*
         * Enable SIGIO on stdin. When we actually read, it
         * may still return EWOULDBLOCK (SIGINT before SIGIO
         * delivered flushes input leaving nothing to read),
         * so need to read again.
         */
        fcntl(fd, F_SETFL, FASYNC | FNDELAY);
        rmask = inputbits;
        emask = wmask = 0;
        agt_create(&sender, SIGIO, &agtmemory);
        for (;;) {
            /*
             * block pending notification that reading would
             * be useful; meanwhile, main can get work done.
             */
            (void) msg_recv(&sender, &arg, &asz, 0, 0, INFINITY);
            (void) msg_reply(sender);
            select(32, &rmask, &wmask, &emask, (struct timeval *)0);
            if (rmask & inputbits) {
                cnt = read(fd, buf, 256);
                /* stop on success or a real error; else wait again */
                if (cnt != -1 || (errno != EWOULDBLOCK && errno != EINTR))
                    break;
            }
        }
        buf[cnt] = 0;
        printf("\ngot %s\n", buf);
        fcntl(fd, F_SETFL, 0);  /* reset stdin so no shell confusion */
    }

    /*
     * To do simple signal handling within main,
     * we could just write:
     */
    main(argc, argv)
        int argc;
        char **argv;
    {
        int cnt;
        char buf[256];
        int fd = 0;     /* stdin */
        extern int errno;
        int emask, rmask, wmask;
        eventinfo_t agtmemory;
        thread_t sender;
        char *arg;
        int asz;
        int inputbits = 01 << fd;

        (void) pod_setmaxpri(1);
        fcntl(fd, F_SETFL, FASYNC | FNDELAY);
        rmask = inputbits;
        emask = wmask = 0;
        agt_create(&sender, SIGIO, &agtmemory);
        for (;;) {
            (void) msg_recv(&sender, &arg, &asz, 0, 0, INFINITY);
            (void) msg_reply(sender);
            select(32, &rmask, &wmask, &emask, (struct timeval *)0);
            if (rmask & inputbits) {
                cnt = read(fd, buf, 256);
                /* stop on success or a real error; else wait again */
                if (cnt != -1 || (errno != EWOULDBLOCK && errno != EINTR))
                    break;
            }
        }
        buf[cnt] = 0;
        printf("\ngot %s\n", buf);
        fcntl(fd, F_SETFL, 0);
        exit(0);
    }

2.5. Monitors and Conditions

The monitor-condition variable paradigm is a familiar one to kernel programmers because of the analogue to sleep() and wakeup() in the UNIX system kernel. A monitor implements a critical section. This is a reentrant region of code in which access is serialized. As a result, shared data accessed by this code is protected against races that can lead to incorrect interpretations of the data. Once a thread is executing within a monitor, other threads block until that monitor is exited. When thread priorities are equal, they are queued first-come-first-served for access to the monitor. This ensures fair, serial access to the protected data.

As an example, a producer and consumer thread may use a monitor to protect access to a buffer of data being produced or consumed (so that the state of the buffer's “fullness” is consistent). When the producer has filled the buffer, it must wait for the consumer to drain the buffer. This sort of synchronization is provided by condition variables. When a thread waits on a condition, it atomically gives up the monitor and blocks pending a notification. The result of the notification is that the blocked thread will eventually reacquire the monitor in order to attempt access to the buffer again.

Monitors vs. Interrupt Masking

One goal of lightweight processes is to avoid the use of sigsetmask() or other primitives which lock out interrupts to prevent races. By using monitors as a synchronization tool, and by using threads with agents to handle interrupts, the use of interrupt masking can be eliminated, and the risk of dropping interrupts reduced.

Within the LWP library itself, most critical sections are implemented by disabling the scheduler (and not by disabling interrupts) for the duration of the critical section. If an interrupt arrives during a critical section, it is processed only to the point of saving the volatile interrupt state. At the end of a critical section, if there are any accumulated events, scheduling decisions are made based upon the agents associated with the events. Interrupts are only masked to ensure that a) the nugget stack is not grown indefinitely by repeated interrupts and b) the new context is loaded atomically as a thread is being resumed. Thus, interrupts are only disabled as a consequence of an interrupt occurring, and never preventively.

Programming with Monitors

Typically, there is some state associated with a condition. When the state acquires a given value, a thread can take some action. Otherwise, it will wait until the state changes. For example, if a buffer is full, a thread writing to the buffer will wait until the state of the buffer indicates that it is no longer full. Another thread reading from the buffer will cooperate by notifying any such waiting thread when the buffer is no longer full. Because the buffer state is accessed by several threads, it is protected by a monitor. Otherwise, a thread could decide to wait for a state change, only to have the state change before the wait can be executed, resulting in deadlock. Therefore, both the waiter and the notifier must access the state in a monitor, and the wait primitive (cv_wait()) must atomically release the monitor. The typical wait code looks like this:

    mon_enter(m);
    ...
    while (!state)
        cv_wait(cv);
    ...
    mon_exit(m);

The while loop is there because if there are several threads waiting in the monitor when the condition is broadcast, all of them wake up, but the first thread to gain entry to the monitor may alter the state, invalidating it for the other awakened threads. In our current example, if two producers are awakened because the buffer is no longer full, the first one may fill the buffer again and wait, leaving the second one to run. The second producer must not add to the buffer now, because it is full again.

Some subtle points about thread scheduling priority should be mentioned. Note that threads queue for monitors and conditions based upon thread priority. No context switch necessarily takes place when a monitor is exited. Thus, a monitor that is repeatedly reentered by a high-priority thread can starve other threads wanting access to the monitor. Care should be taken in assigning priorities to threads using monitors, since a low-priority thread which owns a monitor can still prevent a higher priority thread from accessing that monitor. If a low-priority thread owning a monitor is preempted, it may cause long delays to more important threads needing monitor access.

Monitors and Events

Since events are processed by threads, state manipulated by a thread receiving agent messages can be protected by monitors and condition variables. Thus, after receiving an agent message, a thread may enter a monitor before accessing some global state. Since the LWP library has a large memory for events, no events should be lost if this thread has to block for access to the monitor.

Condition Variables

cv_broadcast() awakens all threads blocked on a condition. cv_notify() awakens only a single thread blocked on a condition. cv_notify() can result in deadlock states if the awakened thread is not the particular one that should notice a state change, and should only be used when it is known that a single other thread is involved. cv_notify() is available because it is more efficient to awaken only a single thread. Note that an awakened thread will be queued to reacquire the monitor. When the thread actually resumes, it will own the monitor it released when it waited for the condition with cv_wait().

Because it is both confusing to the programmer and expensive to implement, no provision for a condition to be shared by several monitors is made. Instead, condition variables are bound to a monitor when they are created. It would be possible to let them be bound when the condition is waited upon, but it would allow the very improbable case of having a waiter awaken in a state testing loop, only to find that its condition was reassigned.

mon_destroy() will remove any conditions bound to the monitor being removed. If mon_destroy() fails because some threads are still waiting on an associated condition, you can use cv_waiters() to see which threads are blocked on conditions associated with the monitor, followed by lwp_destroy() to terminate the blocked threads. After the offending threads are terminated, mon_destroy() should succeed.

Enforcing the Monitor Discipline

Because a thread which forgets to exit a monitor may deadlock the system, it is convenient to use the exception handler mechanism to enforce the enter-exit discipline. The MONITOR() macro enforces this discipline by ensuring that mon_exit() is called when the procedure that embodies the monitor exits. (It is good form to use a single procedure to contain a monitor, viz:)

    foo()
    {
        MONITOR(m);
        ...
    }

This method ensures that no matter how the procedure is exited (barring longjmp()), the monitor will be exited. That is, if the procedure raises an exception or returns explicitly or implicitly, the monitor is freed.

Nested Monitors

When a thread blocks on a condition while holding several (nested) monitor locks, all of the locks except the current one are held. This ensures that the thread does not need to painfully reacquire all of its locks, with the concomitant possibility of deadlock if not all of the locks remain available. If thread T1 holds monitor M1 and wants to acquire monitor M2, and thread T2 holds monitor M2 and wants to acquire monitor M1, deadlock results. One way to avoid this error is to require that the monitors are always acquired in a certain order.

Reentrant Monitors

When a monitor is used to protect a data structure, it may happen, for information hiding reasons, that two different procedures wish to use the same monitor. It may also happen that one of those procedures wishes to use the facilities provided by the other. If these procedures are accessed by the same thread, the monitor calls are reentrant.
If you anticipate such use, you should program your monitors as

    if (mon_enter(m) < 0) {
        error("bad monitor");
    }

However, if you wish to catch reentrant monitor use as an error, you should program monitors as:

    if (mon_enter(m) != 0) {
        error("reentrant monitor");
    }

Monitor Program Examples

The following is a simple example of monitor use. As described above, we have a producer and a consumer thread, synchronizing with condition variables. To spice it up a bit, we've added some scheduling to make things more realistic.

    #include
    #include

    thread_t c1, c2, sched;
    mon_t m1;
    cv_t notempty, notfull;
    int cnt = 0;
    int in = 0;
    int out = 0;
    #define MAXBUF 20
    char buf[MAXBUF];
    #define MAXPRIO 10

    main(argc, argv)
        int argc;
        char **argv;
    {
        int producer(), consumer();
        int sch();

        (void) pod_setmaxpri(MAXPRIO);
        lwp_setstkcache(3000, 3);
        lwp_create(&c1, producer, MINPRIO+1, 0, lwp_newstk(), 0);
        lwp_create(&c2, consumer, MINPRIO, 0, lwp_newstk(), 0);
        lwp_create(&sched, sch, MAXPRIO, 0, lwp_newstk(), 0);
        mon_create(&m1);
        cv_create(&notempty, m1);
        cv_create(&notfull, m1);
        exit(0);
    }

    put(c)      /* add a character to the buffer */
        char c;
    {
        MONITOR(m1);
        while (cnt == MAXBUF) {     /* buffer never > MAXBUF */
            printf("waiting on notfull\n");
            cv_wait(notfull);
        }
        buf[in] = c;
        in = (in + 1) % MAXBUF;
        cnt++;
        cv_broadcast(notempty);     /* may be a no-op */
    }

    get(c)
        char *c;
    {
        MONITOR(m1);
        while (cnt == 0) {          /* buffer never < 0 chars */
            printf("waiting on notempty\n");
            cv_wait(notempty);
        }
        *c = buf[out];
        out = (out + 1) % MAXBUF;
        cnt--;
        cv_broadcast(notfull);
    }

    producer()
    {
        char c;
        int i;
        int j;

        for (j = 0; j < 500; j++) {
            c = "abcdefghijklmnopqrstuvwxyz"[cnt];  /* produce */
            put(c);
        }
        printf("producer done\n");
    }

    consumer()
    {
        char c;
        int i;
        int j;

        for (j = 0; j < 500; j++) {
            get(&c);
            /* consume the character */
        }
        printf("consumer done\n");
    }

    sch()
    {
        int k;
        thread_t x;
        struct timeval wait;

        x = c1;
        wait.tv_sec = 0;
        wait.tv_usec = 100000;
        for (k = 0; k < 100; k++) {
            lwp_sleep(&wait);
            lwp_setpri(x, MINPRIO);
            if (x.thread_id == c1.thread_id)
                x = c2;
            else
                x = c1;
            lwp_setpri(x, MINPRIO+1);
        }
    }

2.6. Exceptions

The exception primitives can be used to manage synchronous exceptional conditions in a lightweight process. There are no asynchronous exceptions supported by threads because asynchrony can be managed completely with threads and agents, and in a more well-structured fashion. For example, when parsing commands and anticipating an interrupt from the keyboard, you can simply create a thread to parse the command and a thread with an agent to catch the interrupt. When the agent thread catches the interrupt it can simply destroy the parsing thread. This is more elegant than doing a longjmp() from a signal handler when an interrupt occurs.

There are several aspects of exceptions. First, you can use exit handlers to be invoked automatically any time a procedure exits. Second, you can provide an exception handler which assumes control anywhere back on the procedure calling chain (escape exceptions). Third, you can provide an exception handler which is invoked at the time of an exception and leaves the flow of control alone when it returns (notification exceptions). Finally, you can map machine faults (synchronous traps) into exceptions. An exception is an event caused by the explicit (or implicit, in the case of synchronous traps) invocation of exc_raise().

When a procedure can exit via a large number of return statements or exception raises, it is difficult to monitor the flow of control. Thus, exit handlers can be established by exc_on_exit() to ensure that a particular action is taken on procedure exit, no matter how the procedure exits. For this reason, no primitive to remove an exit handler is provided, because this would provide a way to defeat the whole purpose of exit handlers.

setjmp() and longjmp() support non-local gotos, but do not give the programmer a disciplined way to invoke them. Pattern-directed handler invocation gives the client an opportunity to establish a set of handlers which are matched by particular patterns. For example, an exception in a memory allocation routine can be raised in such a way that a particular handler (say, a garbage collector) can be explicitly invoked by using a well-known pattern. The CATCHALL pattern can be used by a thread either to implement more general sorts of pattern matching (by handling those patterns it wants and discarding those patterns it is not interested in and reraising the exception), or to catch exceptions which must always be caught (e.g., a routine which normally allocates some memory permanently and returns should free the memory if an exception occurs).

exc_notify() is provided for those exceptions which require an action to be executed on behalf of the exception handler and control to be returned to the raiser of the exception. The handler of a notify exception establishes a function, as well as an argument which can refer to an execution-time environment. By providing a null function, a handler can indicate that only escape exceptions (invoked by exc_raise()) are to be used.

Exception handling is useful for assisting disciplined use of lightweight process primitives. The MONITOR() macro is one example. Another is the fork() example discussed in the next section.

Synchronous Traps

Some events are completely synchronous, such as division by zero faults. For such events, it is not logical to allocate a separate thread, since threads are intended to handle asynchronous events. In the lightweight process world, synchronous events appear to be exceptions. Use agt_trap() to enable exception mapping for a given event. Note that unhandled exceptions cause termination of the offending thread.

Implementation

One possible way to implement an exception mechanism at the language level would be to use an LWP special context to contain a pointer to the current exception handler for each thread. Using this context, it would be possible to search backwards on the exception chain looking for pattern matches.

Rather than require the client to explicitly pass in a context variable to be used to save and restore exception context, the LWP implementation allocates the context automatically. This is less efficient because by using local variables as contexts, allocation and freeing of the context are free. However, in addition to the more pleasant interface, there are several advantages to the implicit allocation strategy. Because the stack is reset when an exit handler runs, there is no room for local variables to be used by the library code that implements exit handlers (note that the exit handler can make procedure calls of undetermined depth!). This is especially problematic when several exit handlers have been established. Also, if the system being used can't take interrupts on a separate stack, a fair amount of interrupt masking may be required to protect the stack once it is reset.

Exception handling is really a language issue. However, since synchronous traps may be mapped into exceptions, the LWP library itself must be able to access the exception contexts. Thus, the exception handling facility is part of the LWP library and not a separate language facility. In the future, a more flexible interface to agt_trap() may be provided so languages can provide their own style of exception handling.

Example of Exception Handling
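Before the library-based example that follows, the escape-exception idea (search backwards through the established handlers for a matching pattern, then transfer control there) can be sketched in standard C with setjmp() and longjmp(). This is only an illustration under assumptions, not how the LWP library implements exc_handle() or exc_raise(): push_handler(), pop_handler(), raise_pattern(), middle(), outer() and ANY_PATTERN are hypothetical names, and a single global handler stack stands in for the per-thread exception context.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

enum { ATTRIBUTE = 9, FATAL = 7 };  /* pattern values borrowed from this section */
#define ANY_PATTERN  (-1)           /* hypothetical catch-all pattern */
#define MAX_HANDLERS 16

struct handler {
    int     pattern;                /* pattern this handler accepts */
    jmp_buf env;                    /* where control resumes on a match */
};

static struct handler handlers[MAX_HANDLERS];
static int handler_depth = 0;       /* number of handlers established */
static int last_raised = 0;         /* pattern of the most recent raise */

/* Establish a handler; the caller must then call setjmp() on the result. */
static jmp_buf *push_handler(int pattern)
{
    assert(handler_depth < MAX_HANDLERS);
    handlers[handler_depth].pattern = pattern;
    return &handlers[handler_depth++].env;
}

/* Remove the newest handler on a normal (non-exceptional) exit. */
static void pop_handler(void)
{
    assert(handler_depth > 0);
    handler_depth--;
}

/* Search backwards for a match, discard it and all newer handlers, jump. */
static void raise_pattern(int pattern)
{
    int i;

    for (i = handler_depth - 1; i >= 0; i--) {
        if (handlers[i].pattern == pattern ||
            handlers[i].pattern == ANY_PATTERN) {
            handler_depth = i;
            last_raised = pattern;
            longjmp(handlers[i].env, 1);
        }
    }
    abort();                        /* unhandled exception */
}

/* Handles ATTRIBUTE only: returns 0, or 100 + the pattern caught here. */
static int middle(int pattern)
{
    jmp_buf *env = push_handler(ATTRIBUTE);

    if (setjmp(*env) == 0) {
        if (pattern != 0)
            raise_pattern(pattern);
        pop_handler();
        return 0;
    }
    return 100 + last_raised;
}

/* Handles FATAL only: returns middle()'s result, or 200 + the pattern. */
static int outer(int pattern)
{
    jmp_buf *env = push_handler(FATAL);
    int r;

    if (setjmp(*env) == 0) {
        r = middle(pattern);
        pop_handler();
        return r;
    }
    return 200 + last_raised;
}
```

Calling outer(ATTRIBUTE) is caught by the nearer handler in middle(), while outer(FATAL) skips that handler and escapes to the one established in outer(). This mirrors the way a raise selects a handler by pattern rather than by proximity alone.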
In the following example, we use the exception handling mechanisms to facilitate a garbage collector. In the event that a resource is exhausted, the client attempts to correct things by notifying the garbage collector. If the next attempt to obtain the resource fails, the client gives up by raising an exception. As an exercise, pretend that the client had resources that needed to be freed as a result of the fatal exception. Use CATCHALL handlers to allow procedures higher up the calling chain to free the resources they allocated.

    #include
    #include
    #define ATTRIBUTE 9
    #define FATAL 7
    #define MAXPRIO 10

    main(argc, argv)
        char **argv;
    {
        int task();

        (void) pod_setmaxpri(MAXPRIO);
        lwp_setstkcache(1000, 3);
        (void) lwp_create((thread_t *)0, task, MINPRIO, 0, lwp_newstk(), 0);
        exit(0);
    }

    task()
    {
        int garb_collect();

        /* establish garbage collector for ATTRIBUTE-type resources */
        (void) exc_handle(ATTRIBUTE, garb_collect, ATTRIBUTE);
        /* establish handler for unrecoverable errors */
        if (exc_handle(FATAL, 0, 0) == 0)
            someprocedure();
        else
            abort();
    }

    someprocedure()
    {
        char *r;
        char *getresource();

        r = getresource(ATTRIBUTE);
        /* use resource */
    }

    char *
    getresource(attribute)
        int attribute;
    {
        int (*f)();
        char *resource;
        char *obtain();

        resource = obtain(attribute);       /* try to get resource */
        if (resource == 0) {                /* couldn't get it */
            (void) exc_notify(attribute);   /* garbage collect */
            resource = obtain(attribute);   /* try again */
            if (resource == 0)              /* still couldn't get it */
                exc_raise(FATAL);           /* give up */
        }
        return (resource);
    }

    garb_collect(atr)
        int atr;
    {
        /*
         * garbage collect resource of type atr such that
         * obtain might succeed if tried again.
         */
    }

    char *
    obtain(atr)
        int atr;
    {
        /*
         * try to allocate resource of type atr
         * return 0 if unable to get the resource.
         */
    }

2.7. Big Example

This example illustrates many of the LWP features: exit handlers, monitors, condition variables, messages, threads. It is a parallel binary tree fringe comparator. Given two binary trees T1 and T2, they have the same fringe if and only if their leaf nodes are equivalent when read left to right.

Part of the program relies on a fork() and join() mechanism. The idea is that a thread may wish to start some threads and wait for n of them to terminate. (To wait for one specific thread to die, use lwp_join().) Thus, a program could look like:

    proc()
    {
        ...
        tfork(thread1);
        tfork(thread2);
        tfork(thread3);
        join(2);    /* wait for any 2 forked threads to die */
        ...
        join(1);    /* wait for last thread to die */
    }

To make this work, we have tfork() create its thread via an intermediary which uses an exit handler (see exc_on_exit(3L)) to ensure that the thread calls die() when it terminates. die() will keep track of the number of terminated threads. Since a tfork()'ed thread may be destroyed by another thread, lwp_destroy() should be encapsulated by a procedure that calls die() as well. This is an illustration of how the exception handling facility can be used to create new protocols (enforced exit actions, for example).

The program begins by declaring two trees (which don't, in this case, have the same fringe). Then, we create three threads: one thread to evaluate each tree, and one thread to compare leaf values and serve as an information exchanger. The two tree evaluators proceed in parallel, sending a message to the comparator containing the leaf value when a leaf is encountered. When the comparator finds a mismatch, it terminates the tree evaluators. When the main program joins successfully, the two evaluators are dead. It then sends a message to the comparator to find out what the results were.
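To make the definition of "same fringe" concrete, here is a small sequential sketch in standard C. It is not the parallel LWP program: it simply collects each tree's leaves left to right and compares the two lists. The trees t1 and t2 carry the same values as the two trees declared in this section's program; t3, collect_fringe(), same_fringe() and MAXLEAVES are hypothetical additions for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct tree {
    int val;
    struct tree *left, *right;
} tree;

#define MAXLEAVES 64

/* Append the leaf values under t, left to right, to out[]; returns the new count. */
static size_t collect_fringe(const tree *t, int *out, size_t n)
{
    if (t == NULL)
        return n;
    if (t->left == NULL && t->right == NULL) {
        assert(n < MAXLEAVES);
        out[n] = t->val;
        return n + 1;
    }
    n = collect_fringe(t->left, out, n);
    return collect_fringe(t->right, out, n);
}

/* Returns 1 if both trees list the same leaf values in the same order, else 0. */
int same_fringe(const tree *a, const tree *b)
{
    int fa[MAXLEAVES], fb[MAXLEAVES];
    size_t na = collect_fringe(a, fa, 0);
    size_t nb = collect_fringe(b, fb, 0);
    size_t i;

    if (na != nb)
        return 0;
    for (i = 0; i < na; i++)
        if (fa[i] != fb[i])
            return 0;
    return 1;
}

/* Fringe of t1 is 1, 5, 4. */
static tree t1[6] = {
    {0, &t1[1], &t1[2]},
    {1, &t1[3], &t1[4]},
    {4, NULL, NULL},
    {1, NULL, NULL},
    {3, NULL, &t1[5]},
    {5, NULL, NULL},
};

/* Fringe of t2 is 1, 3, 4. */
static tree t2[5] = {
    {0, &t2[1], &t2[2]},
    {1, NULL, NULL},
    {2, &t2[3], &t2[4]},
    {3, NULL, NULL},
    {4, NULL, NULL},
};

/* Hypothetical third tree: a different shape from t2, but fringe 1, 3, 4. */
static tree t3[5] = {
    {9, &t3[1], &t3[2]},
    {8, &t3[3], &t3[4]},
    {4, NULL, NULL},
    {1, NULL, NULL},
    {3, NULL, NULL},
};
```

With these definitions, same_fringe(t1, t2) is 0, which agrees with the remark that the two trees do not have the same fringe, while same_fringe(t2, t3) is 1 even though the trees are shaped differently. The parallel version differs in that it can stop at the first mismatching leaf instead of walking both trees completely.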
The tree evaluators are simple: they merely recurse down their subtree, pausing to tell the comparator when a leaf is encountered. The comparator is fairly complex. It first receives a message from either of the two tree evaluators (which, after all, are running in parallel. As an exercise, add preemptive round-robin scheduling to this program!). Then, it waits for a message from the other tree evaluator (else, it could get another value from the same tree evaluator). If the answers disagree, the comparator terminates the evaluators to prevent further (useless and confusing) messages from being sent. Finally, because the two trees being compared may be structurally quite different, one evaluator may finish while the other remains active. As a result, the comparator could do a msg_recv() on a non-existent thread. Therefore, we check this condition by noting if msg_recv() fails.

Just to show that it's possible, this program passes lint when checked with the LWP lint library!

    #include
    #include
    #include
    #define NULL 0

    thread_t cmp, p1, p2;
    thread_t driver;
    int tfork();
    cv_t cv;
    mon_t mon;
    int numdead = 0;

    typedef struct tree_t {
        int val;
        struct tree_t *left, *right;
    } tree_t;

    #define TREENULL ((tree_t *) 0)
    #define TRUE 1
    #define FALSE 0
    #define MAXPRIO 10

    tree_t t1[] = {
        {0, &t1[1], &t1[2]},
        {1, &t1[3], &t1[4]},
        {4, TREENULL, TREENULL},
        {1, TREENULL, TREENULL},
        {3, TREENULL, &t1[5]},
        {5, TREENULL, TREENULL},
    };

    tree_t t2[] = {
        {0, &t2[1], &t2[2]},
        {1, TREENULL, TREENULL},
        {2, &t2[3], &t2[4]},
        {3, TREENULL, TREENULL},
        {4, TREENULL, TREENULL},
    };

    main()
    {
        int compare(), parsetree();
        int answer;

        if (pod_setmaxpri(MAXPRIO) == -1)
            lwp_perror("setmaxpri");
        (void) lwp_setstkcache(10000, 5);
        (void) lwp_self(&driver);
        tfork(&cmp, compare, 0);
        tfork(&p1, parsetree, (int)t1);
        tfork(&p2, parsetree, (int)t2);
        join(2);
        (void) msg_send(cmp, (caddr_t)0, 0, (caddr_t)&answer, sizeof (answer));
        if (answer)
            (void) printf("same fringe\n");
        else
            (void) printf("not same fringe\n");
        exit(0);
    }

    compare()
    {
        int val1;
        thread_t next;
        thread_t sender;
        int samefringe = TRUE;
        int *resbuf;
        int ressize;
        int *argbuf;
        int argsize;
        int err;

        for (;;) {
            err = MSG_RECVALL(&sender, (caddr_t *)&argbuf, &argsize,
                (caddr_t *)&resbuf, &ressize, INFINITY);
            if (err < 0)
                lwp_perror("MSG_RECVALL");
            if (SAMETHREAD(sender, driver)) {
                *resbuf = samefringe;
                (void) msg_reply(driver);
                return;
            }
            val1 = *argbuf;
            next = (SAMETHREAD(sender, p1) ? p2 : p1);
            (void) msg_reply(sender);
            err = msg_recv(&next, (caddr_t *)&argbuf, &argsize,
                (caddr_t *)&resbuf, &ressize, INFINITY);
            if (err < 0) {    /* he died */
                samefringe = FALSE;
                destroy(sender);
            } else {
                samefringe = (*argbuf == val1);
                if (!samefringe) {
                    destroy(p1);
                    destroy(p2);
                } else
                    (void) msg_reply(next);
            }
        }
    }

    parsetree(t)
        tree_t *t;
    {
        if (t == TREENULL)
            return;
        if ((t->left == TREENULL) && (t->right == TREENULL)) {    /* leaf */
            (void) msg_send(cmp, (caddr_t)&t->val, sizeof (int), (caddr_t)0, 0);
        } else {
            parsetree(t->left);
            parsetree(t->right);
        }
    }

    tfork(new, adr, arg)
        thread_t *new;
        int (*adr)();
        int arg;
    {
        extern void prochelp();
        static int init = 0;

        if (init == 0) {
            init = 1;
            (void) mon_create(&mon);
            (void) cv_create(&cv, mon);
        }
        (void) lwp_create(new, prochelp, MINPRIO, 0, lwp_newstk(), 2, adr, arg);
    }

    void
    prochelp(proc, arg)
        int (*proc)();
    {
        extern void die();

        (void) exc_on_exit(die, (caddr_t)0);
        proc(arg);
    }

    void
    die()
    {
        MONITOR(mon);
        numdead++;
        (void) cv_notify(cv);
    }

    join(cnt)
    {
        MONITOR(mon);
        while (numdead < cnt)
            (void) cv_wait(cv);
        numdead -= cnt;
    }

    /* use this instead of lwp_destroy with tfork and join */
    destroy(pid)
        thread_t pid;
    {
        die();
        (void) lwp_destroy(pid);
    }

Chapter 3  System V Interprocess Communication Facilities

3.1. IPC Facilities in the SunOS Operating System

Interprocess Communication involves sharing data between processes and, when necessary, coordinating access to the shared data. Release 4.1 of the SunOS operating system (referred to hereafter as "Release 4.1," or "4.1") provides a number of facilities and mechanisms by which processes can communicate.

File I/O and Pipes

In the simplest case, processes can communicate by writing to and reading information from files. Alternatively, a process may provide data for direct consumption by another concurrent process using a pipe. Pipes employ the basic byte-stream model used for file I/O.

State Files and File Locking

A process may deposit context in a state file for use by a later invocation. Processes that make use of state files can prevent multiple concurrent access (and race conditions on writes) by using lock files to simulate semaphores. Before attempting to open a state file for write access, a program can test for the existence of a lock file, to determine whether the desired file or device is available. A simple way to create a lock file is to use the open(2) system call with the O_CREAT and O_EXCL flags. When called in this way, open() creates the lock file only if it does not already exist. If several processes attempt to get a lock at about the same time, only the first will succeed. The other processes may be instructed to block (suspend execution) until such time as the lock file is removed, or to exit with an appropriate error message.

Lock files are most useful when the lock is to persist through a reboot of the system. A case in point is the permissions file used by SCCS.

Additionally, the system provides library routines such as flock(3) and lockf(3) for advisory or mandatory file locking. Locks placed with flock() are only visible to processes running on the local processor. Locks placed with lockf() are visible to any process running on any processor with access to the file. lockf() also provides record locking for fine-grained control over updates to specific regions (strings of contiguous bytes) within a file.

Named Pipes

Another facility that makes use of the file system for IPC is the System V named pipe mechanism. A named pipe (also referred to as a FIFO) has an entry in the file system, but otherwise behaves like an ordinary pipe. It allows one process to provide output directly to another process through ordinary reads and writes to the named device. Unlike ordinary pipes, when the processes terminate, the named pipe remains available for use by other processes. (Refer to mknod(8) for more information.)
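
The behavior of a named pipe can be sketched in a few lines. The example below is an illustration only. It assumes a POSIX system that provides mkfifo(3), which this manual does not mention, in place of the mknod mechanism referred to above. The function name fifo_roundtrip() and its arguments are hypothetical. To keep the sketch within a single process, the read end is opened with O_NONBLOCK, since a blocking open of a FIFO for reading waits until a writer appears.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a FIFO at path, write msg into it, and read it back through the
 * file system entry.  On success the text is left in out (NUL-terminated)
 * and the number of bytes read is returned; -1 is returned on any error.
 * The call fails if path already exists.
 */
long
fifo_roundtrip(const char *path, const char *msg, char *out, size_t outsz)
{
    int rfd = -1, wfd = -1;
    long n = -1;
    size_t len = strlen(msg);

    if (outsz == 0 || mkfifo(path, 0600) == -1)
        return -1;

    /* Open the read end first, without blocking, so a reader exists. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd != -1)
        wfd = open(path, O_WRONLY);

    /* Ordinary write() and read() calls move the data. */
    if (wfd != -1 && write(wfd, msg, len) == (ssize_t) len) {
        n = (long) read(rfd, out, outsz - 1);
        if (n >= 0)
            out[n] = '\0';
    }

    if (wfd != -1)
        (void) close(wfd);
    if (rfd != -1)
        (void) close(rfd);
    (void) unlink(path);    /* otherwise the FIFO outlives the process */
    return n;
}
```

In practice the reader and the writer are separate processes that each open the same path. The final unlink() is what distinguishes a named pipe from an ordinary one: without it, the entry stays in the file system after both ends are closed.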
Named pipes suffer from all the limitations of regular pipes. For instance, the sender is unknown to the process reading the pipe. Unfortunately, this allows multiple processes to interleave output. Input from a named pipe should therefore be used with caution.

Networking Facilities

Release 4.1 supports two important facilities for networking and IPC in general: TLI (from System V) and sockets (from BSD). These facilities, which both support the file I/O (byte stream) model, can be used for IPC on the local host. They are often preferred when a service has both local and network clients. For more information about networking and general IPC facilities, refer to Network Programming.

3.2. System V IPC Facilities in Release 4.1

Release 4.1 provides the following System V facilities for memory-based IPC on a local system:

□ Messages
□ Semaphores
□ Shared Memory

These facilities allow local processes to share and process messages, to share access to memory segments in a manner that is compatible with existing System V applications, and to coordinate access to shared objects.

If the process that creates an IPC facility dies, the facility does not expire along with it; an IPC facility must be removed explicitly. A shared memory segment remains active, even after it has been flagged for removal, as long as it is attached anywhere in the address space of any process. Only after the last attachment is released is the (detached) segment freed.

Configuring System V IPC Facilities

In order to use these facilities, they must be configured into your kernel. The relevant configuration options are:

IPCMESSAGE      for the System V Messages facility.
IPCSEMAPHORE    for the System V Semaphore facility.
IPCSHMEM        for the System V Shared Memory facility.

For details on how to configure a kernel, refer to System and Network Administration.
System V IPC Permissions

Permissions for a System V IPC facility can be extended to users other than the one for which the facility was created. The creating process identifies the default owner. Unlike files, however, the creator can assign ownership of the facility to another user; it can also revoke an ownership assignment. The current owner process, in turn, can grant read or write access to still other users. The definition for the IPC permissions data structure, ipc_perm, is given in <sys/ipc.h>, as shown below.

Relying on the native virtual memory manager, in conjunction with the mmap(2) system call, often provides better performance for shared access to read-only segments in memory.

Figure 3-1  IPC Permissions Data Structure

    struct ipc_perm {
        ushort uid;     /* owner's user id */
        ushort gid;     /* owner's group id */
        ushort cuid;    /* creator's user id */
        ushort cgid;    /* creator's group id */
        ushort mode;    /* access modes */
        ushort seq;     /* slot usage sequence number */
        key_t key;      /* key */
    };

This structure is common to all System V IPC facilities. Permissions for an IPC facility are initialized by the creating process, and can be modified by any process that has permission to perform control operations on that facility. Permissions are specified as octal values in the flags argument of the appropriate IPC creation or control system call:

Figure 3-2  IPC Permission Modes

    Access Permissions    Octal Value
    Write by Owner        0200
    Read by Owner         0400
    R/W by Owner          0600
    Write by Group        0020
    Read by Group         0040
    R/W by Group          0060
    Write by Others       0002
    Read by Others        0004
    R/W by Others         0006

For instance, if read access by the owner and read/write access by others is desired, the permissions value would be 0406.

IPC System Calls, Key Arguments, and Creation Flags

Multiple processes requesting access to a common IPC facility must have a means for determining the identity of the desired facility. To that end, system calls that initialize or provide access to an IPC facility make use of a key argument (of type key_t). This key is a value that is either known to all the programs or, preferably, one that can be derived from a common seed at run time. The typical method for deriving a key is to use ftok(3) to convert a convenient filename to a suitable value. The value derived is virtually unique within the system. It can be used by all programs (processes) that attempt to obtain access to the facility.

System calls that initialize or get access to a System V IPC facility return an ID number (of type int). Once the facility's ID has been acquired, it is used by the IPC system calls that perform read, write and control operations.

If the key argument is specified as IPC_PRIVATE (defined to be zero), the call initializes a new instance of an IPC facility that is private to the creating process.

When the IPC_CREAT flag is supplied in the flags argument appropriate to the call, the system call attempts to create the facility if it does not exist already.

When called with both the IPC_CREAT and IPC_EXCL flags, the system call fails if the facility already exists. This can be useful when more than one process may attempt to initialize the facility. One such case might involve several server processes having access to the same facility. If they all attempt to create the facility with IPC_EXCL in effect, only the first attempt succeeds.

If neither of these flags is given, and the facility already exists, the system calls to get access simply return the ID of the facility. If IPC_CREAT is omitted and the facility is not already initialized, the calls fail.

These control flags are combined, using logical (bitwise) OR, with the octal permission modes to form the flags argument. For example:

    msqid = msgget(ftok("/tmp", 'A'), (IPC_CREAT | IPC_EXCL | 0400));

initializes a new message queue, but only if the queue does not exist already. The first argument evaluates to a key based on the string; the second, to the combined permissions and control flags.

System V IPC Configuration Options

A number of system configuration options* for data structures used by System V IPC facilities can be adjusted in the system configuration file. Some of these options set limits on the amount of resources available to an IPC facility. Those that affect specific system calls are discussed in the descriptions of those system calls. For more information about System V IPC configuration options, you may wish to refer to System and Network Administration.

* Refer to config(8) and Installing SunOS 4.1 for information on how to configure a SunOS operating system kernel.

3.3. Messages

The System V messaging facility provides processes with a means to send and receive messages, and to queue messages for processing in an arbitrary order. Unlike the typical file byte-stream model of data flow (in sockets and TLI), System V messages each have an explicit length. More importantly, messages can be assigned a specific type. Among other uses, this allows a server process to direct message traffic between multiple clients on its queue (by using the PID of the client process as the message type). For operations involving single-message transactions, a server can balance the load between multiple server processes that have access to the queue.

Before a process can send or receive a message, the queue must be initialized by making an msgget(2) system call. The owner or creator of a queue can change its ownership or permissions using msgctl(2). In addition, any process with permission to do so can use msgctl() to perform control operations.
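
The interplay of the key argument and the IPC_CREAT and IPC_EXCL flags can be sketched as follows. This is a minimal illustration in ANSI C rather than the K&R style used elsewhere in this chapter. The function name open_queue() and its arguments are hypothetical, and the retry with a flags value of zero is one possible policy for a process that loses the creation race, not the only one.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * Return the ID of the message queue keyed by (path, proj).  The queue is
 * created with the given permission bits if it does not exist yet.
 * *created is set to 1 if this call created the queue and to 0 if the
 * queue already existed.  Returns -1 on error, with errno set.
 */
int
open_queue(const char *path, int proj, int perms, int *created)
{
    key_t key = ftok(path, proj);    /* derive the key from a file name */
    int msqid;

    if (key == (key_t) -1)
        return -1;

    /* Only one of several competing processes succeeds here. */
    msqid = msgget(key, IPC_CREAT | IPC_EXCL | perms);
    if (msqid != -1) {
        *created = 1;
        return msqid;
    }
    if (errno != EEXIST)
        return -1;

    /* Somebody else created it first: just look up the existing ID. */
    *created = 0;
    return msgget(key, 0);
}
```

A server would typically call open_queue("/tmp", 'A', 0600, &created) and perform one-time initialization only when created is nonzero. Because the queue is not removed when its creator exits, some process must eventually remove it with msgctl() and IPC_RMID.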
Operations to send and receive messages are performed respectively by the msgsnd() and msgrcv() system calls (see msgop(2)). When a message is sent, its text is copied to the message queue.

msgsnd() and msgrcv() can be performed as either blocking or non-blocking operations. A blocked message operation remains suspended until one of three conditions occurs:

□ The call succeeds.
□ The process receives a signal.
□ The queue is removed.

Structure of a Message Queue

A message queue is composed of a control structure with a unique ID, a linked list of message headers, and a buffer in which to store the text of the message(s). The identifier for the queue is referred to as the msqid.

Figure 3-3  Structure of a Message Queue

The control structure for the message queue contains the following information:

□ A permissions structure.
□ A pointer to the first message on the queue.
□ A pointer to the last message on the queue.
□ The current number of bytes in the queue.
□ The number of messages in the queue.
□ The maximum number of bytes allowed in the queue.
□ The process ID (PID) of the last message sender.
□ The PID of the last message receiver.
□ The time the last message was sent.
□ The time the last message was received.
□ The time of the last change to the structure.

Each message header contains the following information:

□ A pointer to the next message on the queue.
□ The message type.
□ The message text size.
□ The message text address.
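
Several of the control-structure fields listed above can be observed from a program. The sketch below is an illustration only: the function name queue_snapshot() and the message layout struct note are hypothetical. It creates a private queue, sends one message, and copies the control structure out with the IPC_STAT command of msgctl(). It uses only the fields msg_qnum and msg_qbytes, on the assumption that the host system declares them in <sys/msg.h>.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout: a type followed by a fixed-size text. */
struct note {
    long mtype;
    char mtext[16];
};

/*
 * Create a private queue, send one 5-byte message, and copy the queue's
 * control structure into *ds.  The queue is removed before returning.
 * Returns 0 on success, -1 on error.
 */
int
queue_snapshot(struct msqid_ds *ds)
{
    struct note n = { 1, "hello" };
    int rc = -1;
    int msqid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);

    if (msqid == -1)
        return -1;
    if (msgsnd(msqid, &n, 5, IPC_NOWAIT) == 0 &&
        msgctl(msqid, IPC_STAT, ds) == 0)
        rc = 0;
    (void) msgctl(msqid, IPC_RMID, (struct msqid_ds *) 0);
    return rc;
}
```

After a successful call, ds->msg_qnum is 1, because one message was on the queue when the snapshot was taken, and ds->msg_qbytes holds the queue's byte limit.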
The message queue control structure is defined in the header file <sys/msg.h>:

Figure 3-4  Message Queue Control Structure

    struct msqid_ds {
        struct ipc_perm msg_perm;    /* access permission struct */
        struct msg *msg_first;       /* ptr to first message on q */
        struct msg *msg_last;        /* ptr to last message on q */
        ushort msg_cbytes;           /* current # bytes on q */
        ushort msg_qnum;             /* # of messages on q */
        ushort msg_qbytes;           /* max # of bytes on q */
        ushort msg_lspid;            /* pid of last msgsnd */
        ushort msg_lrpid;            /* pid of last msgrcv */
        time_t msg_stime;            /* last msgsnd time */
        time_t msg_rtime;            /* last msgrcv time */
        time_t msg_ctime;            /* last change time */
    };

Likewise, the definition for the message-header data structure is given as:

Figure 3-5  Message Header Structure

    struct msg {
        struct msg *msg_next;    /* ptr to next message on q */
        long msg_type;           /* message type */
        short msg_ts;            /* message text size */
        short msg_spot;          /* message text map address */
    };

Initializing a Message Queue with msgget()

The msgget() system call is used to initialize a new message queue. It can also be used to return the message queue ID (msqid) of the existing queue that corresponds to the key argument. When the call fails, it returns -1, and sets the external variable errno to the appropriate error code. msgget() has the synopsis shown below.

Figure 3-6  Synopsis of msgget()

The value passed as the msgflg argument must be an octal integer, which incorporates settings for the queue's permissions and control flags, as described under System V IPC Permissions, above.

The MSGMNI kernel configuration option determines the maximum number of unique message queues that the kernel will support. msgget() fails when this limit is exceeded.

The following example is a simple exerciser to illustrate the msgget() system call.
The program begins by prompting for a key, an octal permissions code, and finally, for your choice of control flags. It allows all possible combinations. If msgget() fails, the program indicates that there was an error, and displays the value of errno. Otherwise, it displays the message queue ID that the call returned.

Figure 3-7  Sample Program to Illustrate msgget()

    /*
    ** msgget.c: Illustrate the msgget() system call.
    **
    ** This is a simple exerciser of the msgget() system call.
    ** It prompts for the arguments, makes the call, and reports the
    ** results.
    */

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>

    extern void exit();
    extern void perror();

    main()
    {
        key_t key;     /* key to be passed to msgget() */
        int msgflg,    /* msgflg to be passed to msgget() */
            msqid;     /* return value from msgget() */

        (void) fprintf(stderr,
            "All numeric input is expected to follow C conventions:\n");
        (void) fprintf(stderr, "\t0x... is interpreted as hexadecimal,\n");
        (void) fprintf(stderr, "\t0... is interpreted as octal,\n");
        (void) fprintf(stderr, "\totherwise, decimal.\n");
        (void) fprintf(stderr, "IPC_PRIVATE == %#lx\n", IPC_PRIVATE);
        (void) fprintf(stderr, "Enter desired key: ");
        (void) scanf("%li", &key);
        (void) fprintf(stderr,
            "\nExpected flags for msgflg argument are:\n");
        (void) fprintf(stderr, "\tIPC_EXCL =\t%#8.8o\n", IPC_EXCL);
        (void) fprintf(stderr, "\tIPC_CREAT =\t%#8.8o\n", IPC_CREAT);
        (void) fprintf(stderr, "\towner read =\t%#8.8o\n", 0400);
        (void) fprintf(stderr, "\towner write =\t%#8.8o\n", 0200);
        (void) fprintf(stderr, "\tgroup read =\t%#8.8o\n", 040);
        (void) fprintf(stderr, "\tgroup write =\t%#8.8o\n", 020);
        (void) fprintf(stderr, "\tother read =\t%#8.8o\n", 04);
        (void) fprintf(stderr, "\tother write =\t%#8.8o\n", 02);
        (void) fprintf(stderr, "Enter desired msgflg value: ");
        (void) scanf("%i", &msgflg);

        (void) fprintf(stderr, "\nmsgget: Calling msgget(%#lx, %#o)\n",
            key, msgflg);
        if ((msqid = msgget(key, msgflg)) == -1) {
            perror("msgget: msgget failed");
            exit(1);
        } else {
            (void) fprintf(stderr,
                "msgget: msgget succeeded: msqid = %d\n", msqid);
            exit(0);
        }
        /* NOTREACHED */
    }

Controlling Message Queues with msgctl()

The msgctl() system call is used to alter the permissions and other characteristics of a message queue. Its synopsis is as follows:

Figure 3-8  Synopsis of msgctl()

Upon successful completion, the call returns zero. It returns -1 on failure, and sets errno appropriately.

The msqid argument must be the ID of an existing message queue. The cmd argument is one of the following:

IPC_STAT    Place information about the status of the queue in the data structure pointed to by buf. The process must have read permission for this call to succeed.

IPC_SET     Set the owner's user and group ID, the permissions, and the size (number of bytes) of the message queue. A process must have the effective user ID of the owner, creator or the super-user for this call to succeed.

IPC_RMID    Remove the message queue specified by the msqid argument.

The following sample program illustrates the msgctl(2) system call with all its various flags.

Figure 3-9  Sample Program to Illustrate msgctl()

    /*
    ** msgctl.c: Illustrate the msgctl() system call.
    **
    ** This is a simple exerciser of the msgctl() system call. It allows
    ** you to perform one control operation on one message queue. It
    ** gives up immediately if any control operation fails, so be careful not
    ** to set permissions to preclude read permission; you won't be able to reset
    ** the permissions with this code if you do.
    */

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>
    #include <time.h>

    static void do_msgctl();
    extern void exit();
    extern void perror();

    static char warning_message[] = "If you remove read permission for\
     yourself, this program will fail frequently!";

    main()
    {
        struct msqid_ds buf;    /* queue descriptor buffer for IPC_STAT
                                   and IPC_SET commands */
        int cmd,                /* command to be given to msgctl() */
            msqid;              /* queue ID to be given to msgctl() */

        (void) fprintf(stderr,
            "All numeric input is expected to follow C conventions:\n");
        (void) fprintf(stderr, "\t0x... is interpreted as hexadecimal,\n");
        (void) fprintf(stderr, "\t0... is interpreted as octal,\n");
        (void) fprintf(stderr, "\totherwise, decimal.\n");

        /* Get the msqid and cmd arguments for the msgctl() call. */
        (void) fprintf(stderr,
            "Please enter arguments for msgctl() as requested.");
        (void) fprintf(stderr, "\nEnter the desired msqid: ");
        (void) scanf("%i", &msqid);
        (void) fprintf(stderr, "Valid msgctl commands are:\n");
        (void) fprintf(stderr, "\tIPC_RMID = %d\n", IPC_RMID);
        (void) fprintf(stderr, "\tIPC_SET = %d\n", IPC_SET);
        (void) fprintf(stderr, "\tIPC_STAT = %d\n", IPC_STAT);
        (void) fprintf(stderr, "\nEnter the value for the desired command: ");
        (void) scanf("%i", &cmd);

        switch (cmd) {
        case IPC_SET:
            /* Modify settings in the message queue control structure. */
            (void) fprintf(stderr, "Before IPC_SET, get current values:");
            /* fall through to IPC_STAT processing */
        case IPC_STAT:
            /*
            ** Get a copy of the current message queue control structure
            ** and show it to the user.
            */
            do_msgctl(msqid, IPC_STAT, &buf);
            (void) fprintf(stderr, "msg_perm.uid = %d\n", buf.msg_perm.uid);
            (void) fprintf(stderr, "msg_perm.gid = %d\n", buf.msg_perm.gid);
            (void) fprintf(stderr, "msg_perm.cuid = %d\n", buf.msg_perm.cuid);
            (void) fprintf(stderr, "msg_perm.cgid = %d\n", buf.msg_perm.cgid);
            (void) fprintf(stderr, "msg_perm.mode = %#o, ", buf.msg_perm.mode);
            (void) fprintf(stderr, "access permissions = %#o\n",
                buf.msg_perm.mode & 0777);
            (void) fprintf(stderr, "msg_cbytes = %d\n", buf.msg_cbytes);
            (void) fprintf(stderr, "msg_qbytes = %d\n", buf.msg_qbytes);
            (void) fprintf(stderr, "msg_qnum = %d\n", buf.msg_qnum);
            (void) fprintf(stderr, "msg_lspid = %d\n", buf.msg_lspid);
            (void) fprintf(stderr, "msg_lrpid = %d\n", buf.msg_lrpid);
            (void) fprintf(stderr, "msg_stime = %s", buf.msg_stime ?
                ctime(&buf.msg_stime) : "Not Set\n");
            (void) fprintf(stderr, "msg_rtime = %s", buf.msg_rtime ?
                ctime(&buf.msg_rtime) : "Not Set\n");
            (void) fprintf(stderr, "msg_ctime = %s", ctime(&buf.msg_ctime));
            if (cmd == IPC_STAT)
                break;
            /*
            ** Now continue with IPC_SET.
            */
            (void) fprintf(stderr, "Enter desired msg_perm.uid: ");
            (void) scanf("%hi", &buf.msg_perm.uid);
            (void) fprintf(stderr, "Enter desired msg_perm.gid: ");
            (void) scanf("%hi", &buf.msg_perm.gid);
            (void) fprintf(stderr, "%s\n", warning_message);
            (void) fprintf(stderr, "Enter desired msg_perm.mode: ");
            (void) scanf("%hi", &buf.msg_perm.mode);
            (void) fprintf(stderr, "Enter desired msg_qbytes: ");
            (void) scanf("%hi", &buf.msg_qbytes);
            do_msgctl(msqid, IPC_SET, &buf);
            break;
        case IPC_RMID:
        default:
            /* Remove the message queue or try an unknown command. */
            do_msgctl(msqid, cmd, (struct msqid_ds *)NULL);
            break;
        }
        exit(0);
        /* NOTREACHED */
    }

    /*
    ** Print indication of arguments being passed to msgctl(), call msgctl(),
    ** and report the results.
    ** If msgctl() fails, do not return; this example doesn't deal with
    ** errors, it just reports them.
    */
    static void
    do_msgctl(msqid, cmd, buf)
        struct msqid_ds *buf;
        int cmd,
            msqid;
    {
        register int rtrn;    /* hold area for return value from msgctl() */

        (void) fprintf(stderr, "\nmsgctl: Calling msgctl(%d, %d, %s)\n",
            msqid, cmd, buf ? "&buf" : "(struct msqid_ds *)NULL");
        rtrn = msgctl(msqid, cmd, buf);
        if (rtrn == -1) {
            perror("msgctl: msgctl failed");
            exit(1);
            /* NOTREACHED */
        } else {
            (void) fprintf(stderr, "msgctl: msgctl returned %d\n", rtrn);
        }
    }

Sending and Receiving Messages with msgsnd() and msgrcv()

msgsnd(2) and msgrcv(2) are used to send and receive messages, respectively. Their synopses are as follows:

Figure 3-10  Synopses of msgsnd() and msgrcv()

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>

    int msgsnd(msqid, msgp, msgsz, msgflg)
    int msqid;
    struct msgbuf *msgp;
    int msgsz, msgflg;

    int msgrcv(msqid, msgp, msgsz, msgtyp, msgflg)
    int msqid;
    struct msgbuf *msgp;
    int msgsz;
    long msgtyp;
    int msgflg;

Upon successful completion, msgsnd() returns zero and msgrcv() returns the number of bytes placed in the message text; when unsuccessful, they return -1, and set the external variable errno to the appropriate error code.

The msqid argument must be the ID of an existing message queue. The msgp argument is a pointer to a structure that contains the type of the message and its text. The msgsz argument specifies the length of the message (in bytes). Various control flags can be passed in the msgflg argument. Flags can be combined within the argument using the bitwise OR operator.

If IPC_NOWAIT is set, a send or receive operation that cannot complete immediately fails instead of blocking. For instance, a non-blocking msgrcv() operation will fail if there is no message to receive. If MSG_NOERROR is set, then a message longer than the size specified by msgsz is truncated to that size. Note that the trailing portion of the truncated message is lost.
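
The effect of these two flags can be sketched as follows. The example is an illustration in ANSI C only: the function name flag_demo(), the message layout struct demo_msg, and the results array are hypothetical. It assumes that the host system reports an empty queue as ENOMSG and an oversized message as E2BIG, as System V implementations conventionally do.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout: a type followed by a fixed-size text. */
struct demo_msg {
    long mtype;
    char mtext[32];
};

/*
 * Exercise IPC_NOWAIT and MSG_NOERROR on a private queue.
 *   results[0]  errno from a non-blocking receive on the empty queue
 *   results[1]  errno from receiving a 10-byte message with msgsz 4
 *   results[2]  return value of the same receive with MSG_NOERROR
 * The truncated text is copied into text (5 bytes, NUL-terminated).
 * Returns 0 if the sequence ran to completion, -1 otherwise.
 */
int
flag_demo(int results[3], char text[5])
{
    struct demo_msg out, in;
    int ok = -1;
    int msqid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);

    if (msqid == -1)
        return -1;

    /* Nothing has been sent yet, so a non-blocking receive must fail. */
    results[0] = (msgrcv(msqid, &in, sizeof in.mtext, 0, IPC_NOWAIT) == -1)
        ? errno : 0;

    out.mtype = 1;
    memcpy(out.mtext, "0123456789", 10);
    if (msgsnd(msqid, &out, 10, IPC_NOWAIT) == 0) {
        /* Too small a buffer: the call fails and the message stays queued. */
        results[1] = (msgrcv(msqid, &in, 4, 0, IPC_NOWAIT) == -1) ? errno : 0;
        /* With MSG_NOERROR the message is cut down to 4 bytes instead. */
        results[2] = (int) msgrcv(msqid, &in, 4, 0, IPC_NOWAIT | MSG_NOERROR);
        if (results[2] == 4) {
            memcpy(text, in.mtext, 4);
            text[4] = '\0';
            ok = 0;
        }
    }
    (void) msgctl(msqid, IPC_RMID, (struct msqid_ds *) 0);
    return ok;
}
```

The second receive removes the message from the queue even though only part of it was delivered, so the discarded bytes cannot be recovered by a later call.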
Without the MS G_N OE RROR flag, attempting to receive a message that is longer than expected results in failure. Revision A of 27 March 1990 64 Programming Utilities and Libraries The msgtyp argument to msgrcv ( ) is used to indicate the type of message to receive. If this argument is equal to zero, the call receives the first message on the queue. If it is greater than zero, the call receives the first message of the indi- cated type. If msgtyp is less than zero, the call receives the first extant message on the queue with lowest type value, up to and including the absolute value of the argu- ment. For instance, if msgtyp has a value of -3, the call retrieves the first mes- sage of type 1, if any, or the first message of type 2, if any, or the first message of type 3. It would not receive a message of type 4. This allows you to prioritize message processing according to type. The following sample program illustrates msgsnd ( ) and msgrcv () . Figure 3-11 Sample Program to Illustrate msgsnd () and msgrcv ( ) /* ** msgop.c: Illustrate the msgsnd() and msgrcv() system calls. ** ** This is a simple exerciser of the message send and receive ** routines. It allows the user to attempt to send and receive as many ** messages as desired to or from one message queue. */ ♦include ♦ include ♦include ♦include static intask (); extern void exit(); extern char *malloc(); extern void perror(); char f irst_on_queue [ ] = "— > first message on queue", full fcivf [ ] = "Message buffer overflow. 
Extra message text discarded."; main () { register int c; int choice; register int i; int msgflg; struct msgbuf *msgp; int msgsz; long msgtyp; int msqid, maxmsgsz, /* rtrn; /* /* message text input */ /* user's selected operation code */ /* loop control for mtext */ /* message flags for the operation */ /* pointer to the message buffer */ /* message size */ /* desired message type */ /* message queue ID to be used */ size of allocated message buffer */ return value from msgrcv or msgsnd */ (void) fprintf (stderr, "All numeric input is expected to follow C conventions:\n") ; (void) fprintf (stderr, "\tOx. . . is interpreted as hexadecimal, \n") ; (void) fprintf (stderr, "\t0... is interpreted as octal, \n") ; (void) fprintf (stderr, "\totherwise, decimal. \n") ; /* Get the message queue ID and set up the message buffer. */ (void) fprintf (stderr, "Enter desired msqid: ") ; (void) scanf ( "%i" , smsqid) ; /* ** Note that includes a definition of struct msgbuf ** with the mtext field defined as: » sun Xr microsystems Revision A of 27 March 1990 ** char mtext [1] ; ** therefore, this definition is only a template, not a directly ** useable structure definition, unless you only want to send ** and receive messages of 0 or 1 byte. ** To handle this, we malloc an area big enough to contain the ** template - the size of the mtext template field + the size of ** the mtext field we want. Then we can use the pointer returned ** by malloc as a struct msgbuf with an mtext field of the size ** we want. ** Note also that sizeof msgp->mtext is valid even though msgp ** isn't pointing to anything yet. Sizeof doesn't dereference msgp, ** it just uses its type to figure out what we are asking about. 
    */
    (void) fprintf(stderr, "Enter the message buffer size you want: ");
    (void) scanf("%i", &maxmsgsz);
    if (maxmsgsz < 0) {
        (void) fprintf(stderr, "msgop: %s\n",
            "The message buffer size must be >= 0.");
        exit(1);
        /* NOTREACHED */
    }
    msgp = (struct msgbuf *) malloc((unsigned) (sizeof(struct msgbuf)
        - sizeof msgp->mtext + maxmsgsz));
    if (msgp == NULL) {
        (void) fprintf(stderr, "msgop: %s %d byte messages\n",
            "could not allocate message buffer for", maxmsgsz);
        exit(1);
        /* NOTREACHED */
    }

    /* Loop through message operations until the user is ready to quit. */
    while (choice = ask()) {
        switch (choice) {
        case 1:
            /* msgsnd() requested: Get the arguments, make the
               call, and report the results. */
            (void) fprintf(stderr, "Valid msgsnd message %s\n",
                "types are positive integers.");
            (void) fprintf(stderr, "Enter desired msgp->mtype: ");
            (void) scanf("%li", &msgp->mtype);
            if (maxmsgsz) {
                /* Since we've been using scanf, we need the following
                   loop to throw away the rest of the input on the
                   line after the entered mtype before we start
                   reading the mtext. */
                while ((c = getchar()) != '\n' && c != EOF)
                    ;
                (void) fprintf(stderr, "Enter a %s:\n",
                    "one line message");
                /* Stop at end of line; also stop at EOF so that a
                   closed input cannot loop forever. */
                for (i = 0; (c = getchar()) != '\n' && c != EOF; i++) {
                    if (i >= maxmsgsz) {
                        (void) fprintf(stderr, "\n%s\n", full_buf);
                        while ((c = getchar()) != '\n' && c != EOF)
                            ;
                        break;
                    }
                    msgp->mtext[i] = c;
                }
                msgsz = i;
            } else
                msgsz = 0;
            (void) fprintf(stderr, "\nMeaningful msgsnd flag is:\n");
            (void) fprintf(stderr, "\tIPC_NOWAIT =\t%#8.8o\n", IPC_NOWAIT);
            (void) fprintf(stderr, "Enter desired msgflg: ");
            (void) scanf("%i", &msgflg);
            (void) fprintf(stderr, "%s(%d, msgp, %d, %#o)\n",
                "msgop: Calling msgsnd", msqid, msgsz, msgflg);
            (void) fprintf(stderr, "msgp->mtype = %ld\n", msgp->mtype);
            (void) fprintf(stderr, "msgp->mtext = \"");
            for (i = 0; i < msgsz; i++)
                (void) fputc(msgp->mtext[i], stderr);
            (void) fprintf(stderr, "\"\n");
            rtrn = msgsnd(msqid, msgp, msgsz, msgflg);
            if (rtrn == -1)
                perror("msgop: msgsnd failed");
            else
                (void) fprintf(stderr,
                    "msgop: msgsnd returned %d\n", rtrn);
            break;
        case 2:
            /* msgrcv() requested: Get the arguments, make the
               call, and report the results. */
            for (msgsz = -1; msgsz < 0 || msgsz > maxmsgsz;
                (void) scanf("%i", &msgsz))
                (void) fprintf(stderr, "%s (0 <= msgsz <= %d): ",
                    "Enter desired msgsz", maxmsgsz);
            (void) fprintf(stderr, "msgtyp meanings:\n");
            (void) fprintf(stderr, "\t 0 %s\n", first_on_queue);
            (void) fprintf(stderr, "\t>0 %s of given type\n",
                first_on_queue);
            (void) fprintf(stderr, "\t<0 %s with type <= |msgtyp|\n",
                first_on_queue);
            (void) fprintf(stderr, "Enter desired msgtyp: ");
            (void) scanf("%li", &msgtyp);
            (void) fprintf(stderr, "Meaningful msgrcv flags are:\n");
            (void) fprintf(stderr, "\tMSG_NOERROR =\t%#8.8o\n", MSG_NOERROR);
            (void) fprintf(stderr, "\tIPC_NOWAIT =\t%#8.8o\n", IPC_NOWAIT);
            (void) fprintf(stderr, "Enter desired msgflg: ");
            (void) scanf("%i", &msgflg);
            (void) fprintf(stderr, "%s(%d, msgp, %d, %ld, %#o);\n",
                "msgop: Calling msgrcv", msqid, msgsz, msgtyp, msgflg);
            rtrn = msgrcv(msqid, msgp, msgsz, msgtyp, msgflg);
            if (rtrn == -1)
                perror("msgop: msgrcv failed");
            else {
                (void) fprintf(stderr, "msgop: %s %d\n",
                    "msgrcv returned", rtrn);
                (void) fprintf(stderr, "msgp->mtype = %ld\n",
                    msgp->mtype);
                (void) fprintf(stderr, "msgp->mtext is: \"");
                for (i = 0; i < rtrn; i++)
                    (void) fputc(msgp->mtext[i], stderr);
                (void) fprintf(stderr, "\"\n");
            }
            break;
        default:
            (void) fprintf(stderr, "msgop: operation unknown\n");
            break;
        }
    }
    exit(0);
    /* NOTREACHED */
}

/*
** Ask user what to do next. Return the user's choice code.
** Don't return until the user selects a valid choice.
*/
static
ask()
{
    int response;    /* User's response. */

    do {
        (void) fprintf(stderr, "Your options are:\n");
        (void) fprintf(stderr, "\tExit =\t0 or Control-D\n");
        (void) fprintf(stderr, "\tmsgsnd =\t1\n");
        (void) fprintf(stderr, "\tmsgrcv =\t2\n");
        (void) fprintf(stderr, "Enter your choice: ");

        /* Preset response so "^D" will be interpreted as exit. */
        response = 0;
        (void) scanf("%i", &response);
    } while (response < 0 || response > 2);
    return (response);
}

3.4. Semaphores

Semaphores provide a mechanism by which processes can query or alter status information. They are often used to monitor and control the availability of system resources, such as System V shared memory segments. Semaphores may be operated on as individual units, or as elements in a set. A semaphore set consists of a control structure and an array of individual semaphores.
By default, a set of semaphores may contain up to 25 elements; this limit can be altered using the SEMMSL system configuration option.

Before a process can use a semaphore, the semaphore set must be initialized using semget(2). The semaphore's owner or creator can change its ownership or permissions using semctl(2). In addition, any process with permission to do so can use semctl() to perform control operations.

Semaphore operations are performed by the semop(2) system call. This call accepts a pointer to an array of semaphore operation structures; each structure in the operations array contains information about an operation to perform on a semaphore. The operations array is described in detail under Semaphore Operations, below.

Any process with read permission can test whether a semaphore has a zero value by supplying a 0 in the sem_op field of the operation structure. Operations to increment or decrement a semaphore require alter permission (that is, write permission).

If any of the requested operations cannot be performed, none of the semaphores is altered. The process will block (unless the IPC_NOWAIT flag is set), and will remain blocked until one of the following occurs:

□ the semaphore operations can all complete, in which case the call succeeds,
□ the process receives a signal, or
□ the semaphore set is removed.

If a nonblocking semaphore operation fails, the call returns -1 and sets errno appropriately.

Only one process can update a semaphore set at any given time. Simultaneous requests by different processes are performed in an arbitrary order. When an array of operations is given by a semop() call, the updates are made atomically. That is, no updates are committed until all operations in the array can complete successfully.
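The all-or-nothing rule for an array of operations can be observed with a short experiment. The following is a minimal sketch and not one of the manual's sample programs. It assumes a system on which a private semaphore set can be created with IPC_PRIVATE, and it declares its own union because many current systems leave that declaration to the caller. The names demo_semun and atomic_semop_demo are hypothetical, and the code uses ANSI prototypes rather than the older style of this chapter's listings.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical name: our own copy of the semctl() argument union, since
   many current systems do not declare union semun in <sys/sem.h>. */
union demo_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Returns 0 if a failing nonblocking semop() left both semaphores
 * untouched, 1 if something else was observed, -1 on a setup error.
 */
int atomic_semop_demo(void)
{
    unsigned short init[2] = { 1, 0 };  /* starting values */
    unsigned short now[2];              /* values after the semop() */
    struct sembuf ops[2];
    union demo_semun arg;
    int semid, rc, saved_errno, result;

    semid = semget(IPC_PRIVATE, 2, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    arg.array = init;
    if (semctl(semid, 0, SETALL, arg) == -1) {
        (void) semctl(semid, 0, IPC_RMID, arg);
        return -1;
    }

    /* Operation 0 could complete (1 -> 0); operation 1 cannot (0 -> -1). */
    ops[0].sem_num = 0; ops[0].sem_op = -1; ops[0].sem_flg = IPC_NOWAIT;
    ops[1].sem_num = 1; ops[1].sem_op = -1; ops[1].sem_flg = IPC_NOWAIT;

    rc = semop(semid, ops, 2);
    saved_errno = errno;

    arg.array = now;
    if (semctl(semid, 0, GETALL, arg) == -1)
        result = -1;
    else if (rc == -1 && saved_errno == EAGAIN
             && now[0] == 1 && now[1] == 0)
        result = 0;     /* neither semaphore was changed */
    else
        result = 1;

    (void) semctl(semid, 0, IPC_RMID, arg);  /* remove the set */
    return result;
}
```

Because the second operation cannot complete and IPC_NOWAIT is set, the call as a whole is expected to fail, and the first semaphore keeps its value of 1 even though its own decrement would have been possible.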
Once a process performs an operation on a semaphore, the system does not keep track of whether or not that operation has been undone. If a process with exclusive use of a semaphore terminates abnormally and neglects to undo the operation or free the semaphore, the semaphore will remain locked in memory. To prevent this, semop() accepts the SEM_UNDO control flag. When this flag is in effect, semop() allocates an undo structure for each semaphore operation. That structure contains the operation needed to return the semaphore to its previous state. When the process dies, the system applies the operations in the undo structures. That way an aborted process need not leave a semaphore set in an inconsistent state.

If processes share access to a resource controlled by a semaphore, operations on the semaphore should not be made with SEM_UNDO in effect. If the process that currently has control of the resource terminates abnormally, the resource is presumed to be inconsistent. Another process must be able to recognize this in order to restore the resource to a consistent state.

When performing a semaphore operation with SEM_UNDO in effect, you must also have it in effect for the call that performs the reversing operation. When the process runs normally, the reversing operation updates the undo structure with a complementary value. This ensures that, unless the process is aborted, the values applied to the undo structure will eventually cancel out to zero. When the undo structure reaches zero, it is removed. Using SEM_UNDO inconsistently can lead to undue resource consumption, since undo structures that are allocated may not be freed until the system is rebooted.

Structure of a Semaphore Set

A semaphore set is composed of a control structure with a unique ID, along with an array of semaphores. The identifier for the semaphore or array is referred to as the semid.
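Returning to the SEM_UNDO flag discussed earlier in this section, its effect can be seen by letting a child process exit while it still holds a semaphore. This is a minimal sketch under stated assumptions rather than part of the original listings: it assumes fork() and a private semaphore set are available, and the names demo_semun and value_after_child_exit are hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical name: our own copy of the semctl() argument union. */
union demo_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Creates a one-element set with value 1, lets a child decrement it with
 * the given sem_flg and exit without reversing the operation, and returns
 * the value the parent sees afterwards (-1 on error).
 */
int value_after_child_exit(int flags)
{
    union demo_semun arg;
    struct sembuf op;
    int semid, status, value;
    pid_t pid;

    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    arg.val = 1;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        (void) semctl(semid, 0, IPC_RMID, arg);
        return -1;
    }

    pid = fork();
    if (pid == 0) {
        /* Child: take the semaphore, then exit while still holding it. */
        op.sem_num = 0;
        op.sem_op = -1;
        op.sem_flg = (short) flags;
        (void) semop(semid, &op, 1);
        _exit(0);
    }

    value = -1;
    if (pid > 0 && waitpid(pid, &status, 0) == pid)
        value = semctl(semid, 0, GETVAL, arg);

    (void) semctl(semid, 0, IPC_RMID, arg);  /* remove the set */
    return value;
}
```

Called with SEM_UNDO, the function is expected to return 1, because the system reverses the child's decrement when the child exits. Called with 0, it is expected to return 0, because nothing undoes the operation.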
Figure 3-12 Structure of a Semaphore

The control structure for the semaphore contains the following information:

□ The permissions structure.
□ A pointer to the first semaphore in the array.
□ The number of semaphores in the array.
□ The time of the last operation on any semaphore in the array.
□ The time of the last update to any semaphore in the array.

Each semaphore structure in the array contains the following information:

□ The semaphore value.
□ The PID of the process performing the last successful operation.
□ The number of processes waiting for the semaphore to increase.
□ The number of processes waiting for the semaphore to reach zero.

The control structure is defined in the <sys/sem.h> header file:

    struct semid_ds {
        struct ipc_perm sem_perm;   /* permission struct */
        struct sem      *sem_base;  /* ptr to first semaphore in set */
        ushort          sem_nsems;  /* # of semaphores in set */
        time_t          sem_otime;  /* last semop time */
        time_t          sem_ctime;  /* last change time */
    };

The sem_perm member of this structure uses ipc_perm (defined in <sys/ipc.h>) as a template.

The semaphore structure is defined as:

    struct sem {
        ushort semval;      /* semaphore text map address */
        short  sempid;      /* pid of last operation */
        ushort semncnt;     /* # awaiting semval > cval */
        ushort semzcnt;     /* # awaiting semval = 0 */
    };

in that header file as well.

Initializing a Semaphore Set with semget()

The semget() system call is used to initialize or gain access to a semaphore set. When the call succeeds, it returns the semaphore ID (semid).
When the call fails, it returns -1, and sets the external variable errno to the appropriate error code. semget() has the following synopsis:

Figure 3-13 Synopsis of semget()

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>

    int semget(key, nsems, semflg)
    key_t key;
    int nsems, semflg;

As noted above, the key argument is a value associated with the semaphore ID. The nsems argument specifies the number of elements in a semaphore array. The call fails if nsems is greater than the number of elements in an existing array; when the correct count is not known, supplying 0 for this argument assures that it will succeed. The semflg argument is used to specify the initial access permissions and creation control flags.

The SEMMNI system configuration option determines the maximum number of semaphore arrays allowed. The SEMMNS option determines the maximum possible number of individual semaphores across all semaphore sets. semget() fails when one of these limits would be exceeded. Due to fragmentation between semaphore sets, you may not be able to allocate all available semaphores.

The following program illustrates the semget() system call. It begins by prompting for a hexadecimal key, an octal permissions code, and control command combinations selected from a menu. All possible combinations are allowed. It then requests the number of semaphores in the array, and issues the system call to initialize the array. If the call succeeds, the program displays the semaphore ID returned. Otherwise, it displays an error message.

Figure 3-14 Sample Program to Illustrate semget()

/*
** semget.c: Illustrate the semget() system call.
**
** This is a simple exerciser of the semget() system call.
** It prompts for the arguments, makes the call, and reports the
** results.
*/

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

extern void exit();
extern void perror();

main()
{
    key_t key;      /* key to be passed to semget() */
    int semflg;     /* semflg to be passed to semget() */
    int nsems;      /* nsems to be passed to semget() */
    int semid;      /* return value from semget() */

    (void) fprintf(stderr,
        "All numeric input is expected to follow C conventions:\n");
    (void) fprintf(stderr, "\t0x... is interpreted as hexadecimal,\n");
    (void) fprintf(stderr, "\t0... is interpreted as octal,\n");
    (void) fprintf(stderr, "\totherwise, decimal.\n");

    (void) fprintf(stderr, "IPC_PRIVATE == %#lx\n", IPC_PRIVATE);
    (void) fprintf(stderr, "Enter desired key: ");
    (void) scanf("%li", &key);
    (void) fprintf(stderr, "Enter desired nsems value: ");
    (void) scanf("%i", &nsems);

    (void) fprintf(stderr, "\nExpected flags for semflg are:\n");
    (void) fprintf(stderr, "\tIPC_EXCL = \t%#8.8o\n", IPC_EXCL);
    (void) fprintf(stderr, "\tIPC_CREAT = \t%#8.8o\n", IPC_CREAT);
    (void) fprintf(stderr, "\towner read = \t%#8.8o\n", 0400);
    (void) fprintf(stderr, "\towner alter = \t%#8.8o\n", 0200);
    (void) fprintf(stderr, "\tgroup read = \t%#8.8o\n", 040);
    (void) fprintf(stderr, "\tgroup alter = \t%#8.8o\n", 020);
    (void) fprintf(stderr, "\tother read = \t%#8.8o\n", 04);
    (void) fprintf(stderr, "\tother alter = \t%#8.8o\n", 02);
    (void) fprintf(stderr, "Enter desired semflg value: ");
    (void) scanf("%i", &semflg);

    (void) fprintf(stderr, "\nsemget: Calling semget(%#lx, %d, %#o)\n",
        key, nsems, semflg);
    if ((semid = semget(key, nsems, semflg)) == -1) {
        perror("semget: semget failed");
        exit(1);
    } else {
        (void) fprintf(stderr, "semget: semget succeeded: semid = %d\n",
            semid);
        exit(0);
    }
    /*NOTREACHED*/
}

Controlling Semaphores with semctl()

The semctl() system call allows a process to alter permissions and other characteristics of a semaphore set.
Its synopsis is as follows:

Figure 3-15 Synopsis of semctl()

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>

    int semctl(semid, semnum, cmd, arg)
    int semid, cmd;
    int semnum;
    union semun {
        int val;
        struct semid_ds *buf;
        ushort *array;
    } arg;

semid is a valid semaphore ID. semnum is used to select a semaphore within an array by its index. The cmd argument is one of the following control flags. What you supply for arg depends upon the control flag given in cmd.

GETVAL      Return the value of a single semaphore.
SETVAL      Set the value of a single semaphore. In this case, arg is taken as arg.val, an int.
GETPID      Return the PID of the process that performed the last operation on the semaphore or array.
GETNCNT     Return the number of processes waiting for the value of a semaphore to increase.
GETZCNT     Return the number of processes waiting for the value of a particular semaphore to reach zero.
GETALL      Return the values for all semaphores in a set. In this case, arg is taken as arg.array, a pointer to an array of unsigned shorts.
SETALL      Set values for all semaphores in a set. In this case, arg is taken as arg.array, a pointer to an array of unsigned shorts.
IPC_STAT    Return the status information contained in the control structure for the semaphore set, and place it in the data structure pointed to by arg.buf, a pointer to a buffer of type semid_ds.
IPC_SET     Set the effective user/group identification and permissions. In this case, arg is taken as arg.buf.
IPC_RMID    Remove the specified semaphore set.

A process must have an effective user identification of OWNER/CREATOR or super-user to perform an IPC_SET or IPC_RMID command. Read/write permission is required as applicable for the other control commands.

The following program illustrates semctl().

Figure 3-16 Sample Program to Illustrate semctl()

/*
** semctl.c: Illustrate the semctl() system call.
**
** This is a simple exerciser of the semctl() system call. It
** allows you to perform one control operation on one semaphore set.
** It gives up immediately if any control operation fails, so be careful not
** to set permissions to preclude read permission; you won't be able to reset
** the permissions with this code if you do.
*/

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <time.h>

struct semid_ds semid_ds;

static void do_semctl();
static void do_stat();
extern char *malloc();
extern void exit();
extern void perror();

char warning_message[] =
    "If you remove read permission for yourself, this program will fail frequently!";

main()
{
    union semun arg;    /* union to be passed to semctl() */
    int cmd,            /* command to be given to semctl() */
        i,              /* work area */
        semid,          /* semid to be passed to semctl() */
        semnum;         /* semnum to be passed to semctl() */

    (void) fprintf(stderr,
        "All numeric input is expected to follow C conventions:\n");
    (void) fprintf(stderr, "\t0x... is interpreted as hexadecimal,\n");
    (void) fprintf(stderr, "\t0... is interpreted as octal,\n");
    (void) fprintf(stderr, "\totherwise, decimal.\n");

    (void) fprintf(stderr, "Enter desired semid value: ");
    (void) scanf("%i", &semid);

    (void) fprintf(stderr, "Valid semctl cmd values are:\n");
    (void) fprintf(stderr, "\tGETALL = %d\n", GETALL);
    (void) fprintf(stderr, "\tGETNCNT = %d\n", GETNCNT);
    (void) fprintf(stderr, "\tGETPID = %d\n", GETPID);
    (void) fprintf(stderr, "\tGETVAL = %d\n", GETVAL);
    (void) fprintf(stderr, "\tGETZCNT = %d\n", GETZCNT);
    (void) fprintf(stderr, "\tIPC_RMID = %d\n", IPC_RMID);
    (void) fprintf(stderr, "\tIPC_SET = %d\n", IPC_SET);
    (void) fprintf(stderr, "\tIPC_STAT = %d\n", IPC_STAT);
    (void) fprintf(stderr, "\tSETALL = %d\n", SETALL);
    (void) fprintf(stderr, "\tSETVAL = %d\n", SETVAL);
    (void) fprintf(stderr, "\nEnter desired cmd: ");
    (void) scanf("%i", &cmd);

    /* Perform some setup operations needed by multiple commands. */
    switch (cmd) {
    case GETVAL:
    case SETVAL:
    case GETNCNT:
    case GETZCNT:
        /* Get the semaphore number for these commands. */
        (void) fprintf(stderr, "\nEnter desired semnum value: ");
        (void) scanf("%i", &semnum);
        break;
    case GETALL:
    case SETALL:
        /* Allocate a buffer for the semaphore values. */
        (void) fprintf(stderr,
            "Get number of semaphores in the set.\n");
        arg.buf = &semid_ds;
        do_semctl(semid, 0, IPC_STAT, arg);
        if (arg.array = (ushort *) malloc((unsigned)
            (semid_ds.sem_nsems * sizeof(ushort)))) {
            /* Break out if we got what we needed. */
            break;
        }
        (void) fprintf(stderr,
            "semctl: unable to allocate space for %d values\n",
            semid_ds.sem_nsems);
        exit(2);
        /*NOTREACHED*/
    }

    /* Get the rest of the arguments needed for the specified command. */
    switch (cmd) {
    case SETVAL:
        /* Set value of one semaphore. */
        (void) fprintf(stderr, "\nEnter desired semaphore value: ");
        (void) scanf("%i", &arg.val);
        do_semctl(semid, semnum, SETVAL, arg);
        /* Fall through to verify the result. */
        (void) fprintf(stderr,
            "Perform semctl GETVAL command to verify results.\n");
    case GETVAL:
        /* Get value of one semaphore. */
        arg.val = 0;
        do_semctl(semid, semnum, GETVAL, arg);
        break;
    case GETPID:
        /* Get PID of last process to successfully complete a
           semctl(SETVAL), semctl(SETALL), or semop() on the
           semaphore. */
        arg.val = 0;
        do_semctl(semid, 0, GETPID, arg);
        break;
    case GETNCNT:
        /* Get number of processes waiting for semaphore value to
           increase. */
        arg.val = 0;
        do_semctl(semid, semnum, GETNCNT, arg);
        break;
    case GETZCNT:
        /* Get number of processes waiting for semaphore value
           to become zero. */
        arg.val = 0;
        do_semctl(semid, semnum, GETZCNT, arg);
        break;
    case SETALL:
        /* Set the values of all semaphores in the set. */
        (void) fprintf(stderr, "There are %d semaphores in the set.\n",
            semid_ds.sem_nsems);
        (void) fprintf(stderr, "Enter desired semaphore values:\n");
        for (i = 0; i < semid_ds.sem_nsems; i++) {
            (void) fprintf(stderr, "Semaphore %d: ", i);
            (void) scanf("%hi", &arg.array[i]);
        }
        do_semctl(semid, 0, SETALL, arg);
        /* Fall through to verify the results. */
        (void) fprintf(stderr,
            "Perform semctl GETALL command to verify results.\n");
    case GETALL:
        /* Get and print the values of all semaphores in the set. */
        do_semctl(semid, 0, GETALL, arg);
        (void) fprintf(stderr,
            "The values of the %d semaphores are:\n",
            semid_ds.sem_nsems);
        for (i = 0; i < semid_ds.sem_nsems; i++)
            (void) fprintf(stderr, "%d ", arg.array[i]);
        (void) fprintf(stderr, "\n");
        break;
    case IPC_SET:
        /* Modify mode and/or ownership. */
        arg.buf = &semid_ds;
        do_semctl(semid, 0, IPC_STAT, arg);
        (void) fprintf(stderr, "Status before IPC_SET:\n");
        do_stat();
        (void) fprintf(stderr, "Enter desired sem_perm.uid value: ");
        (void) scanf("%hi", &semid_ds.sem_perm.uid);
        (void) fprintf(stderr, "Enter desired sem_perm.gid value: ");
        (void) scanf("%hi", &semid_ds.sem_perm.gid);
        (void) fprintf(stderr, "%s\n", warning_message);
        (void) fprintf(stderr, "Enter desired sem_perm.mode value: ");
        (void) scanf("%hi", &semid_ds.sem_perm.mode);
        do_semctl(semid, 0, IPC_SET, arg);
        /* Fall through to verify changes. */
        (void) fprintf(stderr, "Status after IPC_SET:\n");
    case IPC_STAT:
        /* Get and print current status. */
        arg.buf = &semid_ds;
        do_semctl(semid, 0, IPC_STAT, arg);
        do_stat();
        break;
    case IPC_RMID:
        /* Remove the semaphore set. */
        arg.val = 0;
        do_semctl(semid, 0, IPC_RMID, arg);
        break;
    default:
        /* Pass unknown command to semctl. */
        arg.val = 0;
        do_semctl(semid, 0, cmd, arg);
        break;
    }
    exit(0);
    /*NOTREACHED*/
}

/*
** Print indication of arguments being passed to semctl(), call semctl(),
** and report the results.
** If semctl() fails, do not return; this example doesn't deal with
** errors, it just reports them.
*/
static void
do_semctl(semid, semnum, cmd, arg)
union semun arg;
int cmd,
    semid,
    semnum;
{
    register int i;    /* work area */

    (void) fprintf(stderr, "\nsemctl: Calling semctl(%d, %d, %d, ",
        semid, semnum, cmd);
    switch (cmd) {
    case GETALL:
        (void) fprintf(stderr, "arg.array = %#x)\n", arg.array);
        break;
    case IPC_STAT:
    case IPC_SET:
        (void) fprintf(stderr, "arg.buf = %#x)\n", arg.buf);
        break;
    case SETALL:
        (void) fprintf(stderr, "arg.array = [");
        for (i = 0; i < semid_ds.sem_nsems; ) {
            (void) fprintf(stderr, "%d", arg.array[i++]);
            if (i < semid_ds.sem_nsems)
                (void) fprintf(stderr, ", ");
        }
        (void) fprintf(stderr, "])\n");
        break;
    case SETVAL:
    default:
        (void) fprintf(stderr, "arg.val = %d)\n", arg.val);
        break;
    }
    i = semctl(semid, semnum, cmd, arg);
    if (i == -1) {
        perror("semctl: semctl failed");
        exit(1);
        /* NOTREACHED */
    }
    (void) fprintf(stderr, "semctl: semctl returned %d\n", i);
    return;
}

/*
** Display contents of commonly used pieces of the status structure.
*/
static void
do_stat()
{
    (void) fprintf(stderr, "sem_perm.uid = %d\n",
        semid_ds.sem_perm.uid);
    (void) fprintf(stderr, "sem_perm.gid = %d\n",
        semid_ds.sem_perm.gid);
    (void) fprintf(stderr, "sem_perm.cuid = %d\n",
        semid_ds.sem_perm.cuid);
    (void) fprintf(stderr, "sem_perm.cgid = %d\n",
        semid_ds.sem_perm.cgid);
    (void) fprintf(stderr, "sem_perm.mode = %#o, ",
        semid_ds.sem_perm.mode);
    (void) fprintf(stderr, "access permissions = %#o\n",
        semid_ds.sem_perm.mode & 0777);
    (void) fprintf(stderr, "sem_nsems = %d\n", semid_ds.sem_nsems);
    (void) fprintf(stderr, "sem_otime = %s", semid_ds.sem_otime ?
        ctime(&semid_ds.sem_otime) : "Not Set\n");
    (void) fprintf(stderr, "sem_ctime = %s",
        ctime(&semid_ds.sem_ctime));
}

Performing Semaphore Operations with semop()

The semop() system call is used to perform operations on a semaphore set. Its synopsis is as follows:

Figure 3-17 Synopsis of semop()

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>

    int semop(semid, sops, nsops)
    int semid;
    struct sembuf *sops;
    unsigned nsops;

The semid argument is the semaphore ID that was returned by a previous semget() call. The sops argument is a pointer to an array of structures, each of which contains the following information about a semaphore operation:

□ The semaphore number.
□ The operation to be performed.
□ Control flags, if any.

sembuf is the structure of each element in the array, as defined in the <sys/sem.h> header file. The nsops argument specifies the length of the array, the maximum size of which is determined by the SEMOPM configuration option; this is the maximum number of operations allowed by a single semop() call, 100 by default.

The operation to be performed is determined as follows:

□ A positive integer means to increment the semaphore value by that amount.
□ A negative integer means to decrement the semaphore value by the absolute value of that amount.
  However, a semaphore can never take on a negative value. An attempt to set a semaphore to a value below zero will either fail or block, depending on whether or not IPC_NOWAIT is in effect.
□ A value of zero means to wait for the semaphore value to reach zero.

The following control flags can be used with semop():

IPC_NOWAIT  This flag can be set for any operation in the array. The system call will return unsuccessfully without changing any semaphore values at all if any operation for which IPC_NOWAIT is set cannot be performed successfully. The system call will be unsuccessful when trying to decrement a semaphore by more than its current value, or when testing for a semaphore to be equal to zero when it is not.
SEM_UNDO    This flag allows individual operations in the array to be undone when the process exits.

The following program illustrates the semop() system call.

Figure 3-18 Sample Program to Illustrate semop()

/*
** semop.c: Illustrate the semop() system call.
**
** This is a simple exerciser of the semop() system call. It allows
** you to set up arguments for semop(), make the call, and reports the
** results repeatedly on one semaphore set. You must have read
** permission on the semaphore set or this exerciser will fail. (It needs
** read permission to get the number of semaphores in the set and report
** their values before and after calls to semop().)
*/

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

static int ask();
extern void exit();
extern void free();
extern char *malloc();
extern void perror();

static struct semid_ds semid_ds;    /* status of semaphore set */

static char error_mesg1[] =
    "semop: Can't allocate space for %d semaphore values. Giving up.\n";
static char error_mesg2[] =
    "semop: Can't allocate space for %d sembuf structures. Giving up.\n";

main()
{
    register int i;         /* work area */
    int nsops;              /* number of operations to be performed */
    int semid;              /* semid of semaphore set */
    struct sembuf *sops;    /* ptr to operations to be performed */

    (void) fprintf(stderr,
        "All numeric input is expected to follow C conventions:\n");
    (void) fprintf(stderr, "\t0x... is interpreted as hexadecimal,\n");
    (void) fprintf(stderr, "\t0... is interpreted as octal,\n");
    (void) fprintf(stderr, "\totherwise, decimal.\n");

    /* Loop until the invoker doesn't want to do anymore. */
    while (nsops = ask(&semid, &sops)) {
        /* Initialize the array of operations to be performed. */
        for (i = 0; i < nsops; i++) {
            (void) fprintf(stderr,
                "\nEnter desired values for operation %d of %d.\n",
                i + 1, nsops);
            (void) fprintf(stderr,
                "sem_num (valid values are 0 <= sem_num < %d): ",
                semid_ds.sem_nsems);
            (void) scanf("%hi", &sops[i].sem_num);
            (void) fprintf(stderr, "sem_op: ");
            (void) scanf("%hi", &sops[i].sem_op);
            (void) fprintf(stderr,
                "Expected flags in sem_flg are:\n");
            (void) fprintf(stderr, "\tIPC_NOWAIT =\t%#6.6o\n",
                IPC_NOWAIT);
            (void) fprintf(stderr, "\tSEM_UNDO =\t%#6.6o\n",
                SEM_UNDO);
            (void) fprintf(stderr, "sem_flg: ");
            (void) scanf("%hi", &sops[i].sem_flg);
        }

        /* Recap the call to be made. */
        (void) fprintf(stderr,
            "\nsemop: Calling semop(%d, &sops, %d) with:",
            semid, nsops);
        for (i = 0; i < nsops; i++) {
            (void) fprintf(stderr, "\nsops[%d].sem_num = %d, ",
                i, sops[i].sem_num);
            (void) fprintf(stderr, "sem_op = %d, ", sops[i].sem_op);
            (void) fprintf(stderr, "sem_flg = %#o\n",
                sops[i].sem_flg);
        }

        /* Make the semop() call and report the results. */
        if ((i = semop(semid, sops, nsops)) == -1) {
            perror("semop: semop failed");
        } else {
            (void) fprintf(stderr, "semop: semop returned %d\n", i);
        }
    }
    /*NOTREACHED*/
}

/*
** Ask user if (s)he wants to continue.
**
** On the first call:
** Get the semid to be processed and supply it to the caller.
** On each call:
** 1. Print current semaphore values.
** 2. Ask user how many operations are to be performed on next call to
**    semop. Allocate an array of sembuf structures sufficient for the
**    job and set caller supplied pointer to that array. (The array
**    is reused on subsequent calls as long as it is big enough. If
**    it isn't big enough, it is freed and a larger array is allocated.)
*/
static
ask(semidp, sopsp)
int *semidp;    /* pointer to semid (only used first time) */
struct sembuf **sopsp;
{
    static union semun arg;         /* argument to semctl */
    int i;                          /* work area */
    static int nsops = 0;           /* size of currently allocated
                                       sembuf array */
    static int semid = -1;          /* semid supplied by user */
    static struct sembuf *sops;     /* pointer to allocated array */

    if (semid < 0) {
        /* First call; get semid from user and the current state of
           the semaphore set. */
        (void) fprintf(stderr,
            "Enter semid of the semaphore set you want to use: ");
        (void) scanf("%i", &semid);
        *semidp = semid;
        arg.buf = &semid_ds;
        if (semctl(semid, 0, IPC_STAT, arg) == -1) {
            perror("semop: semctl(IPC_STAT) failed");
            /* Note that if semctl fails, semid_ds remains filled
               with zeroes, so later test for number of semaphores
               will be zero. */
            (void) fprintf(stderr,
                "Before and after values will not be printed.\n");
        } else {
            if ((arg.array = (ushort *) malloc((unsigned)
                (sizeof(ushort) * semid_ds.sem_nsems))) == NULL) {
                (void) fprintf(stderr, error_mesg1,
                    semid_ds.sem_nsems);
                exit(1);
            }
        }
    }

    /* Print current semaphore values. */
    if (semid_ds.sem_nsems) {
        (void) fprintf(stderr,
            "There are %d semaphores in the set.\n",
            semid_ds.sem_nsems);
        if (semctl(semid, 0, GETALL, arg) == -1) {
            perror("semop: semctl(GETALL) failed");
        } else {
            (void) fprintf(stderr, "Current semaphore values are:");
            for (i = 0; i < semid_ds.sem_nsems;
                (void) fprintf(stderr, " %d", arg.array[i++]))
                ;
            (void) fprintf(stderr, "\n");
        }
    }

    /* Find out how many operations are going to be done in the next
       call and allocate enough space to do it. */
    (void) fprintf(stderr,
        "How many semaphore operations do you want %s\n",
        "on the next call to semop()?");
    (void) fprintf(stderr, "Enter 0 or control-D to quit: ");
    i = 0;
    if (scanf("%i", &i) == EOF || i == 0)
        exit(0);
    if (i > nsops) {
        if (nsops)
            free((char *) sops);
        nsops = i;
        if ((sops = (struct sembuf *) malloc((unsigned)
            (nsops * sizeof(struct sembuf)))) == NULL) {
            (void) fprintf(stderr, error_mesg2, nsops);
            exit(2);
        }
    }
    *sopsp = sops;
    return (i);
}

3.5. Shared Memory

In the SunOS operating system, the most efficient method for implementing shared memory applications is to rely on native virtual memory management and the mmap(2) system call. For shared memory applications that are to be compatible with System V, the SunOS operating system also provides the standard System V shared memory facilities.

Shared memory allows more than one process at a time to attach a segment of physical memory to its virtual address space. When write access is allowed for more than one process, an outside protocol or mechanism such as a semaphore can be used to prevent inconsistencies and collisions.

Using System V shared memory, a process creates a shared memory segment using the shmget(2) system call. This call can also be used to obtain the ID of an existing shared segment. The creating process sets the permissions and the size in bytes for the segment.

The original owner/creator of a shared memory segment can assign ownership to another user with the shmctl(2) system call; it can also revoke this assignment. Other processes with proper permission can perform various control functions on the shared memory segment using shmctl().
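The life cycle outlined above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not one of the manual's sample programs: it creates a private segment with shmget(), lets a child process attach it with shmat(2) and write a short string, then reads the string back in the parent and removes the segment with shmctl(). The names shm_roundtrip and DEMO_SEG_SIZE are hypothetical. Waiting for the child with waitpid() stands in for the synchronization that a real application would get from a semaphore or similar protocol.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_SEG_SIZE 64    /* arbitrary segment size for this sketch */

/*
 * Copies the string written by a child process through a shared
 * segment into out. Returns 0 on success, -1 on any failure.
 */
int shm_roundtrip(char *out, size_t outlen)
{
    int shmid, status, rc = -1;
    char *addr;
    pid_t pid;

    if (outlen == 0)
        return -1;

    /* Create a new segment readable and writable by its owner. */
    shmid = shmget(IPC_PRIVATE, DEMO_SEG_SIZE, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    pid = fork();
    if (pid == 0) {
        /* Child: attach, write, detach. */
        addr = (char *) shmat(shmid, NULL, 0);
        if (addr == (char *) -1)
            _exit(1);
        strncpy(addr, "hello", DEMO_SEG_SIZE - 1);
        addr[DEMO_SEG_SIZE - 1] = '\0';
        (void) shmdt(addr);
        _exit(0);
    }

    /* Parent: wait for the writer, then attach read-only. */
    if (pid > 0 && waitpid(pid, &status, 0) == pid
        && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        addr = (char *) shmat(shmid, NULL, SHM_RDONLY);
        if (addr != (char *) -1) {
            strncpy(out, addr, outlen - 1);
            out[outlen - 1] = '\0';
            (void) shmdt(addr);
            rc = 0;
        }
    }

    (void) shmctl(shmid, IPC_RMID, NULL);   /* remove the segment */
    return rc;
}
```

The parent attaches with SHM_RDONLY only to show that the access requested at attach time can be narrower than the permissions set when the segment was created.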
Once created, a shared segment can be attached to a process's address space using the shmat() system call; it can be detached using shmdt(). (See shmop(2) for details.) The attaching process must have the appropriate permissions for shmat() to succeed. Once attached, the process can read or write to the segment, as allowed by the permission requested in the attach operation. A shared segment may be attached multiple times by the same process.

If any of the above-mentioned system calls fails, it returns -1, and sets the external variable errno to the appropriate value.

Structure of a Shared Memory Segment

A shared memory segment is composed of a control structure with a unique ID that points to an area of physical memory. The identifier for the segment is referred to as the shmid.

Figure 3-19 Structure of a Shared Memory Segment

The data structure includes the following information about the memory segment:

□ Access permissions.
□ Segment size.
□ The PID of the process performing last operation.
□ The PID of the creator process.
□ The current number of processes to which the segment is attached.
□ The time of the last attachment.
□ The time of the last detachment.
□ The time of the last change to the segment.
□ Memory map segment descriptor pointer.

The structure definition for the shared memory segment control structure can be found in <sys/shm.h>. This structure definition is shown below.

    /*
     * There is a shared mem id data structure for each segment in the system.
*/ struct shmid ds { struct ipc_perm shm_perm; /* operation permission struct */ uint shm segsz; /* size of segment in bytes */ ushort shm_lpid; /* pid of last shmop */ ushort shm cpid; /* pid of creator */ ushort shm nattch; /* number of current attaches */ time_t shm atime; /* last shmat time */ time t shm_dtime; /* last shmdt time */ time t shm ctime; /* last change time */ struct anon_map *shm_amp; /* segment anon_map pointer */ Note that the shm_perm member of this structure uses ipc_perm as a tem- plate, as defined in . Using shmget ( ) to Get Access to a Shared Memory Segment The shmget ( ) system call is used to obtain access to a shared memory seg- ment. When the call succeeds, it returns the shared memory segment ID (shmid). When it fails, it returns -1, and sets errno to the appropriate error code, shmget ( ) has the following synopsis: Figure 3-20 Synopsis of shmget ( ) r 1 ♦include ♦include ♦include int shmget (key, size, shmflg) key_t key; int size, shmflg; — y The value passed as the shmf lg argument must be an integer, which incor- porates settings for the segment’s permissions and control flags, as described under System V IPC Permissions, above. microsystems Revision A of 27 March 1990 Chapter 3 — System V Interprocess Communication Facilities 83 The SHMMNI system configuration option determines the maximum number of shared memory segments that are allowed, 100 by default. The system call will fail if the size value is less than SHMMIN or greater than SHMMAX, the configuration options for the minimum and maximum segment sizes. By default, SHMIN is 1, SHMAX is 1048576. The following sample program illustrates the shmget ( ) system call. Figure 3-21 Sample Program to Illustrate shmget ( ) /* ** shmget. c: Illustrate the shmget () system call. ** ** This is a simple exerciser of the shmget () system call. ** It prompts for the arguments, makes the call, and reports the results. 
*/ ♦include ♦include ♦include ♦include extern void exit(); extern void perrorl) ; main ( ) { key_tkey; /* key to be passed to shmget () */ int shmflg; /* shmflg to be passed to shmget () */ int shmid; /* return value from shmget () */ int size;/* size to be passed to shmget () */ (void) fprintf (stderr, "All numeric input is expected to follow C convent ions : \n" ) ; (void) fprintf (stderr, "\t0x... is interpreted as hexadecimal, \n") ; (void) fprintf (stderr, "\t0 — is interpreted as octal, \n") ; (void) fprintf (stderr, "\totherwise, decimal. \n") ; /* Get the key. */ (void) fprintf (stderr, "IPC_PRIVATE == %^lx\n", IPC_PRIVATE) ; (void) fprintf (stderr, "Enter desired key: "); (void) scanf("%li", &key) ; /* Get the size of the segment. */ (void) fprintf (stderr, "Enter desired size: ") ; (void) scanf("%i", Ssize) ; /* Get (void) (void) (void) (void) (void) (void) (void) (void) (void) (void) (void) the shmflg value, fprintf (stderr, fprintf (stderr, fprintf (stderr, fprintf (stderr, fprintf (stderr, fprintf (stderr, fprintf (stderr, fprintf (stderr, fprintf (stderr, fprintf (stderr. */ Expected flags for the shmflg argument are:\n"), \tIPC_CREAT = \t%#8 . 8o\n", IPC_CREAT) ; \tIPC_EXCL = \t%#8 . 8o\n", IPC_EXCL) ; \towner read =\t%#8 . 8o\n", 0400); \towner write =\t%^8.8o\n", 0200); \tgroup read =\t%^8 . 8o\n", 040); \tgroup write =\t%^8 . 8o\n" , 020); \tother read =\t%^8 . 8o\n", 04); \tother write =\t%^8 . 8o\n", 02); Enter desired shmflg: ") ; scanf("%i", Sshmflg) ; /* Make the call and report the results. 
*/ (void) fprintf (stderr, "shmget: Calling shmget (%#lx, %d, %#o)\n", key, size, shmflg) ; if ( (shmid = shmget (key, size, shmflg) ) == -1) { m sun microsystems Revision A of 27 March 1990 84 Programming Utilities and Libraries perror ("shmget : shmget failed"); exit (1) ; ) else { (void) fprintf (stderr, "shmget: shmget returned %d\n", shmid); exit (0) ; } /*NOTRE ACHED*/ } V Controlling a Shared Memory The shmctl ( ) system call is used to alter the permissions and other charac- Segment with shmctl ( ) teristics of a shared memory segment. It synopsis is as follows: Figure 3-22 Synopsis of s hmct 1 ( ) r A ♦include ♦include ♦include int shmctl (shmid, cmd, buf) int shmid, cmd; struct shmid_ds *buf; v J The shmid argument is the ID of the shared memory segment as returned by shmget ( ) . The cmd argument is one of following control commands: SHM_LOCK Lock the specified shared memory segment in memory. The process must have effective ID of super-user to perform this command. SHMJJNLOCK Unlock the shared memory segment. The process must have effective ID of super-user to perform this command. IPC_STAT Return the status information contained in the control structure, and place it in the buffer pointed to by buf . The process must have read permission on the segment to perform this command. IPC_SET Set the effective user and group identification, and access permissions. The process must have an effective ID of owner, creator or super-user to perform this command. IPC_RMID Remove the shared memory segment. The process must have an effective ID of owner, creator or super-user to perform this command. The example program below allows you to illustrate shmctl ( ) . A microsystems Revision A of 27 March 1990 Chapter 3 — System V Interprocess Communication Facilities 85 break; case IPC_RMID: /* Remove the segment when the last attach point is detached. */ break; case SHM_LOCK: /* Lock the shared memory segment. 
*/ break ; case SHM_UNLOCK: /* Unlock the shared memory segment. */ break; default : /* Unknown command will be passed to shmctl. */ break; ) do_shmctl (shmid, cmd, £shmid_ds) ; exit (0) ; /*NOTRE ACHED*/ } /* ** Display the arguments being passed to shmctl(), call shmctlO, and ** report the results. ** If shmctlO fails, do not return; this example doesn't deal with ** errors, it just reports them. */ static void do_shmctl (shmid, cmd, buf) int shmid, cmd; struct shmid_ds *buf ; ( register int rtrn;/* hold area */ (void) fprintf (stderr, "shmctl: Calling shmctl(%d, %d, buf)\n", shmid, cmd) ; if (cmd == IPC_SET) { (void) fprintf (stderr, "\tbuf— >shm_perm.uid == %d\n", buf->shm_perm.uid) ; (void) fprintf (stderr, "\tbuf— >shm_perm.gid == %d\n", buf->shm_perm.gid) ; (void) fprintf (stderr, "\tbuf— >shm_perm.mode == %#o\n", buf— >shm_perm. mode) ; ) if ( (rtrn = shmctl (shmid, cmd, buf)) == —1) { perror ("shmctl: shmctl failed"); exit (1) ; ) else { (void) fprintf (stderr, "shmctl: shmctl returned %d\n", rtrn) ; ) if (cmd ! = IPC_STAT && cmd != IPC_SET) return; /* Print the current status. */ (void) fprintf (stderr, "\nCurrent status:\n"); (void) fprintf (stderr, "\tshm_perm.uid = %d\n", buf->shm_perm.uid) ; (void) fprintf (stderr, "\tshm_perm.gid = %d\n", buf->shm_perm. gid) ; (void) fprintf (stderr, "\tshm_perm.cuid = %d\n", buf->shm_perm.cuid) ; (void) fprintf (stderr, "\tshm_perm.cgid = %d\n", buf— >shm_perm . 
cgid) ; (void) fprintf (stderr, "\tshm_perm .mode = %#o\n", buf->shm_perm.mode) ; (void) fprintf (stderr, "\tshm_perm.key = %#x\n", buf->shm_perm.key) ; (void) fprintf (stderr, "\tshm_segsz = %d\n", buf— >shm_segsz) ; (void) fprintf (stderr, "\tshm_lpid = %d\n", buf->shm_lpid) ; (void) fprintf (stderr, "\tshm_cpid = %d\n", buf->shm_cpid) ; sun microsystems Revision A of 27 March 1990 Chapter 3 — System V Interprocess Communication Facilities 87 f (void) fprintf (stderr, "\tshm nattch = = %d\n" , buf— >shm nattch) ; A (void) fprintf (stderr, "\tshm atime = % s " , buf— >shm atime ? ctime (&buf->shm at ime ) : "Not Set\n" ) ; (void) fprintf (stderr, "\tshm dtime = %s". buf— >shm dtime ? ctime (Sbuf->shm dtime) : "Not Set\n") ; } (void) fprintf (stderr, "\tshm ctime = %s", ct ime ( &bu f— > shm_ct ime ) ) ; J Attaching and Detaching a Shared Memory Segment with shmat ( ) and shmdt ( ) shmat ( ) and shmdt ( ) are used to attach and detach shared memory seg- ments. Their synopses are as follows: Figure 3-24 Synopses of shmat ( ) and shmdt ( ) / — ' ♦include ♦include ♦include char *shmat (shmid, shmaddr, shmflg) int shmid; char *shmaddr; int shmflg; int shmdt (shmaddr) char * shmaddr; Upon successful completion, the shmat ( ) system call returns a pointer to the head of the shared segment; when unsuccessful, it returns ‘ (char * ) -1 ’ and sets the external variable errno to the appropriate error code. The shmid argument is the ID of an existing shared memory segment. The shmaddr argument is the address at which to attach the segment. If supplied as zero, the system provides a suitable address. For the sake of portability, it is usu- ally better to allow the system to determine the address. The shmflg argument is a control flag used to pass the SHM_RND and SHM_RDONLY flags to the shmat ( ) system call. The shmdt ( ) system call detaches the shared memory segment located at the address indicated by shmaddr. 
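As a compact illustration of attaching and detaching, the sketch below attaches one segment twice in the same process (once read/write, once with SHM_RDONLY), writes through the first mapping, and reads the data back through the second. It is an assumption-laden example rather than a listing from this manual: the name shm_roundtrip is hypothetical, the system is left to choose both attach addresses by passing a null shmaddr, and ANSI prototypes are used.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: returns 0 if `msg`, written through a
 * read/write attachment, is visible through a second, read-only
 * attachment of the same segment; returns -1 on any failure.
 */
int shm_roundtrip(const char *msg)
{
    size_t len = strlen(msg) + 1;   /* include the terminating NUL */
    char *rw, *ro;
    int shmid;
    int ok = -1;

    shmid = shmget(IPC_PRIVATE, len, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    /* A null shmaddr lets the system pick the attach address. */
    rw = (char *) shmat(shmid, NULL, 0);
    if (rw == (char *) -1) {
        (void) shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    /* Second attachment of the same segment, read-only this time. */
    ro = (char *) shmat(shmid, NULL, SHM_RDONLY);
    if (ro != (char *) -1) {
        memcpy(rw, msg, len);
        ok = (strcmp(ro, msg) == 0) ? 0 : -1;
        if (shmdt(ro) == -1)
            ok = -1;
    }

    /* Detach the writable mapping and remove the segment. */
    if (shmdt(rw) == -1)
        ok = -1;
    if (shmctl(shmid, IPC_RMID, NULL) == -1)
        ok = -1;
    return ok;
}
```

Both attachments refer to the same physical memory, so no copy is needed between the write and the read. A store through the read-only pointer would be expected to raise SIGSEGV.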
Upon successful completion, shmdt() returns zero; when unsuccessful, it returns -1 and sets the external variable errno to the appropriate error code.

The following sample program illustrates shmat() and shmdt().

Figure 3-25 Sample Program to Illustrate shmat() and shmdt()

    /*
    ** shmop.c: Illustrate the shmat() and shmdt() system calls.
    **
    ** This is a simple exerciser for the shmat() and shmdt() system
    ** calls. It allows you to attach and detach segments and to
    ** write strings into and read strings from attached segments.
    */

    #include <stdio.h>
    #include <setjmp.h>
    #include <signal.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    #define MAXnap 4    /* Maximum number of concurrent attaches. */

    static          ask();
    static void     catcher();
    extern void     exit();
    static          good_addr();
    extern void     perror();
    extern char     *shmat();

    static struct state {   /* Internal record of currently attached segments. */
        int     shmid;      /* shmid of attached segment */
        char    *shmaddr;   /* attach point */
        int     shmflg;     /* flags used on attach */
    } ap[MAXnap];           /* State of current attached segments. */

    static int      nap;        /* Number of currently attached segments. */
    static jmp_buf  segvbuf;    /* Process state save area for SIGSEGV
                                   catching. */

    main()
    {
        register int    action;         /* action to be performed */
        char            *addr;          /* address work area */
        register int    i;              /* work area */
        register struct state *p;       /* ptr to current state entry */
        void            (*savefunc)();  /* SIGSEGV state hold area */

        (void) fprintf(stderr,
            "All numeric input is expected to follow C conventions:\n");
        (void) fprintf(stderr, "\t0x... is interpreted as hexadecimal,\n");
        (void) fprintf(stderr, "\t0... is interpreted as octal,\n");
        (void) fprintf(stderr, "\totherwise, decimal.\n");
        while (action = ask()) {
            if (nap) {
                (void) fprintf(stderr,
                    "\nCurrently attached segment(s):\n");
                (void) fprintf(stderr, " shmid address\n");
                (void) fprintf(stderr, " \n");
                p = &ap[nap];
                while (p-- != ap) {
                    (void) fprintf(stderr, "%6d", p->shmid);
                    (void) fprintf(stderr, "%#11x", p->shmaddr);
                    (void) fprintf(stderr, " Read%s\n",
                        (p->shmflg & SHM_RDONLY) ? "-Only" : "/Write");
                }
            } else
                (void) fprintf(stderr,
                    "\nNo segments are currently attached.\n");
            switch (action) {
            case 1: /* Shmat requested. */
                /* Verify that we have space for another attach. */
                if (nap == MAXnap) {
                    (void) fprintf(stderr, "%s %d %s\n",
                        "This simple example will only allow",
                        MAXnap, "attached segments.");
                    break;
                }
                p = &ap[nap++];
                /* Get the arguments, make the call, report the
                   results, and update the current state array. */
                (void) fprintf(stderr,
                    "Enter shmid of segment to attach: ");
                (void) scanf("%i", &p->shmid);
                (void) fprintf(stderr, "Enter desired shmaddr: ");
                (void) scanf("%i", &p->shmaddr);
                (void) fprintf(stderr,
                    "Meaningful shmflg values are:\n");
                (void) fprintf(stderr, "\tSHM_RDONLY = \t%#8.8o\n",
                    SHM_RDONLY);
                (void) fprintf(stderr, "\tSHM_RND = \t%#8.8o\n",
                    SHM_RND);
                (void) fprintf(stderr, "Enter desired shmflg value: ");
                (void) scanf("%i", &p->shmflg);
                (void) fprintf(stderr,
                    "shmop: Calling shmat(%d, %#x, %#o)\n",
                    p->shmid, p->shmaddr, p->shmflg);
                p->shmaddr = shmat(p->shmid, p->shmaddr, p->shmflg);
                if (p->shmaddr == (char *)-1) {
                    perror("shmop: shmat failed");
                    nap--;
                } else {
                    (void) fprintf(stderr,
                        "shmop: shmat returned %#8.8x\n", p->shmaddr);
                }
                break;
            case 2: /* Shmdt requested. */
                /* Get the address, make the call, report the results,
                   and make the internal state match. */
                (void) fprintf(stderr,
                    "Enter desired detach shmaddr: ");
                (void) scanf("%i", &addr);
                i = shmdt(addr);
                if (i == -1) {
                    perror("shmop: shmdt failed");
                } else {
                    (void) fprintf(stderr,
                        "shmop: shmdt returned %d\n", i);
                    for (p = ap, i = nap; i--; p++) {
                        if (p->shmaddr == addr)
                            *p = ap[--nap];
                    }
                }
                break;
            case 3: /* Read from segment requested. */
                if (nap == 0)
                    break;
                (void) fprintf(stderr, "Enter address of an %s",
                    "attached segment: ");
                (void) scanf("%i", &addr);
                if (good_addr(addr))
                    (void) fprintf(stderr, "String @ %#x is '%s'\n",
                        addr, addr);
                break;
            case 4: /* Write to segment requested. */
                if (nap == 0)
                    break;
                (void) fprintf(stderr, "Enter address of an %s",
                    "attached segment: ");
                (void) scanf("%i", &addr);
                /* Set up SIGSEGV catch routine to trap attempts to
                   write into a read-only attached segment. */
                savefunc = signal(SIGSEGV, catcher);
                if (setjmp(segvbuf)) {
                    (void) fprintf(stderr, "shmop: %s: %s\n",
                        "SIGSEGV signal caught", "Write aborted.");
                } else {
                    if (good_addr(addr)) {
                        (void) fflush(stdin);
                        (void) fprintf(stderr, "%s %s %#x:\n",
                            "Enter one line to be copied",
                            "to shared segment attached @", addr);
                        (void) gets(addr);
                    }
                }
                (void) fflush(stdin);
                /* Restore SIGSEGV to previous condition. */
                (void) signal(SIGSEGV, savefunc);
                break;
            }
        }
        exit(0);
        /*NOTREACHED*/
    }

    /*
    ** Ask for next action.
    */
    static
    ask()
    {
        int response;   /* user response */

        do {
            (void) fprintf(stderr, "Your options are:\n");
            (void) fprintf(stderr, "\t^D = exit\n");
            (void) fprintf(stderr, "\t 0 = exit\n");
            (void) fprintf(stderr, "\t 1 = shmat\n");
            (void) fprintf(stderr, "\t 2 = shmdt\n");
            (void) fprintf(stderr, "\t 3 = read from segment\n");
            (void) fprintf(stderr, "\t 4 = write to segment\n");
            (void) fprintf(stderr,
                "Enter the number corresponding to your choice: ");
            /* Preset response so "^D" will be interpreted as exit. */
            response = 0;
            (void) scanf("%i", &response);
        } while (response < 0 || response > 4);
        return (response);
    }

    /*
    ** Catch signal caused by attempt to write into shared memory segment
    ** attached with SHM_RDONLY flag set.
    */
    /*ARGSUSED*/
    static void
    catcher(sig)
    {
        longjmp(segvbuf, 1);
        /*NOTREACHED*/
    }

    /*
    ** Verify that given address is the address of an attached segment.
    ** Return 1 if address is valid; 0 if not.
    */
    static
    good_addr(address)
    char *address;
    {
        register struct state *p;   /* ptr to state of attached segment */

        for (p = ap; p != &ap[nap]; p++)
            if (p->shmaddr == address)
                return (1);
        return (0);
    }

Chapter 4  SCCS - Source Code Control System

4.1. Introduction

Coordinating write access to source files is important when changes may be made by several people. Maintaining a record of updates allows you to determine when and why changes were made.

The Source Code Control System (SCCS) allows you to control write access to source files, and to monitor changes made to those files. SCCS allows only one user at a time to update a file, and records all changes in a history file.

SCCS allows you to:

□ Retrieve copies of any version of the file from the SCCS history.
□ Check out and lock a version of the file for editing, so that only you may make changes to it. SCCS prevents one user from unwittingly "clobbering" changes made by another.
□ Check in your updates to the file. When you check in a file, you can also supply comments that summarize your changes.
□ Back out changes made to your checked-out copy.
□ Inquire about the availability of a file for editing.
□ Inquire about differences between selected versions.
□ Display the version log summarizing the changes checked in so far.
The sccs Command

The Source Code Control System is composed of the sccs(1) command, which is a front end for the utility programs in the /usr/sccs directory. The SCCS utility programs are listed under Reference Tables, at the end of this chapter.

Initializing the SCCS History File: sccs create

The sccs create command places your file under SCCS control. It creates a new history file, and uses the complete text of your source file as the initial version. By default, the history file resides in the SCCS subdirectory; you may have to create this subdirectory if it is not already present:

The output from SCCS tells you the name of the "created" file, its version number (1.1), and the count of lines.

To prevent accidental loss of or damage to an original, sccs create makes a second link to it, prefixing the new filename with a comma (referred to as the "comma-file"). When the history file has been initialized successfully, SCCS retrieves a new, read-only version. Once you have verified the version against its comma-file, you can remove that file.

    hermes% cmp ,program.c program.c
    (no output means that the files match exactly)
    hermes% rm ,program.c

Do not try to edit the read-only version that SCCS retrieves. Before you can edit the file, you must check it out using the sccs edit command described below.

To distinguish the history file from a current version, SCCS uses the 's.' prefix. Owing to this prefix, the history file is often referred to as the s.file ("s-dot-file"). For historical reasons, it may also be referred to as the "SCCS-file." The format of an SCCS history file is described in sccsfile(5).

Basic sccs Subcommands

The following sccs subcommands perform the basic version-control functions. They are summarized here, and, except for create, are described in detail under sccs Subcommands, below.

create  Initialize the history file and first version, as described above.

edit    Check out a writable version (for editing). SCCS retrieves a writable copy with you as the owner, and places a lock on the history file so that no one else can check in changes.

delta   Check in your changes. This is the complement to the sccs edit operation. Before recording your changes, SCCS prompts for a comment, which it then stores in the history file's version log.

get     Retrieve a read-only copy of the file from the s.file. By default, this is the most recent version. While the retrieved version can be used as a source file for compilation, formatting, or display, it is not intended to be edited or changed in any way. (Attempting to bend the rules by changing permissions of a read-only version can result in your changes being lost.)

        If you give a directory as a filename argument, sccs attempts to perform the subcommand on each s.file in that directory. Thus, the command:

            sccs get SCCS

        retrieves a read-only version for every s.file in the SCCS subdirectory.

prt     Display the version log, including comments associated with each version.

Deltas and Versions

When you check in a version, SCCS records only the line-by-line differences between the text you check in and the previous version. This set of differences is known as a delta. The version that is retrieved by an edit or get is constructed from the accumulated deltas checked in so far.

The terms "delta" and "version" are often used synonymously. However, their meanings aren't exactly the same; it is possible to retrieve a version that omits selected deltas (see Excluding Deltas from a Retrieved Version, below).

SIDs

An SCCS delta ID, or SID, is the number used to represent a specific delta. This is a two-part number, with the parts separated by a dot (.). The SID of the initial delta is 1.1 by default. The first part of the SID is referred to as the release number, and the second, the level number. When you check in a delta, the level number is incremented automatically. The release number can be incremented as needed. SCCS also recognizes two additional fields for branch deltas (described under Branch Deltas, below).

Strictly speaking, an SID refers directly to a delta. However, it is often used to indicate the version constructed from a delta and its predecessors.

ID Keywords

SCCS recognizes and expands certain keywords in a source file, which you can use to include version-dependent information (such as the SID) in the text of the checked-in version. When the file is checked out for editing, ID keywords take the following form:

    %C%

where C is a capital letter. When you check in the file, SCCS replaces the keywords with the information they stand for. For example, %I% expands to the SID of the current version.

You would typically include ID keywords either in a comment or in a string definition. If you do not include at least one ID keyword in your source file, SCCS issues the diagnostic:

    No Id Keywords (cm7)

For more information about ID keywords, refer to Incorporating ID Keywords, below.

4.2. sccs Subcommands

Checking Files In and Out

The following subcommands are useful when retrieving versions or checking in changes.

Checking Out a File for Editing: sccs edit

To edit a source file, you must check it out first using sccs edit.

SCCS responds with the delta ID of the version just retrieved, and the delta ID it will assign when you check in your changes. You can then edit the file using a text editor. If a writable copy of the file is present, sccs edit issues an error message; it does not overwrite the file if anyone has write access to it.
Checking in a New Version: sccs delta

Having first checked out your file and completed your edits, you can check in the changes using sccs delta.

Checking a file in is also referred to as "making a delta." Before checking in your updates, SCCS prompts you for comments. These typically include a brief summary of your changes. You can extend the comment to an additional input line by preceding the NEWLINE with a backslash:

SCCS responds by noting the SID of the new version, and the numbers of lines inserted, deleted and unchanged. Changed lines count as lines deleted and inserted. SCCS removes the working copy. You can retrieve a read-only version using sccs get.

Think ahead before checking in a version. Making deltas after each minor edit can become excessive. On the other hand, leaving files checked out for so long that you forget about them can inconvenience others. Comments should be meaningful, since you may return to the file one day.

It is important to check in all changed files before compiling or installing a module for general use. A good technique is to edit the files you need, make all necessary changes and tests, compile and debug the files until you are satisfied, check them in, retrieve read-only copies with get, and then recompile the module.

Retrieving a Version: sccs get

To get the most recent version of a file, use the command:

    sccs get filename

For example:

    sccs get program.c

retrieves program.c, and reports the version number and the number of lines retrieved. The retrieved copy of program.c has permissions set to read-only. Do not change this copy of the file, since SCCS will not create a new delta unless the file has been checked out. If you force changes into the retrieved copy, you may lose them the next time someone performs an sccs get or an sccs edit on the file.

Reviewing Pending Changes: sccs diffs

Changes made to a checked-out version, but which are not yet checked in, are said to be pending. When editing a file, you can find out what your pending changes are using 'sccs diffs'. The diffs subcommand uses diff(1) to compare your working copy with the most recently checked-in version.

    hermes% sccs diffs program.c
    program.c
    37c37
    < if (((cmd_p - cmd) + 1) == l_lim) {
    ---
    > if (((cmd_p - cmd) - 1) == l_lim) {

Most of the options to diff can be used. To invoke the -c option to diff, use the '-C' argument to 'sccs diffs'.

Deleting Pending Changes: sccs unedit

sccs unedit backs out pending changes. This comes in handy if you damage the file while editing it and want to start over. unedit removes the checked-out version, unlocks the history file, and retrieves a read-only copy of the most recent version checked in. After using unedit, it is as if you hadn't checked out the file at all. To resume editing, use sccs edit to check the file out again. (See also Repairing a Writable Copy, below.)

Combining delta and get: sccs delget

sccs delget combines the actions of delta and get: it checks in your changes and then retrieves a read-only copy of the new version. However, if SCCS encounters an error during the delta, it does not perform the get. When processing a list of filenames, delget applies all the deltas it can, and if errors occur, omits all of the gets.

Combining delta and edit: sccs deledit

sccs deledit performs a delta followed by an edit. You can use this to check in a version and immediately resume editing.

Retrieving a Version by SID: sccs get -r

The -r option allows you to specify the SID to retrieve:

Retrieving a Version by Date and Time: sccs get -c

In some cases you don't know the SID of the delta you want, but you do know the date on (or before) which it was checked in. You can retrieve the latest version checked in before a given date and time using the -c option and a date-time argument of the form:

    -cyy[mm[dd[hh[mm[ss]]]]]

For example:

    sccs get -c880722120000 program.c

retrieves whatever version was current as of July 22, 1988 at 12:00 noon. Trailing fields can be omitted (defaulting to their highest legal value), and punctuation can be inserted in the obvious places; for example, the above line could be written as:

    sccs get -c"88/07/22 12:00:00" program.c

Repairing a Writable Copy: sccs get -k -G

Without checking out a new version, sccs get -k -Gfilename retrieves a writable copy of the text, and places it in the file specified by '-G'. This can be useful when you want to replace or repair a damaged working copy using diff and your favorite editor.

Incorporating Version-Dependent Information: ID Keywords

As mentioned above, SCCS allows you to include version-dependent information in a checked-in version through the use of ID keywords. These keywords, which you insert in the file, are automatically replaced by the corresponding information when you check in your changes. SCCS ID keywords take the form:

    %C%

where C is an upper case letter.

For instance, %I% expands to the SID of the most recent delta. %W% includes the filename, the SID, and the unique string @(#) into the file. This string is searched for by the what command in both text and binary files (allowing you to see which source versions a file or program was built from). The %G% keyword expands to the date of the latest delta. Other ID keywords and the strings they expand to are listed in the Identification Keywords table under Reference Tables at the end of this chapter.

To include version-dependent information in a C program, you can use a line like this:

    static char SccsId[] = "%W%\t%G%";

If the file were named program.c, this line would expand to the following when version 1.2 is retrieved:

    static char SccsId[] = "@(#)program.c 1.2 08/29/80";

Since the string is defined in the compiled program, this technique allows you to include source-file information within the compiled program, which the what command can report:

    hermes% cd /usr/ucb
    hermes% what sccs
    sccs:
            sccs.c 1.13 88/02/08 SMI

Defining a string in this way allows version information to be compiled into the C object file. If you use this technique to put ID keywords into header (.h) files, use a different variable in each header file. This prevents errors from attempts to redefine the (static) variables.

For shell scripts and the like, you can include ID keywords within comments:

If you check in a version containing expanded keywords, the version-dependent information will no longer be updated. To alert you to this situation, SCCS gives you the warning:

    No Id Keywords (cm7)

when a get, edit, or create finds no ID keywords.

Making Inquiries

The following subcommands are useful for inquiring about the status of a file or its history.

Seeing Which Version Has Been Retrieved: The what Command

Since SCCS allows you (or others) to retrieve any version in the file's history, there is no guarantee that a working copy present in the directory reflects the version you desire. The what command scans files for SCCS ID keywords. It also scans binary files for keywords, allowing you to see which source versions a program was compiled from.

    hermes% what program.c program
    program.c:
            program.c 1.1 88/07/05 SMI;
    program:
            program.c 1.1 88/07/05 SMI;

In this case, the file contains a working copy of version 1.1.

Determining the Most Recent Version: sccs get -g

To see the SID of the latest delta, you can use sccs get -g:

In this case, the most recent delta is 1.2. Since this is more recent than the version reflected by what in the example above, you would probably want to get the new version.

Determining Who Has a File Checked Out: sccs info

To find out what files are being edited, type:

    sccs info

This subcommand displays a list of all the files being edited, along with other information, such as the name of the user who checked the file out. Similarly, sccs check silently returns a non-zero exit status if anything is being edited. This can be used within a makefile to force make(1) to halt if it should find that a source file is checked out.

If you know that all the files that you have checked out are ready to be checked in, you can use:

    sccs delta `sccs tell -u`

to process them all. tell lists only the names of files being edited, one per line. With the -u option, tell reports only those files checked out to you. If you supply a username as an argument to -u, sccs tell reports only the files checked out to that user.

Displaying Delta Comments: sccs prt

sccs prt produces a listing of the version log, also referred to as the delta table, which includes the SID, time and date of creation, and the name of the user who checked in each version, along with the number of lines inserted, deleted, and unchanged, and the commentary:

    hermes% sccs prt program.c
    D 1.2 80/08/29 12:35:31 pers 2 1 00005/00003/00084
    corrected typo in widget(), null pointer in n_crunch()
    D 1.1 79/02/05 00:19:31 zeno 1 0 00087/00000/00000
    date and time created 80/06/10 00:19:31 by zeno

To display only the most recent entry, use the -y option.
Updating a Delta Comment: sccs cdc

If you forget to include something important in a comment, you can add the missing information using

    sccs cdc -rsid

The delta must be the most recent (or the most recent in its branch, see Branches, below), and you must either be the user who checked the delta in, or you must own and have permission to write on both the history file and the SCCS subdirectory. When you use cdc, SCCS prompts for your comments and inserts the new comment you supply:

    hermes% sccs cdc -r1.2 program.c
    comments? also taught get_in() to handle control chars

Displaying Delta Comments: sccs prt

The new commentary, as displayed by prt, looks like:

    hermes% sccs prt program.c
    D 1.2 80/08/29 12:35:31 pers 2 1 00005/00003/00084
    also taught get_in() to handle control chars
    *** CHANGED *** 88/08/02 14:54:45 pers
    corrected typo in widget(), null pointer in n_crunch()
    D 1.1 79/02/05 00:19:31 zeno 1 0 00087/00000/00000
    date and time created 80/06/10 00:19:31 by zeno

Comparing Checked-In Versions: sccs sccsdiff

To compare two checked-in versions, use:

    hermes% sccs sccsdiff -r1.1 -r1.2 program.c

to see the differences between delta 1.1 and delta 1.2. Most options to diff can be used. To invoke the -c option to diff, use the -C argument to sccsdiff. Instead of -r, you can use the -c date-time option to sccs.

Displaying the Entire History: sccs get -m -p

If you wish to see a listing of all changes made to the file and the delta in which each was made, you can use the -m and -p options to get:

    hermes% sccs get -m -p program.c
    1.2
    84 lines
    1.2     #define L_LEN 256
    1.1
    1.1     #include
    1.1

To find out what lines are associated with a particular delta, you can pipe the output through grep(1V):

    sccs get -m -p program.c | grep '^1.2'

You can also use -p by itself to send the retrieved version to the standard output, rather than to the file.

Creating Reports: sccs prs -d

You can use the prs subcommand with the -d dataspec option to derive reports about files under SCCS control. The dataspec argument offers a rich set of "data keywords" that correspond to portions of the history file. Data keywords take the form:

    :X:

and are listed in the Data Keywords table under Reference Tables at the end of this chapter. There is no limit on the number of times a data keyword may appear in the dataspec argument. A valid dataspec argument is a (quoted) string consisting of text and data keywords.

prs replaces each recognized keyword with the appropriate value from the history file. The format of a data keyword value is either simple, in which case the expanded value is a simple string, or multi-line, in which case the expansion includes RETURN characters. A TAB is specified by '\t' and a RETURN by '\n'. Here are some examples:

    hermes% sccs prs -d"Users and/or user IDs for :F: are:\n:UN:" program.c
    Users and/or user IDs for s.program.c are:
    zeno
    pers
    hermes% sccs prs -d"Newest delta for :M:: :I:. Created :D: by :P:." -r program.c
    Newest delta for program.c: 1.3. Created 88/07/22 by zeno.

Deleting Committed Changes

Replacing a Delta: sccs fix

From time to time a delta is checked in that contains small bugs, such as typos, that need correcting but that do not require entries in the file's audit trail. Or, perhaps the comment for a delta is incomplete or in error, even when the text is correct. In either case, you can make additional updates and replace the version log entry for the most recent delta using sccs fix:

    hermes% sccs fix -r1.2 program.c

This checks out version 1.2 of program.c.
When you check the file back in, the current changes will replace delta 1.2 in the history file, and SCCS will prompt for a (new) comment. You must supply an SID with -r. Also, the delta that is specified must be a leaf (most recent) delta.

Although the previously checked-in delta 1.2 is effectively deleted, SCCS retains a record of it, marked as deleted, in the history file. Before using sccs fix it is a good idea to make a copy of the current version, just in case.

Removing a Delta: sccs rmdel

To remove all traces of the most recent delta, you can use the rmdel subcommand. You must specify the SID using -r. In most cases, using fix is preferable to rmdel, since fix preserves a record of the "deleted" delta, while rmdel does not.²

² Refer to sccs-rmdel(1) for more information.

Reverting to an Earlier Version

To retrieve a writable copy of an earlier version, use get -k. This can come in handy when you need to backtrack past several deltas. To use an earlier delta as the basis for creating a new one:

□ Check out the file as you normally would (using sccs edit).

□ Retrieve a writable copy of an earlier "good" version (giving it a different filename) using get -k:

      sccs get -k -rsid -Goldname filename

  The -Gfilename option specifies the name of the newly retrieved version.

□ Replace the current version with the older "good" version:

      mv oldname filename

□ And finally, check the file back in.

In some cases, it may be simpler just to exclude certain deltas. Or, refer to Branch Deltas, below, for information on how to use SCCS to manage divergent sets of updates to a file.

Excluding Deltas from a Retrieved Version

Suppose that the changes that were made in delta 1.3 aren't applicable to the next version, 1.4.
When you retrieve the file for editing, you can use the -x option to exclude delta 1.3 from the working copy:

    hermes% sccs edit -x1.3 program.c

Now, when you check in delta 1.5, that delta will include the changes made in delta 1.4, but not those from delta 1.3. In fact, you can exclude a list of deltas by supplying a comma-separated list to -x, or a range of deltas, separated with a dash. For example, if you want to exclude 1.3 and 1.4, you could use:

    hermes% sccs edit -x1.3,1.4 program.c

or

    hermes% sccs edit -x1.3-1.4 program.c

If the range ends with a release number alone, as in -x1.3-1, SCCS excludes the range of deltas from 1.3 to the current highest delta in release 1.

In certain cases when using -x there will be conflicts between versions; for example, it may be necessary to both include and delete a particular line. If this happens, SCCS displays a message telling the range of lines affected. Examine these lines carefully to see if the version SCCS derived is correct.

Since each delta (in the sense of "a set of changes") can be excluded at will, it is most useful to include a related set of changes within each delta.

Combining Versions: sccs comb

The comb subcommand generates a Bourne shell script that, when run, constructs a new history file in which selected deltas are combined or eliminated. This can be useful when disk space is at a premium.

CAUTION  In combining several deltas, the comb-generated script destroys a portion of the file's version log, including comments.

The -p sid option indicates the oldest delta to preserve in the reconstruction. Another option, -c sid-list, allows you to specify a list of deltas to include. sid-list is a comma-separated list; you can specify a range between two SIDs by separating them with a dash in the list. -p and -c are mutually exclusive. The -o option attempts to minimize the number of deltas in the reconstruction.

The -s option produces a script that compares the size of the reconstruction with that of the original. The comparison is given as a percentage of the original the reconstruction would occupy, based on the number of blocks in each.
NOTE  When using comb, it is a good idea to keep a copy of the original history file on hand. While comb is intended to save disk space, it does not always do so. In some cases, the resulting history file may be larger than the original.

If no options are specified, comb preserves the minimum number of ancestors needed to preserve the changes made so far.

4.3. Version Control for Binary Files

Although SCCS is typically used for source files containing ASCII text, the SunOS version of SCCS allows you to apply version control to binary files as well (files that contain NULL or control characters, or do not end with a NEWLINE). The binary files are encoded³ into an ASCII representation when checked in; working copies are decoded when retrieved.

You can use SCCS to track changes to files such as icons, raster images, and screen fonts.

You can use sccs create -b to force SCCS to treat a file as a binary file. When you create or delta a binary file, you get the warning message:

    Not a text file (ad31)

You may also get the message:

    No id keywords (cm7)

These messages may safely be ignored. Otherwise, everything proceeds as expected:

    hermes% sccs create special.font
    special.font:
    Not a text file (ad31)
    No id keywords (cm7)
    1.1
    20 lines
    No id keywords (cm7)
    hermes% sccs get special.font
    1.1
    20 lines
    hermes% file special.font SCCS/s.special.font
    special.font:           vfont definition
    SCCS/s.special.font:    sccs

Use SCCS to control the updates to source files, and make to compile objects consistently.

Since the encoded representation of a binary file can vary significantly between versions, history files for binary sources can grow at a much faster rate than those for ASCII sources. However, using the same version control system for all source files makes dealing with them much easier.

³ See uuencode(1C) for details.
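The section above lists the conditions under which a file counts as binary. The POSIX shell sketch below approximates two of them: a NUL byte anywhere in the file, or a last byte that is not a NEWLINE. It is only an illustration; looks_binary is a hypothetical helper rather than an SCCS command, it makes no claim about how SCCS itself performs the test, and it does not look for other control characters.

```shell
#!/bin/sh
# looks_binary FILE (hypothetical helper for this sketch):
# succeed if FILE contains a NUL byte or does not end with a newline.
looks_binary() {
    total=$(($(wc -c < "$1")))
    # An empty file has nothing to encode.
    [ "$total" -eq 0 ] && return 1
    # Deleting NUL bytes shrinks the byte count only if some were present.
    kept=$(($(tr -d '\000' < "$1" | wc -c)))
    [ "$kept" -ne "$total" ] && return 0
    # Command substitution strips a trailing newline, so a non-empty
    # result here means the last byte is something other than a newline.
    last=$(tail -c 1 "$1")
    [ -n "$last" ] && return 0
    return 1
}

# Three small sample files, created only for the demonstration.
dir=$(mktemp -d)
printf 'plain text\n' > "$dir/text"
printf 'no newline at end' > "$dir/unterminated"
printf 'a\000b\n' > "$dir/nul"

for f in text unterminated nul; do
    if looks_binary "$dir/$f"; then
        echo "$f: would be treated as binary"
    else
        echo "$f: would be treated as text"
    fi
done
```

A check along these lines could be used to decide when to pass -b to sccs create explicitly, rather than relying on the warning messages.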
4.4. Maintaining Source Directories

When using SCCS, it is the history files, and not the working copies, that are the real source files.

Duplicate Source Directories

If you are working on a project and wish to create a duplicate set of sources for some private testing or debugging, you can make a symbolic link to the SCCS subdirectory in your private working directory:

    hermes% cd /private/working/cmd.dir
    hermes% ln -s /usr/src/cmd/SCCS SCCS

This makes it a simple matter to retrieve a private (duplicate) set of working copies of the source files using:

    sccs get SCCS

While working in the duplicate directory, you can also check files in and out, just as you could if you were in the original directory.

SCCS and make

SCCS is often used with make(1) to maintain a software project. The SunOS version of make provides for automatic retrieval of source files. (Other versions of make provide special rules that accomplish the same purpose.) It is also possible to retrieve earlier versions of all the source files, and to use make to rebuild earlier versions of the project:

    hermes% mkdir old.release ; cd old.release
    hermes% ln -s ../SCCS SCCS
    hermes% sccs get -c"87/10/01" SCCS
    SCCS/s.Makefile:
    1.3
    47 lines
    ...
    hermes% make
    ...

As a general rule, no one should check in source files while a build is in progress. When a project is about to be released, all files should be checked in before it is built. This ensures that the sources for a released project are stable.

Keeping SIDs Consistent Across Files

With some care, it is possible to keep the SIDs consistent across sources composed of multiple files. The trick here is to edit all the files at once. The changes can then be made to whatever files are necessary; check in all the files (even those not changed).
This can be done fairly easily by specifying the SCCS subdirectory as the filename argument to both edit and delta:

    hermes% sccs edit SCCS
    hermes% sccs delta SCCS

With the delta subcommand, you are prompted for comments only once; the comment is applied to all files being checked in. To determine which files have changed, you can compare the "lines added, deleted, unchanged" fields in each file's delta table.

Starting a New Release

To create a new release of a program, specify the release number you want to create when you check the file out for editing, using the -rn option to edit; n is the new release number:

    hermes% sccs edit -r2 program.c

In this case, when the new version is delta'ed, it will be the first level delta in release 2, with SID 2.1. To change the release number for all SCCS files in the directory, use:

    hermes% sccs edit -r2 SCCS

Temporary Files used by SCCS

When SCCS modifies an s.file (that is, a history file), it does so by writing to a temporary copy called an x.file. When the update is complete, SCCS uses the x.file to overwrite the old s.file. This ensures that the history file is not damaged when processing terminates abnormally. The x.file is created in the same directory as the history file, is given the same permissions, and is owned by the effective user.

To prevent simultaneous updates to an SCCS file, subcommands that update the history create a lock file, called a z.file, which contains the PID of the process performing the update. Once the update has completed, the z.file is removed. The z.file is created with mode 444 (read-only) in the directory containing the SCCS file, and is owned by the effective user.

4.5. Branches

You can think of the deltas applied to an SCCS file as the nodes of a tree; the root is the initial version of the file. The root delta (node) is number '1.1' by default, and successor deltas (nodes) are named '1.2', '1.3', and so forth. As noted earlier, these first two parts of the SID are the release and level numbers. The naming of a successor to a delta proceeds by incrementing the level number. You have also seen how to check out a new release when a major change to the file is made. The new release number applies to all successor deltas as well, unless you specify a new level in a prior release. Thus, the evolution of a particular file may be represented as follows:

Figure 4-1  Evolution of an SCCS File

We can call this structure the 'trunk' of the SCCS delta tree. It represents the normal sequential development of an SCCS file; changes that are part of any given delta depend upon all the preceding deltas.

However, situations can arise when it is convenient to create an alternate branch on the tree. For instance, consider a program which is in production use at version 1.3, and for which development work on release 2 is already in progress. Thus, release 2 may already have some deltas. Assume that a user reports a problem in version 1.3 which cannot wait until release 2 to be corrected. The changes necessary to correct the problem will have to be applied as a delta to version 1.3. This requires the creation of a new version, but one that is independent of the work being done for release 2. The new delta will thus occupy a node on a new branch of the tree.

The SID for a branch delta consists of four parts: the release and level numbers, and the branch and sequence numbers:

    release.level.branch.sequence

The branch number is assigned to each branch that is a descendant of a particular trunk delta; the first such branch is 1, the next one 2, and so on. The sequence number is assigned, in order, to each delta on a particular branch. Thus, 1.3.1.1 identifies the first delta of the first branch derived from delta 1.3, as shown below.

Figure 4-2  Tree Structure with Branch Deltas

The concept of branching may be extended to any delta in the tree; the naming of the resulting deltas proceeds in the manner just illustrated. The first two components of the name of a branch delta are always those of the ancestral trunk delta. The branch component is assigned in the order of creation on the branch, independent of its location relative to the trunk. Thus, a branch delta may always be identified as such from its name, and while the trunk delta may be identified from the branch delta's name, it is not possible to determine the entire path leading from the trunk delta to the branch delta.

For example, if delta 1.3 has one branch emanating from it, all deltas on that branch will be named '1.3.1.n'. If a delta on this branch then has another branch emanating from it, all deltas on the new branch will be named '1.3.2.n'. The only information that may be derived from the name of delta 1.3.2.2 is that it is the second chronological delta on the second chronological branch whose trunk ancestor is delta 1.3. In particular, it is not possible to determine from the name of delta 1.3.2.2 all of the deltas between it and its trunk ancestor (1.3).

Figure 4-3  Extending the Branching Concept

Branch deltas allow the generation of arbitrarily complex tree structures. It is best to keep the use of branches to a minimum.

Using Branches

You can use branches when you need to keep track of alternate versions developed in parallel, such as for bug fixes or experimental purposes.
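The release.level.branch.sequence naming described above can be sketched in a few lines of POSIX shell. The helper below, describe_sid, is hypothetical and exists only for this illustration. It reports what can be read from an SID and, as the text notes, nothing about the path between a branch delta and its trunk ancestor.

```shell
#!/bin/sh
# describe_sid SID (hypothetical helper): name the parts of a two- or
# four-part SID. The body runs in a subshell so that the change to IFS
# does not leak into the caller.
describe_sid() (
    IFS=.
    set -- $1        # split, for example, "1.3.2.2" into 1 3 2 2
    case $# in
        2) echo "trunk delta: release $1, level $2" ;;
        4) echo "branch delta: trunk ancestor $1.$2, branch $3, sequence $4" ;;
        *) echo "not a two- or four-part SID" >&2; exit 1 ;;
    esac
)

describe_sid 1.3
describe_sid 1.3.1.1
describe_sid 1.3.2.2
```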
Before you can create a branch, you must enable the "branch" flag in the history file using the sccs admin command, as follows:

    hermes% sccs admin -fb program.c

The -fb option sets the b (branch) flag in the history file.

Creating a Branch Delta

To create a branch from delta 1.3 for program.c, you would use the sccs edit subcommand shown below:

    hermes% sccs edit -r1.3 -b program.c

When you check in your edited version, the branch delta will have SID 1.3.1.1. Subsequent deltas made from this branch will be numbered 1.3.1.2, and so on.

Retrieving Versions From Branch Deltas

Branch deltas usually aren't included in the version retrieved by get. To retrieve a branch version (the version associated with a branch delta), you must specifically request it with the -r option. If you omit the sequence number, as in the next example, SCCS retrieves the highest delta in the branch:

    hermes% sccs get -r1.3.1 program.c
    1.3.1.1
    87 lines

4.6. Administering SCCS Files

By convention, history files and all temporary SCCS files reside in the SCCS subdirectory. In addition to the standard file protection mechanisms, SCCS allows certain releases to be frozen, and access to releases to be restricted to certain users (see sccs-admin(1) for details). History files normally have permissions set to 444 (read-only for everyone), to prevent modification by utilities other than SCCS. In general, it is not a good idea to edit the history files.

A history file should have just one link. SCCS utilities update the history file by writing out a modified copy (x.file), and then renaming the copy.

Interpreting Error Messages: sccs help

The help subcommand displays information about SCCS error messages and utilities.

help normally expects either the name of an SCCS utility, or the code (in parentheses) from an SCCS error message. If you supply no argument, help prompts for one.
The directory /usr/lib/help contains files with the text of the various messages help displays.

Altering History File Defaults: sccs admin

There are a number of parameters that can be set using the admin command. The most interesting of these are flags. Flags can be added by using the -f option. For example:

    hermes% sccs admin -fd1 program.c

sets the 'd' flag to the value '1'. This flag can be deleted by using:

    hermes% sccs admin -dd program.c

The most useful flags are:

b        Allow branches to be made using the -b option to sccs edit (see Branches, above).

dSID     Default SID to be used on an sccs get or sccs edit. If this is just a release number, it constrains the version to a particular release only.

i        Give a fatal error if there are no ID keywords in a file. This prevents a version from being checked in when the ID keywords are missing or expanded by mistake.

t        The value of this flag replaces the %Y% ID keyword.

-t file  Store descriptive text from file in the s.file. This descriptive text might be the documentation or a design and implementation document. Using the -t option ensures that if the s.file is passed on to someone else, the documentation will go along with it. If file is omitted, the descriptive text is deleted. To see the descriptive text, use prt -t.

The sccs admin command can be used safely any number of times on files. A current version need not be retrieved for admin to work.

Validating the History File

You can use the val subcommand to check certain assertions about a history file. val always checks for the following conditions:

□ A corrupted history file.

□ The history file can't be opened for reading, or the file is not an SCCS history.

If you use the -r option, val checks to see if the indicated SID exists.

Restoring the History File

In particularly bad circumstances, the history file itself may get corrupted.
The most common way this happens is for someone to edit it. Since the file contains a checksum, you will get errors every time you read a corrupted file. To correct the checksum, use:

    hermes% sccs admin -z program.c

CAUTION  When SCCS says that the history file is corrupted, it may indicate serious damage beyond an incorrect checksum. Be careful to safeguard your current changes before attempting to correct a history file.

4.7. Reference Tables

Table 4-1  SCCS ID Keywords

    Keyword   Expands to
    %Z%       @(#) (search string for the what command)
    %M%       The current module (file) name
    %I%       The highest SID applied
    %W%       Shorthand for: %Z%%M% TAB %I%
    %G%       The date of the delta corresponding to the %I% keyword
    %R%       The current release number
    %Y%       The value of the t flag (set by sccs admin)

Table 4-2  SCCS Utility Commands

    Command    Refer to:
    admin      sccs-admin(1)
    cdc        sccs-cdc(1)
    comb       sccs-comb(1)
    delta      sccs-delta(1)
    get        sccs-get(1)
    help       sccs-help(1)
    prs        sccs-prs(1)
    rmdel      sccs-rmdel(1)
    sact       sccs-sact(1)
    sccsdiff   sccs-sccsdiff(1)
    unget      sccs-unget(1)
    val        sccs-val(1)
    what       what(1)

what is a general-purpose command.
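The %Z% keyword in Table 4-1 expands to @(#), the marker that what(1) searches for. The POSIX shell sketch below shows the idea with a rough stand-in for that search: it prints whatever follows @(#) in a file. find_ident is a hypothetical helper written for this example. It is not the real utility, and it does not reproduce what's exact rules for where an identification string ends.

```shell
#!/bin/sh
# find_ident FILE (hypothetical helper, not what(1) itself):
# turn every non-printable byte into a newline, then print the text
# that follows "@(#)" on each resulting line.
find_ident() {
    LC_ALL=C tr -c '[:print:]\n' '\n' < "$1" | sed -n 's/.*@(#)//p'
}

# A sample "object file": arbitrary bytes around an expanded
# keyword string (sample data for this sketch).
sample=$(mktemp)
printf 'junk\000@(#)program.c 1.2\000more junk' > "$sample"

found=$(find_ident "$sample")
echo "$found"
rm -f "$sample"
```

Because the marker survives compilation when it is stored in a string constant, the same search works on object files and executables as well as on source files.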
Table 4-3  Data Keywords for prs -d

    Keyword  Data Item                      File Section  Value                Format
    :Dt:     Delta information              Delta Table   see below *          S
    :DL:     Delta line statistics          "             :Li:/:Ld:/:Lu:       S
    :Li:     Lines inserted by Delta        "             nnnnn                S
    :Ld:     Lines deleted by Delta         "             nnnnn                S
    :Lu:     Lines unchanged by Delta       "             nnnnn                S
    :DT:     Delta type                     "             D or R               S
    :I:      SCCS ID string (SID)           "             :R:.:L:.:B:.:S:      S
    :R:      Release number                 "             nnnn                 S
    :L:      Level number                   "             nnnn                 S
    :B:      Branch number                  "             nnnn                 S
    :S:      Sequence number                "             nnnn                 S
    :D:      Date Delta created             "             :Dy:/:Dm:/:Dd:       S
    :Dy:     Year Delta created             "             nn                   S
    :Dm:     Month Delta created            "             nn                   S
    :Dd:     Day Delta created              "             nn                   S
    :T:      Time Delta created             "             :Th:::Tm:::Ts:       S
    :Th:     Hour Delta created             "             nn                   S
    :Tm:     Minutes Delta created          "             nn                   S
    :Ts:     Seconds Delta created          "             nn                   S
    :P:      Programmer who created Delta   "             logname              S
    :DS:     Delta sequence number          "             nnnn                 S
    :DP:     Predecessor Delta seq-no.      "             nnnn                 S
    :DI:     Sequence number of deltas      "             :Dn:/:Dx:/:Dg:       S
    :Dn:     Deltas included (seq #)        "             :DS: :DS: ...        S
    :Dx:     Deltas excluded (seq #)        "             :DS: :DS: ...        S
    :Dg:     Deltas ignored (seq #)         "             :DS: :DS: ...        S
    :MR:     MR numbers for delta           "             text                 M
    :C:      Comments for delta             "             text                 M
    :UN:     User names                     User Names    text                 M
    :FL:     Flag list                      Flags         text                 M
    :Y:      Module type flag               "             text                 S
    :MF:     MR validation flag             "             yes or no            S
    :MP:     MR validation pgm name         "             text                 S
    :KF:     Keyword error/warning flag     "             yes or no            S
    :BF:     Branch flag                    "             yes or no            S
    :J:      Joint edit flag                "             yes or no            S
    :LK:     Locked releases                "             :R: ...              S
    :Q:      User defined keyword           "             text                 S
    :M:      Module name                    "             text                 S
    :FB:     Floor boundary                 "             :R:                  S
    :CB:     Ceiling boundary               "             :R:                  S
    :Ds:     Default SID                    "             :I:                  S
    :ND:     Null delta flag                "             yes or no            S
    :FD:     File descriptive text          Comments      text                 M
    :BD:     Body                           Body          text                 M
    :GB:     Gotten body                    "             text                 M
    :W:      A form of what(1) string       N/A           :Z::M:\t:I:          S
    :A:      A form of what(1) string       N/A           :Z::Y: :M: :I::Z:    S
    :Z:      what(1) string delimiter       N/A           @(#)                 S
    :F:      SCCS file name                 N/A           text                 S
    :PN:     SCCS file path name            N/A           text                 S

    S = single-line format, M = multi-line
    * :Dt: = :DT: :I: :D: :T: :P: :DS: :DP:

5  make User's Guide

5.1. Overview

This chapter describes Sun's version of the make utility, which includes important features such as hidden dependency checking, command dependency checking, pattern-matching rules, and automatic retrieval of SCCS files. This version can run successfully with makefiles written for previous versions of make. However, makefiles that rely on Sun's enhancements may not be compatible with other versions of this utility. Refer to Appendix A, make Enhancements Summary, for a complete summary of Sun's enhancements and compatibility issues.

make streamlines the process of generating and maintaining object files and executable programs. It helps you to compile programs consistently, and eliminates unnecessary recompilation of modules that are unaffected by source code changes.

make provides a number of features that simplify compilations, but you can also use it to automate any complicated or repetitive task that isn't interactive. You can use make to update and maintain object libraries, to run test suites, and to install files onto a filesystem or tape. In conjunction with SCCS, you can use make to ensure that a large software project is built from the desired versions in an entire hierarchy of source files.
make reads a file that you create, called a makefile, which contains information about what files to build and how to build them. Once you write and test the makefile, you can forget about the processing details; make takes care of them. This gives you more time to concentrate on improving your code; the repetitive portion of the maintenance cycle is reduced to:

    think - edit - make - test . . .

Dependency Checking: make vs. Shell Scripts

While it is possible to use a shell script to assure consistency in trivial cases, scripts to build software projects are often inadequate. On the one hand, you don't want to wait for a simple-minded script to compile every single program or object module when only one of them has changed. On the other hand, having to edit the script for each iteration can defeat the goal of consistency. Although it is possible to write a script of sufficient complexity to recompile only those modules that require it, make does this job better.

make allows you to write a simple, structured listing of what to build and how to build it. It uses the mechanism of dependency checking to compare each module with the source or intermediate files it derives from. make only rebuilds a module if one or more of these prerequisite files, called dependency files, has changed since the module was last built.

To determine whether a derived file is out of date with respect to its sources, make compares the modification time of the (existing) module with that of its dependency file. If the module is missing, or if it is older than the dependency file, make considers it to be out of date, and issues the commands necessary to rebuild it. Optionally, a module can be treated as out of date if the commands used to build it have changed.

Because make does a complete dependency scan, changes to a source file are consistently propagated through any number of intermediate files or processing steps.
This lets you specify a hierarchy of steps in a top-down fashion.

You can think of a makefile as a recipe. make reads the recipe, decides which steps need to be performed, and executes only those steps that are required to produce the finished module. Each file to build, or step to perform, is called a target. The makefile entry for a target contains its name, a list of targets on which it depends, and a list of commands for building it.

The list of commands is called a rule. make treats dependencies as prerequisite targets, and updates them (if necessary) before processing its current target. The rule for a target need not always produce a file, but if it does, the file for which the target is named is referred to as the target file. Each file from which a target is derived (that is, each file that the target depends on) is called a dependency file.

If the rule for a target produces no file by that name, make performs the rule and considers the target to be up to date for the remainder of the run.

make assumes that only it will make changes to files being processed during the current run. If a source file is changed by another process while make is running, the files it produces may be in an inconsistent state.

Writing a Simple Makefile

The basic format for a makefile target entry is:

Figure 5-1  Makefile Target Entry Format

    target ... : [ dependency ... ]
            [ command ]
            ...

If there is no rule for a target entry, make looks for an implicit rule to use.

If the dependency list is terminated with a semicolon and followed by a command, that command is included in the rule. However, makefiles tend to read better if you avoid this.

In the first line, the list of target names is terminated by a colon. This, in turn, is followed by the dependency list if there is one. If several targets are listed, this indicates that each such target is to be built independently using the rule supplied.
Subsequent lines that start with a TAB are taken as the command lines that comprise the target's rule. A common error is to use SPACE characters instead of the leading TAB.

Lines that start with a # are treated as comments up until the next (unescaped) NEWLINE, and do not terminate the target entry. The target entry is terminated by the next nonempty line that begins with a character other than TAB or #, or by the end of the file.

A trivial makefile might consist of just one target:

Figure 5-2  A Trivial Makefile

    test:
            ls test
            touch test

The convention is to use the name Makefile, since filenames starting with a capital are listed first by ls; this highlights the fact that a makefile is present.

When you run make with no arguments, it searches first for a file named makefile, or if there is no file by that name, Makefile. If either of these files is under SCCS control, make checks the makefile against its history file. If it is out of date, make extracts the latest version.

If make finds a makefile, it begins the dependency check with the first target entry in that file. Otherwise you must list the targets to build as arguments on the command line. make displays each command it runs while building its targets.

    hermes% make
    ls test
    test not found
    touch test
    hermes% ls test
    test

Because the file test was not present (and therefore out of date), make performed the rule in its target entry. If you run make a second time, it issues a message indicating that the target is now up to date:

    hermes% make
    'test' is up to date.

and skips the rule.

make invokes a Bourne shell to process a command line if that line contains any shell metacharacters, such as a semicolon (;), redirection symbols (<, >, >>, |), substitution symbols (*, ?, [], $, =), or quotes, escapes or comments (", ', `, \, #, and so on). If a shell isn't required to parse the command line, make exec()'s the command directly.

Line breaks within a rule are significant in that each command line is performed by a separate process or shell. This means that a rule such as:

    test:
            cd /tmp
            pwd

behaves differently than you might expect, as shown below.

    hermes% make test
    cd /tmp
    pwd
    /usr/tutorial/waite/arcana/minor/pentangles

You can use semicolons to specify a sequence of commands to perform in a single shell invocation:

    test:
            cd /tmp ; pwd

Or, you can continue the input line onto the next line in the makefile by escaping the NEWLINE with a backslash (\). The escaped NEWLINE is treated as white space by make. The backslash must be the last character on the line. The semicolon is required by the shell.

    test:
            cd /tmp ; \
            pwd

Basic Use of Implicit Rules

When there is no rule given for a specified target, make attempts to use an implicit rule to build it. When make finds a rule for the class of files the target belongs to, it applies the rule listed in the implicit rule's target entry.

In addition to any makefile(s) that you supply, make reads in the default makefile, /usr/include/make/default.mk, which contains the target entries for a number of implicit rules, along with other information.⁴

There are two types of implicit rules. Suffix rules specify a set of commands for building a file with one suffix from another file with the same basename but a different suffix. Pattern-matching rules select a rule based on a target and dependency that match respective wild-card patterns. The implicit rules provided by default are suffix rules.

In some cases, the use of suffix rules can eliminate the need for writing a makefile entirely. For instance, to build an object file named functions.o from a single C source file named functions.c, you could use the command:

    hermes% make functions.o
    cc -sun4 -c functions.c -o functions.o

This would work equally well for building the object file nonesuch.o from the source file nonesuch.c.

⁴ Implicit rules were hard-coded in earlier versions of make.

To build an executable file named functions (with a null suffix) from functions.c, you need only type the command:

    hermes% make functions
    cc -sun4 -o functions functions.c

The rule for building a .o file from a .c file is called the .c.o (pronounced "dot-see-dot-oh") suffix rule. The rule for building an executable program from a .c file is called the .c ("dot-see") rule. The complete set of default suffix rules is listed in Table 5-1.

Processing Dependencies

Once make begins, it processes targets as it encounters them in its depth-first dependency scan. For example, with the following makefile:

    batch: a b
            touch batch
    b:
            touch b
    a:
            touch a
    c:
            echo "you won't see me"

make starts with the target batch. Since batch has some dependencies that haven't been checked yet, namely a and b, make defers batch until after it has checked them against any dependencies they might have.

Since a has no dependencies, make processes it; if the file is not present, make performs the rule in its target entry.

    hermes% make
    touch a
    ...

Next, make works its way back up to the parent target batch. Since there is still an unchecked dependency b, make descends to b and checks it. b also has no dependencies, so make performs its rule:

    ...
    touch b
    ...

Finally, now that all of the dependencies for batch have been checked and built (if needed), make checks batch.
Since it rebuilt at least one of the dependencies for batch, make assumes that batch is out of date and rebuilds it; if a or b had not been built in the current make run, but were present in the directory and newer than batch, make’s timestamp comparison would also result in batch being rebuilt:

    touch batch

Target entries that aren’t encountered in a dependency scan are not processed. Although there is a target entry for c in the makefile, make does not encounter it while performing the dependency scan for batch, so its rule is not performed. You can select an alternate starting target like c by entering it as an argument to the make command.

In the next example, the batch target produces no file. Instead, it is used as a label to group a set of targets.

    batch: a b c
    a: a1 a2
            touch a
    b:
            touch b
    c:
            touch c
    a1:
            touch a1
    a2:
            touch a2

In this case, the targets are checked and processed as shown in the following diagram:

□ make checks batch for dependencies and notes that there are three, and so defers it.
□ make checks a, the first dependency, and notes that it has two dependencies of its own. So, continuing in like fashion, make:
    1. Checks a1, and if necessary, rebuilds it.
    2. Checks a2, and if necessary, rebuilds it.
    3. Determines whether to build a.
    4. Checks b and rebuilds it if need be.
    5. Checks and rebuilds c if needed.
    6. After traversing its dependency tree, make checks and processes the topmost target, batch. If batch contained a rule, make would perform that rule. Since batch has no rule, make performs no action, but notes that batch has been rebuilt; any targets depending on batch would also be rebuilt.
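The order of the scan described above can be checked without building anything by running make with the -n option, which prints the commands instead of performing them. What follows is a minimal sketch rather than one of the manual's own examples: it assumes a Bourne-compatible shell with mktemp and a make that accepts this portable makefile (GNU make, for instance). The scratch directory and the helper function named rule are hypothetical conveniences.

```shell
# Sketch only: replay the batch/a/b/c example in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# rule TARGET DEPENDENCIES: append a target entry whose rule touches TARGET.
# (Hypothetical helper; printf supplies the TAB that starts the rule line.)
rule() {
    printf '%s: %s\n\ttouch %s\n' "$1" "$2" "$1" >> Makefile
}

printf 'batch: a b c\n' > Makefile    # batch itself has no rule
rule a 'a1 a2'
rule b ''
rule c ''
rule a1 ''
rule a2 ''

# -n lists the commands in the order make would run them; nothing is built.
order=$(make -n)
printf '%s\n' "$order"
```

With none of the files present, the commands should appear in the order a1, a2, a, b, c, and nothing is printed for batch because it has no rule. Other make implementations may word their output differently.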
Null Rules

If a target entry contains no rule, make attempts to select an implicit rule to build it. If make cannot find an appropriate implicit rule and there is no SCCS history from which to retrieve it, make concludes that the target has no corresponding file, and regards the missing rule as a null rule.

You can use a dependency with a null rule to force the target’s rule to be executed. The conventional name for such a dependency is FORCE.

With this makefile:

    haste: FORCE
            echo "haste makes waste"
    FORCE:

make performs the rule for making haste, even if a file by that name is up to date:

    hermes% touch haste
    hermes% make haste
    echo "haste makes waste"
    haste makes waste

Unknown Targets

If a target is named either on the command line or in a dependency list, and it:

□ is not a file present in the working directory,
□ has no target or dependency entry,
□ does not belong to a class of files for which an implicit rule is defined,
□ has no SCCS history file, and
□ is not covered by a rule specified for the .DEFAULT special target

make stops processing and issues an error message.[5]

    hermes% make believe
    make: Fatal error: Don't know how to make target 'believe'.

Running Commands Silently

You can inhibit the display of a command line within a rule by inserting an @ as the first non-TAB character on that line. For example, the following target:

    quiet:
            @echo you only see me once

produces:

    hermes% make quiet
    you only see me once

[5] However, if the -k option is in effect, make will continue with other targets that do not depend on the one in which the error occurred.

Special-function targets begin with a dot (.). Target names that begin with a dot are never used as the starting target, unless specifically requested as an argument on the command line.
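The FORCE idiom and the @ prefix described above can be tried together. This is a sketch under stated assumptions, not a transcript from SunOS: it assumes a Bourne-compatible shell with mktemp and a make that treats a nonexistent target with no rule and no dependencies as always rebuilt (GNU make documents this behavior). The scratch directory is a hypothetical convenience.

```shell
# Sketch only: a FORCE dependency makes the rule for haste run even though
# a file named haste already exists and would otherwise be up to date.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# The @ keeps make from displaying the echo command line itself.
printf 'haste: FORCE\n\t@echo "haste makes waste"\nFORCE:\n' > Makefile

touch haste                  # the target file is present
output=$(make haste)         # the rule is performed anyway
printf '%s\n' "$output"
```

Because of the @, only the output of echo is displayed; removing it would show the command line as well, as in the example under Null Rules.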
If you want to inhibit the display of commands during a particular make run, you can use the -s option. If you want to inhibit the display of all command lines in every run, add the special target .SILENT to your makefile:

    .SILENT:
    quiet:
            echo you only see me once

Ignoring a Command’s Exit Status

make normally issues an error message and stops when a command returns a nonzero exit code. For example, if you have the target:

    rmxyz:
            rm xyz

and there is no file named xyz, make halts after rm returns its exit status.

    hermes% ls xyz
    xyz not found
    hermes% make rmxyz
    rm xyz
    rm: xyz: No such file or directory
    *** Error code 1
    make: Fatal error: Command failed for target 'rmxyz'

To continue processing regardless of the command’s exit code, use a dash character (-) as the first non-TAB character:

    rmxyz:
            -rm xyz

If - and @ are the first two such characters, both take effect.

In this case you get a warning message indicating the exit code make received:

    hermes% make rmxyz
    rm xyz
    rm: xyz: No such file or directory
    *** Error code 1 (ignored)

Although it is generally ill-advised to do so, you can have make ignore error codes entirely with the -i option. You can also have make ignore exit codes when processing a given makefile, by including the .IGNORE special target, though this too should be avoided.

Unless you are testing a makefile, it is usually a bad idea to ignore nonzero error codes on a global basis.

If you are processing a list of targets, and you want make to continue with the next target on the list rather than stopping entirely after encountering a nonzero return code, use the -k option.

Automatic Retrieval of SCCS Files

When source files are named in the dependency list, make treats them just like any other target. Because the source file is presumed to be present in the directory, there is no need to add an entry for it to the makefile. When a target has no dependencies, but is present in the directory, make assumes that that file is up to date. If, however, a source file is under SCCS control, make does some additional checking to assure that the source file is up to date. If the file is missing, or if the history file is newer, make automatically issues an sccs get -s filename -Gfilename command to retrieve the most recent version.[6] However, if the source file is writable by anyone, make does not retrieve a new version.

    hermes% ls SCCS/*
    SCCS/s.functions.c
    hermes% rm -f functions.c
    hermes% make functions
    sccs get -s functions.c -Gfunctions.c
    cc -sun4 -o functions functions.c

make only checks the timestamp of the retrieved version against the timestamp of the history file. It does not check to see if the version present in the directory is the most recently checked-in version. So, if someone has done a get by date (sccs get -c), make would not discover this fact, and you might unwittingly build an older version of the program or object file. To be absolutely sure that you are compiling the latest version, you can precede make with an “sccs get SCCS” or an “sccs clean” command.

Suppressing SCCS Retrieval

The command for retrieving SCCS files is specified in the rule for the .SCCS_GET special target in the default makefile. To suppress automatic retrieval, simply add an entry for this target with an empty rule to your makefile:

    # Suppress sccs retrieval.
    .SCCS_GET:

Passing Parameters: Simple make Macros

make’s macro substitution comes in handy when you want to pass parameters to command lines within a makefile. Suppose that you sometimes wish to compile an optimized version of the program using cc’s -O option.
You can lend this sort of flexibility to your makefile by adding a macro reference, such as the one below, to the target for functions:

    functions: functions.c
            cc -sun4 $(CFLAGS) -o functions functions.c

[6] With other versions of make automatic SCCS retrieval was a feature only of certain implicit rules. Also, unlike earlier versions, make only looks for history (s.) files in the SCCS subdirectory; history files in the current directory are ignored.

The macro reference acts as a placeholder for a value that you define, either in the makefile itself, or as an argument to the make command. If you then supply make with a definition for the CFLAGS macro, make replaces its references with the value you have defined.

    hermes% rm functions
    hermes% make functions "CFLAGS= -O"
    cc -sun4 -O -o functions functions.c

There is a reference to the CFLAGS macro in both the .c and the .c.o implicit rules.

The command-line definition must be a single argument, hence the quotes in this example.

If a macro is undefined, make expands its references to an empty string.

You can also include macro definitions in the makefile itself. A typical use is to set CFLAGS to -O, so that make produces optimized object code by default:

    CFLAGS= -O
    functions: functions.c
            cc -sun4 $(CFLAGS) -o functions functions.c

A macro definition supplied as a command line argument to make overrides other definitions in the makefile.[7] For instance, to compile functions for debugging with dbx or dbxtool, you can define the value of CFLAGS to be -g on the command line:

    hermes% rm functions
    hermes% make CFLAGS=-g
    cc -sun4 -g -o functions functions.c

To compile a profiling variant for use with gprof, supply both -O and -pg in the value for CFLAGS.

A macro reference must include parentheses when the name of the macro is longer than one character.
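The precedence of a command-line definition over one in the makefile can be observed without a compiler. The sketch below is illustrative only: it assumes a Bourne-compatible shell with mktemp and any make that follows the override behavior described above (GNU make does). The target named show is hypothetical; it echoes the command line that would be used rather than invoking cc.

```shell
# Sketch only: compare the makefile's default CFLAGS with a command-line override.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# 'show' (hypothetical target) prints the compilation command instead of running it.
printf 'CFLAGS= -O\nshow:\n\t@echo cc $(CFLAGS) -o functions functions.c\n' > Makefile

default_line=$(make show)            # uses CFLAGS= -O from the makefile
debug_line=$(make show CFLAGS=-g)    # the command-line definition wins

printf '%s\n%s\n' "$default_line" "$debug_line"
```

The first line should carry -O and the second -g, mirroring the functions example above.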
If the macro name is only one character, the parentheses can be omitted. You can use curly braces, { and }, instead of parentheses. For example, ‘$X’, ‘$(X)’, and ‘${X}’ are equivalent.

Command Dependency Checking and .KEEP_STATE

In addition to the normal dependency checking, you can use the special target .KEEP_STATE to activate command dependency checking.[8] When activated, make not only checks each target file against its dependency files, it compares each command line in the rule with those it ran the last time the target was built. This information is stored in a state file in the working directory.

[7] Conditionally defined macros are an exception to this. Refer to Conditional Macro Definitions for details.
[8] This feature is not available in earlier versions of make.

With the makefile:

    CFLAGS= -O
    .KEEP_STATE:
    functions: functions.c
            cc -sun4 $(CFLAGS) -o functions functions.c

the following commands work as shown:

    hermes% make
    cc -sun4 -O -o functions functions.c
    hermes% make CFLAGS=-g
    cc -sun4 -g -o functions functions.c
    hermes% make "CFLAGS= -O -pg"
    cc -sun4 -O -pg -o functions functions.c

This assures you that make compiles a program with the options you want, even if a different variant is present and otherwise up to date.

The first make run with .KEEP_STATE in effect recompiles all targets. The KEEP_STATE variable, when imported from the environment, has the same effect as the .KEEP_STATE target.

Suppressing or Forcing Command Dependency Checking for Selected Lines

To suppress command dependency checking for a given command line, insert a question mark as the first character after the TAB.

Command dependency checking is automatically suppressed for lines containing the dynamic macro $?. This macro stands for the list of dependencies that are newer than the current target, and can be expected to differ between any two make runs.[9] To force make to perform command dependency checking on a line containing this macro, prefix the command line with a ! character (following the TAB).

The State File

When .KEEP_STATE is in effect, make writes out a state file named .make.state, in the current directory. This file lists all targets that have ever been processed while .KEEP_STATE has been in effect, along with the rules to build them, in makefile format. In order to assure that this state file is maintained consistently, once you have added .KEEP_STATE to a makefile, we recommend that you leave it in effect.[10]

[9] See Implicit Rules and Dynamic Macros for more information.
[10] Since this target is ignored in earlier versions of make, it does not introduce any compatibility problems. Other versions simply treat it as a superfluous target that no targets depend on, with an empty rule and no dependencies of its own. Since it starts with a dot, it is not used as the starting target.

Hidden Dependencies and .KEEP_STATE

When a C source file contains #include directives for interpolating headers, the target depends just as much on those headers as it does on the sources that include them. Because such headers may not be listed explicitly as sources in the compilation command line, they are called hidden dependencies. When .KEEP_STATE is in effect, make receives a report from the various compilers and compilation preprocessors indicating which hidden dependency files were interpolated for each target.[11] It adds this information to the dependency list in the state file. In subsequent runs, these additional dependencies are processed just like regular dependencies.
This feature maintains the hidden dependency list for each target automatically; it ensures that the dependency list for each target is always accurate and up to date. It also eliminates the need for the complicated schemes found in some earlier makefiles to generate complete dependency lists.

A slight inconvenience can arise the first time make processes a target with hidden dependencies, because there is as yet no record of them in the state file. If a header is missing, and make has no record of it, make won’t know that it needs to retrieve it from SCCS before compiling the target. So, even though there is an SCCS history file, the current version won’t be retrieved because it doesn’t yet appear in a dependency list or the state file. So, when the C preprocessor attempts to interpolate the header, it won’t find it; the compilation fails.

Suppose that an #include directive for interpolating the header hidden.h is added to functions.c, and that the file hidden.h is somehow removed before the subsequent make run. The results would be:

    hermes% rm -f hidden.h
    hermes% make functions
    cc -sun4 -O -o functions functions.c
    functions.c: 2: Can't find include file hidden.h
    make: Fatal error: Command failed for target 'functions'

A simple workaround might be to make sure that the new header is extant before you run make. Or, if the compilation should fail (and assuming the header is under SCCS), you could retrieve it from SCCS manually:

    hermes% sccs get hidden.h
    1.1
    10 lines
    hermes% make functions
    cc -sun4 -O -o functions functions.c

In all future cases, should the header turn up missing, make will know to build or retrieve it for you, because it will be listed in the state file as a hidden dependency.

[11] Also unavailable with earlier versions of make.

Note that with hidden dependency checking, the $? macro includes the names of hidden dependency files.
This may cause unexpected behavior in existing makefiles that rely on $?.

Hidden Dependencies and .INIT

The problem with both of these approaches is that the first make in the local directory may fail due to a random condition in some other (include) directory. This might entail forcing someone to monitor a (first) build. To avoid this, you can use the .INIT target to retrieve known hidden dependency files from SCCS. .INIT is a special target that, along with its dependencies, is built at the start of the make run. To be sure that hidden.h is present, you could add the following line to your makefile:

    .INIT: hidden.h

Displaying Information About a make Run

Running make with the -n option displays the commands make is to perform, without executing them. This comes in handy when verifying that the macros in a makefile are expanded as expected. With the following makefile:

make -n displays:

There is an exception to this, however. make executes any command line containing a reference to the MAKE macro (i.e., $(MAKE) or ${MAKE}), regardless of -n. So, it would be a very bad idea to include a line like “$(MAKE) ; rm -f *” in your makefile.

make has some other options that you can use to keep abreast of what it’s doing and why:

-d    Displays the criteria by which make determines that a target is out of date. Unlike -n, it does process targets, as shown below. This option also displays the value imported from the environment (null by default) for the MAKEFLAGS macro, which is described in detail in a later section.

      Setting an environment variable named MAKEFLAGS can lead to complications, since make adds its value to the list of options. To prevent puzzling surprises, avoid setting this variable.

          hermes% make -d
          MAKEFLAGS value:
          Building main.o using suffix rule for .c.o because it is out of date relative to main.c:
          cc -O -mc68020 -c main.c
          Building functions because it is out of date relative to main.o
          Building data.o using suffix rule for .c.o because it is out of date relative to data.c:
          cc -O -mc68020 -c data.c
          Building functions because it is out of date relative to data.o
          cc -O -mc68020 -o functions main.o data.o

-dd   This option displays all dependencies make checks, including any hidden dependencies, in vast detail.

-D    Displays the text of the makefile as it is read.

-DD   Displays the makefile and the default makefile, the state file, and hidden dependency reports for the current make run.

-f makefile
      make uses the named makefile (instead of makefile or Makefile). Several -f options indicate the concatenation of the named makefiles.

-p    Displays the complete set of macro definitions and target entries.

-P    Displays the complete dependency tree for each target encountered.

There is an option that can be used to shortcut make processing, the -t option. When run with -t, make does not perform the rule for building a target. Instead it uses touch to alter the modification time for each target that it encounters in the dependency scan. It also updates the state file to reflect what it built. This often creates more problems than it supposedly solves, and so we recommend that you exercise extreme caution if you do use it. Note that if there is no file corresponding to a target entry, touch creates it.

Due to its potentially troublesome side effects, we recommend against using the -t (touch) option for make.

clean is the conventional name for a target that removes derived files. It is useful when you want to start a build from scratch.

The following is one example of how not to use make -t.
Suppose you have a target named clean that performs housekeeping in the directory by removing target files produced by make:

    clean:
            rm functions main.o data.o

If you give the nonsensical command:

    hermes% make -t clean
    touch clean
    hermes% make clean
    'clean' is up to date.

you then have to remove the file clean before your housekeeping target can work once again.

For a complete listing of all make options, refer to make(1) in the SunOS Reference Manual.

5.2. Compiling Programs with make

Compilation Strategies

In previous examples you have seen how to compile a simple C program from a single source file, using both explicit target entries and implicit rules. Most C programs, however, are compiled from several source files. Many include library routines, either from one of the standard system libraries or from a user-supplied library.

Although it may be easier to recompile and link a single-source program using a single cc command, it is usually more convenient to compile programs with multiple sources in stages: first, by compiling each source file into a separate object (.o) file, and then by linking the object files to form an executable (a.out) file. This method requires more disk space, but subsequent (repetitive) recompilations need be performed only on those object files for which the sources have changed, which saves time.

A Simple Makefile

The makefile below is not all that elegant, but it does the job.

Figure 5-3  Simple Makefile for Compiling C Sources: Everything Explicit

In this example, make produces the object files main.o and data.o, and the executable file functions:

Using make’s Predefined Macros

The next example performs exactly the same function, but demonstrates the use of make’s predefined macros for the indicated compilation commands.
Using predefined macros eliminates the need to edit makefiles when the underlying compilation environment changes. They also provide access to the CFLAGS macro (and other FLAGS macros) for supplying compiler options from the command line. Predefined macros are also used extensively within make’s implicit rules. The predefined macros in the following makefile are listed below.[12] They are generally useful for compiling C programs.

Macro names that end in the string FLAGS are used to pass options to a related compiler-command macro. It is good practice to use these macros for consistency and portability. It is also good practice to note the desired default values for them in the makefile.

The complete list of all predefined macros is shown in Table 1.2, below.

COMPILE.c   The cc command line; composed of the values of CC, CFLAGS, CPPFLAGS, and TARGET_ARCH, as follows, along with the -c option.

                COMPILE.c=$(CC) $(CFLAGS) $(CPPFLAGS) -target $(TARGET_ARCH:-%=%) -c

            The root of the macro name, COMPILE, is a convention used to indicate that the macro stands for a compilation command line (to generate an object, or .o file). The .c suffix is a mnemonic device to indicate that the command line applies to .c (C source) files.

LINK.c      The basic cc command line to link object files, like COMPILE.c, but without the -c option and with a reference to the LDFLAGS macro:

                LINK.c=$(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) -target $(TARGET_ARCH:-%=%)

CC          The value cc. (You can redefine the value to be the pathname of an alternate C compiler.)

CFLAGS      Options for the cc command; none by default.

CPPFLAGS    Options for cpp; none by default.

LDFLAGS     Options for the link editor, ld; none by default.

TARGET_ARCH The target-architecture argument to cc for use when cross-compiling. The default is set by make to the value returned by the arch command. This macro must be defined when using Sun’s optional cross-compilers.
            Refer to Cross-Compilation on the Sun Workstation for details.

[12] Predefined macros are used more extensively than in earlier versions of make. Not all of the predefined macros shown here are available with earlier versions.

Figure 5-4  Makefile for Compiling C Sources Using Predefined Macros

Using Implicit Rules to Simplify a Makefile: Suffix Rules

Since the command lines for compiling main.o and data.o from their respective .c files are now functionally equivalent to the .c.o suffix rule, their target entries are, in a sense, redundant; make performs the same compilation whether they appear in the makefile or not. This next version of the makefile eliminates them, relying on the .c.o rule to compile the individual object files.

Figure 5-5  Makefile for Compiling C Sources Using Suffix Rules

As make processes the dependencies main.o and data.o, it finds no target entries for them. So, it checks for an appropriate implicit rule to apply. In this case, make selects the .c.o rule for building a .o file from a dependency file that has the same basename and a .c suffix.

A complete list of suffix rules appears in Table 3-1.

make uses the order of appearance in the suffixes list to determine which dependency file and suffix rule to use. For instance, if there were both main.c and main.s files in the directory, make would use the .c.o rule, since .c is ahead of .s in the list.

First, make scans its suffixes list to see if the suffix for the target file appears. In the case of main.o, .o appears in the list. Next, make checks for a suffix rule to build it with, and a dependency file to build it from. The dependency file has the same basename as the target, but a different suffix. In this case, while checking the .c.o rule, make finds a dependency file named main.c, so it uses that rule.

The suffixes list is a special-function target named .SUFFIXES. The various suffixes are included in the definition for the SUFFIXES macro; the dependency list for .SUFFIXES is given as a reference to this macro:

Figure 5-6  The Standard Suffixes List

    SUFFIXES= .o .c .c~ .s .s~ .S .S~ .ln .f .f~ \
              .F .F~ .l .l~ .mod .mod~ .sym .def .def~ .p .p~ \
              .r .r~ .y .y~ .h .h~ .sh .sh~ .cps .cps~
    .SUFFIXES: $(SUFFIXES)

The following example shows a makefile for compiling a whole set of executable programs, each having just one source file. Each executable is to be built from a source file that has the same basename, and the .c suffix appended. For instance, demo_1 is built from demo_1.c.

    # Makefile for a set of C programs, one source
    # per program. The source file names have ".c"
    # appended.
    CFLAGS= -O
    .KEEP_STATE:
    all: demo_1 demo_2 demo_3 demo_4 demo_5

Like clean, all is a target name used by convention. It builds "all" the targets in its dependency list. Normally, all is the first target; make and make all are usually equivalent.

In this case, make does not find a suffix match for any of the targets (demo_1 through demo_5). So, it treats each as if it had a null suffix. It then searches for a suffix rule and dependency file with a valid suffix. In the case of demo_2, it would find a file named demo_2.c. Since there is a target entry for a .c rule, along with a corresponding .c file, make uses that rule to build demo_2 from demo_2.c.

To prevent ambiguity, when a target with a null suffix has an explicit dependency, make does not build it using a suffix rule. This makefile:

    program: zap
    zap:

produces no output:

    hermes% make program
    hermes%

When to Use Explicit Target Entries vs. Implicit Rules

Whenever you build a target from multiple dependency files, you must provide make with an explicit target entry that contains a rule for doing so.
When building a target from a single dependency file, it is often convenient to use an implicit rule. As the previous examples show, make readily compiles a single source file into a corresponding object file or executable. However, it has no built-in knowledge about how to link a list of object files into an executable program. Also, make only compiles those object files that it encounters in its dependency scan. It needs a starting point: a target for which each object file in the list (and ultimately, each source file) is a dependency.

So, for a target built from multiple dependency files, make needs an explicit rule that provides a collating order, along with a dependency list that accounts for its dependency files. If each of those dependency files is built from just one source, you can rely on implicit rules for them.

Implicit Rules and Dynamic Macros

make maintains a set of macros dynamically, on a target-by-target basis. These macros are used quite extensively, especially in the definitions of implicit rules. So, it is important to understand what they mean.

Because they aren’t explicitly defined in a makefile, the convention is to document dynamic macros with the $-sign prefix attached (in other words, by showing the macro reference).

They are:

$@    The name of the current target.
$?    The list of dependencies newer than the target.
$<    The name of the dependency file, as if selected by make for use with an implicit rule.
$*    The basename of the current target (the target name stripped of its suffix).
$%    For libraries, the name of the member being processed. See Building Object Libraries, below, for more information.

Implicit rules make use of these dynamic macros in order to supply the name of a target or dependency file to a command line within the rule itself. For instance, in the .c.o rule, shown in the next example, $< is replaced by the name of the dependency file (in this case the .c file) for the current target.

    .c.o:
            $(COMPILE.c) $< $(OUTPUT_OPTION)

The macro OUTPUT_OPTION has an empty value by default. While similar to CFLAGS in function, it is provided as a separate macro intended for passing an argument to the -o compiler option to force compiler output to a given filename.

In the .c rule:

    .c:
            $(LINK.c) $< -o $@

$@ is replaced with the name of the current target.

Because values for the $< and $* macros depend upon the order of suffixes in the suffixes list, you may get surprising results when you use them in an explicit target entry. See Suffix Replacement in Macro References for a strictly deterministic method for deriving a filename from a related filename.

Dynamic Macro Modifiers

Dynamic macros can be modified by including F and D in the reference. If the target being processed is in the form of a pathname, $(@F) indicates the filename part, while $(@D) indicates the directory part. If there are no / characters in the target name, then $(@D) is assigned the dot character (.) as its value. For example, with the target named /tmp/test, $(@D) has the value /tmp; $(@F) has the value test.

Dynamic Macros and the Dependency List: Delayed Macro References

Dynamic macros are assigned while processing any and all targets. They can be used within the target’s rule as is, or in the dependency list by prepending an additional $ character to the reference. A reference beginning with $$ is called a delayed reference to a macro. For instance, the entry:

    x.o y.o z.o: $$@.BAK
            cp $@.BAK $@

could be used to derive x.o from x.o.BAK, and so forth for y.o and z.o.

Dependency List Read Twice

This technique works because make reads the dependency list twice, once as part of its initial reading of the entire makefile, and again as it processes a target’s dependencies.
In each pass through the list, it performs macro expansion. Since the dynamic macros aren’t defined in the initial reading, unless references to them are delayed until the second pass, they are expanded to null strings.

The string $$ is a reference to the predefined macro ‘$’. This macro, conveniently enough, has the value ‘$’; when make resolves it in the initial reading, the string $$@ is resolved to $@. In the dependency scan, when the resulting $@ macro reference has a value dynamically assigned to it, make resolves the reference to that value.

Note that make evaluates only the target-name portion of a target entry in the first pass. A delayed macro reference as a target name will produce incorrect results.

Rules Evaluated Once

make evaluates the rule portion of a target entry only once per application of that command, at the time that the rule is executed. Here again, a delayed reference to a make macro will produce incorrect results.

No Transitive Closure for Suffix Rules

There is no transitive closure for suffix rules. If you had a suffix rule for building, say, a .Y file from a .X file, and another for building a .Z file from a .Y file, make would not combine their rules to build a .Z file from a .X file. You must specify the intermediate steps as targets, although their entries may have null rules:

    trans.Z:
    trans.Y:

In this example trans.Z will be built from trans.Y if it exists. Without the appearance of trans.Y as a target entry, make might fail with a “don’t know how to build” error, since there would be no dependency file to use. The target entry for trans.Y guarantees that make will attempt to build it when it is out of date or missing. Since no rule is supplied in the makefile, make will use the appropriate implicit rule, which in this case would be the .X.Y rule. If trans.X exists (or can be retrieved from SCCS), make rebuilds both trans.Y and trans.Z as needed.

Adding Suffix Rules

Note: Pattern-matching rules, which are described in the next section, are often easier to use than suffix rules. The procedure for adding implicit rules is given here for compatibility with previous versions of make.

Although make supplies you with a number of useful suffix rules, you can also add new ones of your own. However, pattern-matching rules (not available with earlier versions of make) are to be preferred when adding new implicit rules. Unless you need to write implicit rules that are compatible with earlier versions of make, you may safely skip the remainder of this section, which describes the traditional method of adding implicit rules to makefiles.

Adding a suffix rule is a two-step process. First, you must add the suffixes of both target and dependency file to the suffixes list by providing them as dependencies to the .SUFFIXES special target. Because dependency lists accumulate, you can add suffixes to the list simply by adding another entry for this target, for example:

    .SUFFIXES: .ms .tr

Second, you must add a target entry for the suffix rule:

    .ms.tr:
            troff -t -ms $< > $@

A makefile with these entries can be used to format document source files containing ms macros (.ms files) into troff output files (.tr files):

    hermes% make doc.tr
    troff -t -ms doc.ms > doc.tr

Entries in the suffixes list are contained in the SUFFIXES macro. To insert suffixes at the head of the list, first clear its value by supplying an entry for the .SUFFIXES target that has no dependencies. This is an exception to the rule that dependency lists accumulate. You can clear a previous definition for this target by supplying a target entry with no dependencies and no rule like this:

    .SUFFIXES:

You can then add another entry containing the new suffixes, followed by a reference to the SUFFIXES macro, as shown below.

    .SUFFIXES:
    .SUFFIXES: .ms .tr $(SUFFIXES)

Pattern-Matching Rules: an Alternative to Suffix Rules

A pattern-matching rule is similar to an implicit rule in function. Pattern-matching rules are easier to write, and more powerful, because you can specify a relationship between a target and a dependency based on prefixes (including pathnames) and suffixes, or both. A pattern-matching rule is a target entry of the form:

    tp%ts: dp%ds
            rule

where tp and ts are the optional prefix and suffix in the target name, respectively, dp and ds are the (optional) prefix and suffix in the dependency name, and % is a wild card that stands for a basename common to both.

If there is no rule for building a target, make searches for a pattern-matching rule before checking for a suffix rule. If make can use a pattern-matching rule, it does so.

If the target entry for a pattern-matching rule contains no rule, make processes the target file as if it had an explicit target entry with no rule; make therefore searches for a suffix rule, attempts to retrieve a version of the target file from SCCS, and finally, treats the target as having a null rule (flagging that target as updated in the current run).

A pattern-matching rule for formatting a troff source file into a troff output file looks like:

    %.tr: %.ms
            troff -t -ms $< > $@

make’s Default Suffix Rules and Predefined Macros

The tables below show the standard set of suffix rules and predefined macros supplied to make in the default makefile, /usr/include/make/default.mk.

Note: make checks for pattern-matching rules ahead of suffix rules. While this allows you to override the standard implicit rules, doing so is not recommended.
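Rather than overriding a standard implicit rule, a makefile can usually get the effect it needs by setting the predefined macros that the rule expands. The following is a minimal sketch under stated assumptions: the target prog, the source files main.c and util.c, the include directory, and the macro values are all hypothetical and chosen only for illustration.

```make
# Hypothetical project: prog is linked from main.o and util.o.
# No .c.o rule is written here; the standard suffix rule builds each
# object file and picks up the macro values set below.
CFLAGS = -O
CPPFLAGS = -Iinclude
LDLIBS = -lm

prog: main.o util.o
	$(LINK.c) -o $@ main.o util.o $(LDLIBS)
```

With a makefile like this, a request for prog leaves the compilation of main.o and util.o to the standard .c.o rule, so only the macro definitions need to change if different compiler options are wanted later.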
Table 5-1  make’s Standard Suffix Rules

Use              Suffix Rule Name   Command Line(s)

                 .s.o               $(COMPILE.s) -o $@ $<
                 .s.a               $(COMPILE.s) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
                 .S.o               $(COMPILE.S) -o $@ $<
                 .S.a               $(COMPILE.S) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
                 .c                 $(LINK.c) -o $@ $< $(LDLIBS)
                 .c.ln              $(LINT.c) $(OUTPUT_OPTION) -i $<
                 .c.o               $(COMPILE.c) $(OUTPUT_OPTION) $<
                 .c.a               $(COMPILE.c) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
                 .f                 $(LINK.f) -o $@ $< $(LDLIBS)
                 .f.o               $(COMPILE.f) $(OUTPUT_OPTION) $<
                 .f.a               $(COMPILE.f) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
                 .F                 $(LINK.F) -o $@ $< $(LDLIBS)
                 .F.o               $(COMPILE.F) $(OUTPUT_OPTION) $<
                 .F.a               $(COMPILE.F) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
lex Files        .l                 $(RM) $*.c
                                    $(LEX.l) $< > $*.c
                                    $(LINK.c) -o $@ $*.c $(LDLIBS)
                                    $(RM) $*.c
                 .l.c               $(RM) $@
                                    $(LEX.l) $< > $@
                 .l.ln              $(RM) $*.c
                                    $(LEX.l) $< > $*.c
                                    $(LINT.c) -o $@ -i $*.c
                                    $(RM) $*.c
                 .l.o               $(RM) $*.c
                                    $(LEX.l) $< > $*.c
                                    $(COMPILE.c) -o $@ $*.c
                                    $(RM) $*.c
Modula 2 Files   .mod               $(COMPILE.mod) -o $@ -e $@ $<
                 .mod.o             $(COMPILE.mod) -o $@ $<
                 .def.sym           $(COMPILE.def) -o $@ $<
NeWS             .cps.h             $(CPS) $(CPSFLAGS) $*.cps
Pascal Files     .p                 $(LINK.p) -o $@ $< $(LDLIBS)
                 .p.o               $(COMPILE.p) $(OUTPUT_OPTION) $<
Ratfor Files     .r                 $(LINK.r) -o $@ $< $(LDLIBS)
                 .r.o               $(COMPILE.r) $(OUTPUT_OPTION) $<
                 .r.a               $(COMPILE.r) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
Shell Scripts    .sh                $(RM) $@
                                    cat $< >$@
                                    chmod +x $@
yacc Files       .y                 $(YACC.y) $<
                                    $(LINK.c) -o $@ y.tab.c $(LDLIBS)
                                    $(RM) y.tab.c
                 .y.c               $(YACC.y) $<
                                    mv y.tab.c $@
                 .y.ln              $(YACC.y) $<
                                    $(LINT.c) -o $@ -i y.tab.c
                                    $(RM) y.tab.c
                 .y.o               $(YACC.y) $<
                                    $(COMPILE.c) -o $@ y.tab.c
                                    $(RM) y.tab.c

Table 5-2  make’s Predefined and Dynamic Macros

Use                     Macro       Default Value

Library Archives        AR          ar
                        ARFLAGS     rv
Assembler Commands      AS          as
                        ASFLAGS
                        COMPILE.s   $(AS) $(ASFLAGS) $(TARGET_ARCH)
                        COMPILE.S   $(CC) $(ASFLAGS) $(CPPFLAGS) -target $(TARGET_ARCH:-%=%) -c
C Compiler Commands     CC          cc
                        CFLAGS
                        CPPFLAGS
                        COMPILE.c   $(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c
                        LINK.c      $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) -target $(TARGET_ARCH:-%=%)
C++ Compiler Commands   CCC CCFLAGS COMPILE.cc LINK.cc cc $