Comment: | eecg.toronto.edu wiki reference |
---|---|
SHA1: |
b9bda3f6f081f25cd67c069c74fac602 |
User & Date: | martin_vahi on 2017-05-12 03:16:17 |
2017-05-17 07:34 | additional wiki references check-in: 54983c9b9b user: martin_vahi tags: trunk | |
2017-05-12 03:16 | eecg.toronto.edu wiki reference check-in: b9bda3f6f0 user: martin_vahi tags: trunk | |
2017-05-10 12:47 | wiki references check-in: cf7f03865c user: martin_vahi tags: trunk | |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/COMMENTS.txt version [39a0921e6a].
The origin:
    http://www.eecg.toronto.edu/parallel/publications.html
    (archival copy: https://archive.is/rZUhA )
    ftp://ftp.cs.toronto.edu/pub/parallel

Many, maybe not all, of the HTML files that wget downloaded were modified so that they reference the files from ./manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_eduwww.eecg.toronto.edu instead of containing URLs to the original FTP site.
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/ABSTRACTS version [f912e52af2].
---------------------------------------------------------------------
File: Ravi_Stumm_ICPP95.ps.Z
Title: Hierarchical Ring Topologies and the effect of their Bisection Bandwidth Constraints
Authors: G. Ravindran and M. Stumm
Where: Proc. Intl. Conf. on Parallel Processing, pp. I/51-55, 1995
Keywords: Multiprocessor architectures, Interconnection networks, Hierarchical rings, Bisection bandwidth
Abstract: Ring-based hierarchical networks are interesting alternatives to popular direct networks such as 2D meshes or tori. They allow for simple router designs, wider communications paths, and faster networks than their direct network counterparts. However, they have a constant bisection bandwidth, regardless of system size. In this paper, we present the results of a simulation study to determine how large hierarchical ring networks can become before their performance deteriorates due to their bisection bandwidth constraint. We show that a system with a maximum of 128 processors can sustain most memory access behaviors, but that larger systems can be sustained only if their bisection bandwidth is increased.
---------------------------------------------------------------------
File: Ravi_Stumm_JIEICE96.ps.Z
Title: A Comparison of Blocking and Non-blocking Packet Switching Techniques in Hierarchical Ring Networks
Authors: G. Ravindran and M. Stumm
Where: IEICE Trans. Inf. & Syst., vol. E79-D, No. 8, August 1996
Keywords: Networks, Switching, Wormhole, Virtual Cut-through, Hierarchical Ring Networks, Slotted Rings
Abstract: This paper presents the results of a simulation study of blocking and non-blocking switching for hierarchical ring networks. The switching techniques include wormhole, virtual cut-through, and slotted ring. We conclude that slotted ring networks perform better than the more popular wormhole and virtual cut-through networks.
We also show that the size of the node buffers is an important parameter and that choosing them too large can hurt performance in some cases. Slotted rings have the advantage that the choice of buffer size is easier, in that larger-than-necessary buffers do not hurt performance and hence a single choice of buffer size performs well for all system configurations. In contrast, the optimal buffer size for virtual cut-through and wormhole switching nodes varies depending on the system configuration and the level in the hierarchy in which the switching node lies.
---------------------------------------------------------------------
File: Zhou_Brecht_SM91.ps.Z
Title: Processor Pool-Based Scheduling for Large-Scale NUMA Multiprocessors
Where: Appears in: Proceedings of the 1991 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, May (1991), pp. 133-142.
Authors: Songnian Zhou and Timothy Brecht
Keywords: NUMA, Scheduling, multiprocessor performance
Abstract: Large-scale Non-Uniform Memory Access (NUMA) multiprocessors are gaining increased attention due to their potential for achieving high performance through the replication of relatively simple components. Because of the complexity of such systems, scheduling algorithms for parallel applications are crucial in realizing the performance potential of these systems. In particular, scheduling methods must consider the scale of the system, with the increased likelihood of creating bottlenecks, along with the NUMA characteristics of the system, and the benefits to be gained by placing threads close to their code and data. We propose a class of scheduling algorithms based on processor pools. A processor pool is a software construct for organizing and managing a large number of processors by dividing them into groups called pools. The parallel threads of a job are run in a single processor pool, unless there are performance advantages for a job to span multiple pools. Several jobs may share one pool.
Our simulation experiments show that processor pool-based scheduling may effectively reduce the average job response time. The performance improvements attained by using processor pools increase with the average parallelism of the jobs, the load level of the system, the differentials in memory access costs, and the likelihood of having system bottlenecks. As the system size increases, while maintaining the workload composition and intensity, we observed that processor pools can be used to provide significant performance improvements. We therefore conclude that processor pool-based scheduling may be an effective and efficient technique for scalable systems.
---------------------------------------------------------------------
File: Brecht_SEDMS93.ps.Z
Title: On the Importance of Parallel Application Placement in NUMA Multiprocessors
Authors: Timothy Brecht
Where: Proceedings of the Fourth Symposium on Experiences with Distributed and Multiprocessor Systems (SEDMS IV), San Diego, CA, September, 1993.
Keywords: NUMA, multiprocessor scheduling, multiprocessor performance
Abstract: The thesis of this paper is that scheduling decisions in large-scale, shared-memory, NUMA (Non-Uniform Memory Access) multiprocessors must consider not only how many processors, but also which processors to allocate to each application. We call the problem of assigning parallel processes of an application to processors application placement. We explore the importance of placement decisions by measuring the execution time of several parallel applications using different placements on a shared-memory NUMA multiprocessor. The results of these experiments lead us to conclude that, as expected, in small-scale, mildly NUMA multiprocessors, placement decisions have only a minor effect on the execution time of parallel applications.
However, the results also show that placement decisions in large-scale multiprocessors are critical and localization that considers the architectural clusters inherent in these systems is essential. Our experiments also show that the importance of placement decisions increases substantially with the size and NUMAness of the system and that the placement of individual processes of an application within the subset of chosen processors also significantly impacts performance.
---------------------------------------------------------------------
File: Kumar_Kulkarni_ICPP91.ps.Z (does not contain figures)
Title: Generalized Unimodular Loop Transformations for Distributed Memory Multiprocessors
Authors: K G Kumar*, D Kulkarni+ and A Basu
         Center for Development of Advanced Computing
         2/1 Brunton Road, Bangalore 560 025, India
         * Now at IBM TJ Watson, Yorktown Heights, NY 10598
         + Now at Dept of Computer Science, University of Toronto, Toronto, ON M5S 1A4
Where: International Conference of Parallel Processing -91
Keywords: Parallelizing Compilers, Restructuring Transformations, Loop Partitioning, Iteration Spaces, Dependence Vectors.
Abstract: In this paper, we present a generalized unimodular loop transformation as a simple, systematic and elegant method for partitioning the iteration spaces of nested loops for execution on distributed memory multiprocessors. We present a methodology for deriving the transformations that internalize multiple dependences in a multidimensional iteration space without resulting in a deadlocking situation. We then derive the general expression for the bounds of the transformed loops in terms of the bounds of the original space and the transformation matrix elements.
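As a rough illustration of what a unimodular loop transformation does (this is not the method of the paper above), the following Python sketch applies a 2x2 integer matrix with determinant +1 or -1 to a small iteration space and to its dependence vectors. The skewing matrix `T`, the 3x3 iteration space, and the two dependence vectors are assumptions chosen only for the example.

```python
from itertools import product

def det2(t):
    """Determinant of a 2x2 integer matrix given as nested lists."""
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

def is_unimodular(t):
    """A unimodular matrix is an integer matrix with determinant +1 or -1."""
    return abs(det2(t)) == 1

def apply(t, v):
    """Multiply the 2x2 matrix t by the 2-vector v."""
    return (t[0][0] * v[0] + t[0][1] * v[1],
            t[1][0] * v[0] + t[1][1] * v[1])

def lex_positive(v):
    """True if the first non-zero component of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False

# Hypothetical example: loop skewing, (i, j) -> (i, i + j).
T = [[1, 0],
     [1, 1]]

iteration_space = list(product(range(3), range(3)))      # assumed 3x3 loop nest
transformed_space = [apply(T, p) for p in iteration_space]

dependences = [(1, 0), (0, 1)]                            # assumed dependence vectors
transformed_deps = [apply(T, d) for d in dependences]

# The transformation is legal here because every transformed dependence
# vector stays lexicographically positive.
legal = all(lex_positive(d) for d in transformed_deps)
```

Because `T` is unimodular, every integer point of the original iteration space maps to a distinct integer point of the transformed space, so no iterations are lost or duplicated.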
---------------------------------------------------------------------
File: Kumar_Kulkarni_ICS92.ps.Z
Title: Deriving Good Transformations for Mapping Nested Loops on Hierarchical Parallel Machines in Polynomial Time
Authors: K G Kumar*, D Kulkarni+ and A Basu
         Center for Development of Advanced Computing
         2/1 Brunton Road, Bangalore 560 025, India
         * IBM TJ Watson, Yorktown Heights, NY 10598
         + Dept of Computer Science, University of Toronto, Toronto, ON M5S 1A4
Where: International Conference on Supercomputing 92
Keywords: Parallelizing Compilers, Restructuring Transformations, Loop Partitioning, Iteration Spaces, Dependence Vectors.
Abstract: We present a computationally efficient method for deriving the most appropriate transformation and mapping of a nested loop for a given hierarchical parallel machine. This method is in the context of our systematic and general theory of unimodular loop transformations for the problem of iteration space partitioning \cite{kandk6}. Finding an optimal mapping or an optimal associated unimodular transformation is NP-complete. We present a polynomial time method for obtaining a 'good' transformation using a simple parameterized model of the hierarchical machine. We outline a systematic methodology for obtaining the most appropriate mapping.
---------------------------------------------------------------------
File: Li_Tandri_et_ICPP93.ps.Z
Title: LOCALITY AND LOOP SCHEDULING ON NUMA MULTIPROCESSORS
Authors: Hui Li, Sudarsan Tandri, Michael Stumm, and Kenneth C. Sevcik
Where: International Conference on Parallel Processing 93
Keywords: NUMA multiprocessors, Locality, Scheduling
Abstract: An important issue in the parallel execution of loops is how to partition and schedule the loops onto the available processors. While most existing dynamic scheduling algorithms manage load imbalances well, they fail to take locality into account and therefore perform poorly on parallel systems with non-uniform memory access times.
In this paper, we propose a new loop scheduling algorithm, Locality-based Dynamic Scheduling (LDS), that exploits locality, and dynamically balances the load.
---------------------------------------------------------------------
File: Sandhu_et_al_PPOPP.ps.Z
Title: The shared regions approach to software cache coherence on multiprocessors
Where: Appears in: Proceedings of the 1993 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May (1993).
Authors: Harjinder Sandhu, Benjamin Gamsa and Songnian Zhou
Keywords: NUMA, cache coherence, multiprocessor performance
Abstract: The effective management of caches is critical to the performance of applications on shared-memory multiprocessors. In this paper, we discuss a technique for software cache coherence that is based upon the integration of a program-level abstraction for shared data with software cache management. The program-level abstraction, called Shared Regions, explicitly relates synchronization objects with the data they protect. Cache coherence algorithms are presented which use the information provided by shared region primitives, and ensure that shared regions are always cacheable by the processors accessing them. Measurements and experiments of the Shared Region approach on a shared-memory multiprocessor are shown. Comparisons with other software based coherence strategies, including a user-controlled strategy and an operating system-based strategy, show that this approach is able to deliver better performance, with relatively low corresponding overhead and only a small increase in the programming effort. Compared to a compiler-based coherence strategy, the Shared Regions approach still performs better than a compiler that can achieve 90% accuracy in allowing caching, as long as the regions are a few hundred bytes or larger, or they are re-used a few times in the cache.
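The Shared Regions idea of tying a synchronization object to the data it protects can be loosely sketched in Python. This is not the primitive set from the paper: the class name `SharedRegion`, the method `write_access`, and the `coherence_events` log are hypothetical, and the log entries only mark the points where a software coherence layer could act (Python itself offers no cache control).

```python
import threading
from contextlib import contextmanager

class SharedRegion:
    """Hypothetical wrapper that binds a lock to the data it protects."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data
        # Records where a coherence layer could invalidate or flush
        # cached copies; purely illustrative.
        self.coherence_events = []

    @contextmanager
    def write_access(self):
        """Give exclusive access to the region's data for the 'with' body."""
        with self._lock:
            self.coherence_events.append("acquire")   # e.g. invalidate stale copies
            try:
                yield self._data
            finally:
                self.coherence_events.append("release")  # e.g. flush updates

# Usage: the data can only be reached through the region that guards it.
region = SharedRegion([0, 0, 0])
with region.write_access() as values:
    values[0] += 1
```

Because the data is only reachable through the region, the runtime knows exactly which bytes each lock covers, which is the kind of information the abstract says the coherence algorithms use.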
---------------------------------------------------------------------
File: Wilton_Vranesic_SPDP.ps.Z
Title: Architectural Support for Block Transfers in a Shared-Memory Multiprocessor
Authors: Steven J.E. Wilton and Zvonko G. Vranesic
Where: To appear in the Fifth IEEE Symposium on Parallel and Distributed Processing, Irving, Texas, December 1993
Keywords: Shared-memory multiprocessor, block transfer support
Abstract: This paper examines how the performance of a shared-memory multiprocessor can be improved by including hardware support for block transfers. A system similar to the Hector multiprocessor developed at the University of Toronto is used as a base architecture. It is shown that such hardware support can improve the performance of initialization code by as much as 50%, but that the amount of improvement depends on the memory access behavior of the program and the way in which the operating system issues block transfer requests.
---------------------------------------------------------------------
File: Sevcik_Zhou_PERF93.ps.Z
Title: Performance Benefits and Limitations of Large NUMA Multiprocessors
Authors: Kenneth C. Sevcik and Songnian Zhou
Where: Appeared in the Proceedings of Performance '93, Rome, Italy, September 27 to October 1, 1993, pp. 183-204, Elsevier Science Publ.
Abstract: Please see the ps file.
---------------------------------------------------------------------
File: Harz_Sevcik_SC93.ps.Z
Title: Hot Spot Analysis in Large Scale Shared Memory Multiprocessors
Authors: Karim Harzallah and Kenneth C. Sevcik
Where: Will appear in the Proceedings of the Supercomputing '93 Conference, November, 1993, Portland, Oregon.
Abstract: Please see the ps file.
---------------------------------------------------------------------
File: Sevcik_JPE.ps.Z
Title: Application Scheduling and Processor Allocation in Multiprogrammed Parallel Processing Systems
Authors: Kenneth C. Sevcik
Where: This paper will appear in a special issue of the journal "Performance Evaluation" on the performance evaluation of parallel systems in late 1993 or early 1994.
Abstract: Please see the ps file.
---------------------------------------------------------------------
File: Holliday_Stumm_IEEETC.ps.Z
Title: Performance Evaluation of Hierarchical Ring-Based Shared Memory Multiprocessors
Authors: Mark Holliday
         Dept. of Computer Science, Duke University, Durham, NC 27706
         Michael Stumm
         Dept. of Electrical and Computer Engineering
         University of Toronto, Toronto, Canada M5S 1A4
Date: November 1992; revised April 1993
Published: Technical Report CS-1992-18, Duke University
           Accepted for publication in IEEE Transactions on Computers
Keywords: communication locality; hierarchical ring-based networks; hot spots; large scale parallel systems; memory banks; performance evaluation; prefetching; shared memory multiprocessors; simulation.
Abstract: This paper investigates the performance of word-packet, slotted unidirectional ring-based hierarchical direct networks in the context of large-scale shared memory multiprocessors. Slotted unidirectional rings are attractive because their electrical characteristics and simple interfaces allow for fast cycle times and large bandwidths. For large-scale systems, it is necessary to use multiple rings for increased aggregate bandwidth. Hierarchies are attractive because the topology ensures unique paths between nodes, simple node interfaces and simple inter-ring connections. To ensure that a realistic region of the design space is examined, the architecture of the network used in the Hector prototype is adopted as the initial design point. A simulator of that architecture has been developed and validated with measurements from the prototype. The system and workload parameterization reflects conditions expected in the near future. The results of our study show the importance of system balance on performance.
Large-scale systems inherently have large communication delays for distant accesses, so processor efficiency will be low, unless the processors can operate with multiple outstanding transactions using techniques such as prefetching, asynchronous writes and multiple hardware contexts. However, with multiple outstanding transactions and only one memory bank per processing module, memory quickly saturates. Memory saturation can be alleviated by having multiple memory banks per processing module, but this shifts the bottleneck to the ring subsystem. While the topology of the ring hierarchy affects performance, the ring subsystem will inherently limit the throughput of the system. Hence increasing the number of outstanding transactions per processor beyond a certain point only has a limiting effect on performance, since it causes some of the rings to become congested. An adaptive maximum number of outstanding transactions appears necessary to adjust for the appropriate tradeoff between concurrency and contention as the communication locality changes. We show the relationships between processor, ring and memory speeds, and their effects on performance. Using block transfers for prefetching seems unlikely to be advantageous in that the improvement in the cache hit ratio needed to compensate for the increased network utilization is substantial.
---------------------------------------------------------------------
File: Curran_Stumm_CS.ps.Z
Title: A Comparison of basic CPU Scheduling Algorithms for Multiprocessor Unix
Authors: Stephen Curran and Michael Stumm
         Department of Electrical and Computer Engineering
         University of Toronto, Toronto, Canada M5S 1A4
Published: Computer Systems, 3(4), Oct., 1990, pp. 551-579.
Abstract: In this paper, we present the results of a simulation study comparing three basic algorithms that schedule independent tasks in multiprocessor versions of Unix.
Two of these algorithms, namely Central Queue and Initial Placement, are obvious extensions to the standard uniprocessor scheduling algorithm and are in use in a number of multiprocessor systems. A third algorithm, Take, is a variation on Initial Placement, where processors are allowed to raid the task queues of the other processors. Our simulation results show the difference between the performance of the three algorithms to be small when scheduling a typical Unix workload running on a small, bus-based, shared memory multiprocessor. They also show that the Take algorithm performs best for those multiprocessors on which tasks incur overhead each time they migrate. In particular, the Take algorithm appears to be more stable than the other two algorithms under extreme conditions.
---------------------------------------------------------------------
File: Stumm_Unrau_Krieger_USENIX92.ps.Z
Title: HIERARCHICAL CLUSTERING: A STRUCTURE FOR SCALABLE MULTIPROCESSOR OPERATING SYSTEM DESIGN
Authors: Michael Stumm, Ron Unrau, and Orran Krieger
Where: Extended version of Clustering Micro-Kernels for Scalability, Proc. of the Usenix Workshop on Micro-Kernels and Other Kernel Architectures, April, 1992.
Abstract: Please see the ps file.
---------------------------------------------------------------------
File: Stumm_Vranesic_White_IPPS93.ps.Z
Title: EXPERIENCE WITH THE HECTOR MULTIPROCESSOR
Authors: Michael Stumm, Zvonko Vranesic, Ron White
Where: Extended version of paper with same title in Proc. Intl. Parallel Processing Symposium Parallel Systems Fair, 1993, pp. 9-16.
Abstract: Please see the ps file.
---------------------------------------------------------------------
File: Krieger_Stumm_Unrau_USENIX92.ps.Z
Title: The Alloc Stream Facility: A redesign of application-level Stream I/O
Authors: O. Krieger, M. Stumm, and R. Unrau
Where: Extended version of "Exploiting the advantages of mapped files for stream I/O" in Proc. of the Winter 1992 Usenix Conference, January, 1992.
Abstract: This paper describes the design and implementation of a new application level I/O facility, called the Alloc Stream Facility. The Alloc Stream Facility has several key advantages. First, performance is substantially improved as a result of a) the structure of the facility that allows it to take advantage of system specific features like mapped files, and b) a reduction in data copying and the number of I/O system calls. Second, the facility is designed for multi-threaded applications running on multiprocessors and allows for a high degree of concurrency. Finally, the facility can support a variety of I/O interfaces, including stdio, emulated Unix I/O, ASI, and C++ streams, in a way that allows applications to freely intermix calls to the different interfaces, resulting in improved code reusability. We show that on several Unix workstation platforms the performance of Unix applications using the Alloc Stream Facility can be substantially better than when the applications use the original I/O facilities.
---------------------------------------------------------------------
File: Krieger_Stumm_DAGS93.ps.Z
Title: HFS: A Flexible File System for large-scale Multiprocessors
Authors: Orran Krieger and Michael Stumm
Where: Proceedings of the 1993 DAGS/PC Symposium
Abstract: The Hurricane File System (HFS) is a new file system being developed for large-scale shared memory multiprocessors with distributed disks. The main goal of this file system is scalability; that is, the file system is designed to handle demands that are expected to grow linearly with the number of processors in the system. To achieve this goal, HFS is designed using a new structuring technique called Hierarchical Clustering.
HFS is also designed to be flexible in supporting a variety of policies for managing file data and for managing file system state. This flexibility is necessary to support in a scalable fashion the diverse workloads we expect for a multiprocessor file system.
---------------------------------------------------------------------
File: Krieger_etal_ICPP93.ps.Z
Title: A fair fast scalable reader-writer lock
Authors: O. Krieger, M. Stumm, R. Unrau, and J. Hanna
Where: Proc. Intl. Conf. on Parallel Processing, 1993.
Abstract: A reader-writer lock allows either multiple readers to inspect shared data or a single writer exclusive access to that data. On shared memory multiprocessors, the cost of acquiring and releasing these locks can have a large impact on the performance of parallel applications. Other researchers have shown how to implement scalable locks, that is, locks that can become contended without resulting in memory or interconnection network contention. This paper describes a new algorithm for a reader-writer lock that, while being scalable in the contended case, has a low overhead in the uncontended case. This is important because most parallel applications are written so that locks are typically uncontended. The only atomic operation required by this algorithm is fetch_and_store and hence it can be used on most current multiprocessor systems. Experimental results are provided.
---------------------------------------------------------------------
File: Kulkarni_Stumm_Tutorial.ps.Z
Title: Loop and Data Transformations: A tutorial
Authors: Dattatraya Kulkarni and Michael Stumm
Where: Internal document, a tutorial guide.
Abstract: Hierarchically structured machines appear to be becoming the dominant parallel computing structure. These systems have non-uniform access times. We address the problem of restructuring a possibly sequential program to execute efficiently on such parallel machines.
This restructuring involves transforming and partitioning the loop structures and the data so as to improve parallelism, static and dynamic locality, and load balance. The objective of this paper is to present previous and ongoing work on loop and data transformations and motivate a unified framework for restructuring a sequence of loops and data so as to execute efficiently on parallel machines with several levels of hierarchy.
---------------------------------------------------------------------
File: Baru_Zilio_PADS93.ps.Z
Title: Data reorganization in parallel database systems
Authors: Chaitanya Baru & Daniel C. Zilio
Where: Proc. of the IEEE Workshop on Advances in Parallel and Distributed Systems, Princeton, NJ, pp. 102-107, Oct. 1993.
Abstract: Parallel database systems are suitable for use in applications with high capacity and high performance and availability requirements. The trend in such systems is to provide efficient on-line capability for performing various system administration functions such as index creation and maintenance, backup/restore, reorganization, and gathering of statistics. For some of these functions, the on-line capability can be efficiently supported by the use of "incremental algorithms", i.e., algorithms that achieve the function in several, relatively small (i.e., less time-consuming) steps, rather than in a single, large step. Incremental algorithms ensure that only small parts of the database become inaccessible for short durations as opposed to non-incremental algorithms which may lock large portions of the database or the entire database for a longer duration. In this paper, we discuss issues in providing concurrent data reorganization capability using incremental algorithms in parallel database systems.
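The contrast between incremental and non-incremental reorganization can be shown with a toy Python sketch. It is not taken from the paper: the function `reorganize_incrementally`, the list-based "partitions", and the batch size are assumptions for illustration. Rows move from one partition to another in small batches, so at most `batch_size` rows are in transit during any single step.

```python
def reorganize_incrementally(source, target, batch_size):
    """Move all rows from source to target in small steps.

    Returns the number of steps taken. In a real system each step would
    hold locks only on the rows of the current batch; here the lists are
    simply mutated in place.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    steps = 0
    while source:
        batch = source[:batch_size]   # the only rows unavailable in this step
        del source[:batch_size]
        target.extend(batch)
        steps += 1
    return steps

# Usage with hypothetical data: ten rows moved three at a time.
old_partition = list(range(10))
new_partition = []
steps_taken = reorganize_incrementally(old_partition, new_partition, 3)
```

A non-incremental version would correspond to calling the function with `batch_size` equal to the number of rows: one large step during which the whole partition is unavailable.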
----------------------------------------------------------------------
File: Kulkarni_Stumm_292.ps.Z
Title: Computational Alignment: A new, unified program transformation for local and global optimization
Authors: Dattatraya Kulkarni and Michael Stumm
Where: CSRI Tech report 292, ISSN 0834-1648
Abstract: {\em Computational Alignment} is a new class of program transformations suitable for both local and global optimization. Computational Alignment transforms all of the computations of a {\em portion} of the loop body in order to align them to other computations either in the same loop or in another loop. It extends along a new dimension and is significantly more powerful than linear transformations because $i)$ it can transform subsets of dependences and references; $ii)$ it is sensitive to the location of data in that it can move the computation relative to data; $iii)$ it applies to imperfect loop nests; and $iv)$ it is the first loop transformation that can change {\it access vectors}. Linear transformations are just a special case of Computational Alignment. Computational Alignment is highly suitable for global optimization because it can transform given loops to access data in similar ways. Two important subclasses of Computational Alignment are presented as well, namely, {\em Freeing} and {\em Isomerizing} Computational Alignment.
-------------------------------------------------------------
File: Brecht_PhD_303.ps.Z
Title: Multiprogrammed Parallel Application Scheduling in NUMA Multiprocessors
Authors: Timothy B. Brecht
Where: Ph.D. Dissertation - CSRI Technical Report CSRI-303
Abstract: The invention, acceptance, and proliferation of multiprocessors are primarily a result of the quest to increase computer system performance. The most promising features of multiprocessors are their potential to solve problems faster than previously possible and to solve larger problems than previously possible.
Large-scale multiprocessors offer the additional advantage of being able to execute multiple parallel applications simultaneously. The execution time of a parallel application is directly related to the number of processors it is allocated and, in shared-memory non-uniform memory access time (NUMA) multiprocessors, which processors it is allocated. As a result, efficient and effective scheduling becomes critical to overall system performance. In fact, it is likely to be a contributing factor in ultimately determining the success or failure of shared-memory NUMA multiprocessors. The subjects of this dissertation are the problems of processor allocation and application placement. The processor allocation problem involves determining the number of processors to allocate to each of several simultaneously executing parallel applications and possibly dynamically adjusting those allocations to improve overall system performance. The performance metric used is mean response time. We show that by differentiating between applications based on the amount of remaining work they have to execute, performance can be improved significantly. Then we propose techniques for estimating an application's expected remaining work along with policies for using these estimates to make improved processor allocation decisions. An experimental evaluation demonstrates the promise of this approach. The placement problem involves determining which of the many processors to assign to each application. Using experiments conducted on a representative system, we demonstrate that in large-scale NUMA multiprocessors the execution time of parallel applications is significantly affected by the placement of the application. This motivates the need for new techniques designed explicitly for NUMA multiprocessors. 
We introduce such a technique, called processor pool-based scheduling, that is designed to localize the execution of parallel applications within a NUMA architecture and to isolate different parallel applications from each other. An experimental evaluation of this scheduling method shows that it can be used to significantly reduce mean response time over methods that do not consider the placement of parallel applications.
-------------------------------------------------------------------
File: Gamsa_MASc.ps.Z
Title: Region-Oriented Main Memory Management in Shared-Memory NUMA Multiprocessors
Authors: Benjamin Gamsa
Where: M.Sc. Thesis
Abstract: In Non-Uniform Memory Access time (NUMA) multiprocessors, distribution of the memory modules facilitates architectural scaling, but creates complications for the programmers who must be concerned with the physical distribution of their data in order to obtain good performance. In order to reduce the impact of remote accesses, in this thesis we propose that data be partitioned into Shared Regions that reflect the granularity of data sharing in programs, and that special synchronization calls be added to enforce proper ordering of accesses to the shared data as well as to manage replication and consistency transparently to the programmer. Results from measurements on a 16-processor NUMA multiprocessor and from a model of the system indicate that the Shared Regions approach is successful in obtaining the necessary locality critical to performance, while incurring only minimal overhead. Data distribution methods are also observed to have a significant impact on the performance of the system, especially in the larger multiprocessors modeled.
-------------------------------------------------------------------
File: Unrau_PhD.ps.Z
Title: Scalable Memory Management through Hierarchical Symmetric Multiprocessing
Authors: Ronald C. Unrau
Where: Ph.D.
Dissertation
Abstract: This dissertation examines scalability issues in the design of operating systems for large-scale, shared-memory multiprocessors. In particular, the thesis focuses on structuring issues as they relate to memory management. From a set of simple, well-known queuing network formulas, we derive a set of properties that describe sufficient conditions for an operating system to scale. From these properties we first develop a set of guidelines for designing scalable systems, and then develop a new structuring philosophy for shared-memory multiprocessor operating systems, called Hierarchical Symmetric Multiprocessing (HSM). HSM manages the system resources in clusters, using tight coupling within a cluster, and loose coupling across clusters. Distributed systems principles are applied by distributing and replicating system services and data objects to increase locality, increase concurrency, and to avoid centralized bottlenecks, thus making the system scalable. However, tight coupling is used within a cluster, so the system performs well for local interactions. HSM maximizes locality, which is key to good performance in large systems, and systems based on HSM can easily be adapted to different hardware configurations and architectures by changing the size of the clusters. Finally, HSM leads to a modular system composed from easy-to-design and hence efficient building blocks. Memory management is a particularly challenging service to implement within the HSM framework, because it must provide the applications with an integrated and coherent view of a single system, while distributing and replicating services in order to fully exploit the hardware potential. We describe in detail the implementation of an HSM structured memory management subsystem, and evaluate the performance of our implementation on Hector, a prototype scalable shared memory multiprocessor.
-------------------------------------------------------------------
File: Wu_MASc.ps.Z
Title: Processor Scheduling in Multiprogrammed Shared Memory NUMA Multiprocessors
Authors: Chee-Shong Wu
Where: M.Sc. Thesis
Abstract: In a multiprogrammed multiprocessor, the scheduler is not only responsible for deciding when to activate an application and when to suspend it, but is also responsible for determining how many processors to allocate to each application. In a scalable Non-Uniform Memory Access (NUMA) multiprocessor, it must further resolve the problem of which processor(s) to allocate to which application since the memory reference times are not the same for all processor-memory pairs. In this thesis, we study the problem of how to characterize parallel applications and how to apply this knowledge in scheduling for NUMA systems. We also study the performance of several scheduling algorithms in a NUMA environment. These algorithms differ in the frequency of reallocations. We propose two policies, the Static policy and the Immediate Start Static policy, that utilize application characteristics when making scheduling decisions. The performance of these two policies is compared with that of the Dynamic policy, on a NUMA multiprocessor, Hector.
---------------------------------------------------------------------
File: Parsons_Sevcik_IPPS95.ps.Z
Title: Multiprocessor Scheduling for High-Variability Service Time Distributions
Where: IPPS '95 Workshop on Job Scheduling Strategies for Parallel Processing, reprinted in Springer-Verlag Lecture Notes in Computer Science, Vol 949, pages 127--145.
Authors: Eric W. Parsons and Kenneth C. Sevcik
Keywords: Scheduling, multiprocessor performance
Abstract: Many disciplines have been proposed for scheduling and processor allocation in multiprogrammed multiprocessors for parallel processing. These have been, for the most part, designed and evaluated for workloads having relatively low variability in service demand.
But with reports that variability in service demands at high performance computing centers can actually be quite high, these disciplines must be reevaluated. In this paper, we examine the performance of two well-known static scheduling disciplines, and propose preemptive versions of these that offer much better mean response times when the variability in service demand is high. We argue that, in systems in which dynamic repartitioning in applications is expensive or impossible, these preemptive disciplines are well suited for handling high variability in service demand.
---------------------------------------------------------------------
File: Okrieg_PhD.ps.Z
Title: HFS: A flexible file system for shared-memory multiprocessors
Where: PhD Dissertation, Department of Electrical and Computer Engineering, University of Toronto
Authors: Orran Krieger
Keywords: File System, I/O, Hurricane, Hector NUMA multiprocessor
Abstract: The Hurricane File System (HFS) is designed for large-scale, shared-memory multiprocessors. Its architecture is based on the principle that a file system must support a wide variety of file structures, file system policies and I/O interfaces to maximize performance for a wide variety of applications. HFS uses a novel, object-oriented building-block approach to provide the flexibility needed to support this variety of file structures, policies, and I/O interfaces. File structures can be defined in HFS that optimize for sequential or random access, read-only, write-only or read/write access, sparse or dense data, large or small file sizes, and different degrees of application concurrency. Policies that can be defined on a per-file or per-open instance basis include locking policies, prefetching policies, compression/decompression policies and file cache management policies. In contrast, most existing file systems have been designed to support a single file structure and a small set of policies.
We have implemented large portions of HFS as part of the Hurricane operating system running on the Hector shared-memory multiprocessor. We demonstrate that the flexibility of HFS comes with little processing or I/O overhead. Also, we show that HFS is able to deliver the full I/O bandwidth of the disks on our system to the applications.
---------------------------------------------------------------------
File: Vranesic_etal_IEEEC.ps.Z
Title: Hector -- A hierarchically structured shared memory multiprocessor
Where: IEEE Computer, 24(1): 72-80, January, 1991.
Authors: Z. Vranesic, M. Stumm, D. Lewis and R. White
Keywords: Shared memory multiprocessors, slotted rings, NUMA, Scalability.
Abstract: Please see the ps file.
---------------------------------------------------------------------
File: Kulkarni_etal_317.ps.Z
Title: A Generalized Theory of Linear Loop Transformations
Where: CSRI Tech Report 317
Authors: D. Kulkarni, M. Stumm, R. Unrau
Keywords: Computational Alignment, Computation Decomposition, Linear loop transformations, SPMD code generation.
Abstract: In this paper we present a new theory of linear loop transformations called {\em Computation Decomposition and Alignment\/} (CDA). A CDA transformation has two components: {\em Computation Decomposition\/} first decomposes the computations in the loop into computations of finer granularity, from iterations to instances of subexpressions. {\em Computation Alignment\/} then linearly transforms each of these sets of computations, possibly by using a different transformation for each set. This framework subsumes all existing linear transformation frameworks in that it reduces to a conventional linear loop transformation when the smallest granularity is an iteration, and it reduces to some of the more recently extended frameworks when the smallest granularity is a statement instance.
The possibility of being able to align computations at arbitrary granularities adds a new dimension to performance optimization on high performance computing platforms. We describe Computation Decomposition and Alignment and provide examples of CDA transformations. We present some heuristics to derive appropriate CDA transformations, given a desired optimization objective. We present the results of experiments run on the KSR1 multiprocessor and various RS6000 and Sparc platforms that demonstrate that CDA can result in substantial performance improvements.
---------------------------------------------------------------------
File: Kulkarni_Stumm_ACJ95.ps.Z
Title: Linear Loop Transformations in Optimizing Compilers for Parallel Machines
Where: To appear in the Australian Computer Journal
Authors: D. Kulkarni, M. Stumm
Keywords: Linear loop transformations
Abstract: We present the linear loop transformation framework which is the formal basis for state of the art optimization techniques in restructuring compilers for parallel machines. The framework unifies most existing transformations and provides a systematic set of code generation techniques for arbitrary compound loop transformations. The algebraic representation of the loop structure and its transformation give way to quantitative techniques for optimizing performance on parallel machines. We discuss in detail the techniques for generating the transformed loop and deriving the desired linear transformation.
---------------------------------------------------------------------
File: Manjikian_Abdelrahman_315.ps.Z
Title: Fusion of Loops for Parallelism and Locality
Where: CSRI Tech Report 315
Authors: N. Manjikian and T. Abdelrahman
Keywords: Loop fusion, cache performance, locality, NUMA
Abstract: Loop fusion improves data locality and reduces synchronization in data-parallel applications. However, loop fusion is not always legal.
Even when legal, fusion may introduce loop-carried dependences which reduce parallelism. In addition, performance losses result from cache conflicts in fused loops. We present new, systematic techniques which: (1) allow fusion of loop nests in the presence of fusion-preventing dependences, (2) allow parallel execution of fused loops with minimal synchronization, and (3) eliminate cache conflicts in fused loops. We evaluate our techniques on a 56-processor KSR2 multiprocessor, and show performance improvements of up to 20% for representative loop nest sequences. The results also indicate a performance tradeoff as more processors are used, suggesting a careful evaluation of the profitability of fusion.
<!--------------------------------------------------------------------->
<HR>
<A NAME="Kulkarni_Stumm_LCR95">.</A>
<HR>
<B>Title:</B> <A HREF="ftp://ftp.cs.toronto.edu/pub/parallel/Kulkarni_Stumm_LCR95.ps.Z">CDA Loop Transformations</A>
<P>
<B>Authors:</B> Dattatraya Kulkarni and Michael Stumm
<P>
<B>Where:</B> Proceedings of the Third workshop on languages, compilers and run-time systems for scalable computers, Troy, NY, May 1995, Kluwer Academic.
<P>
<B>Abstract:</B>
<P>
In this paper we present a new loop transformation technique called {\em Computation Decomposition and Alignment\/} (CDA). {\em Computation Decomposition\/} first decomposes the iteration space into finer computation spaces. {\em Computation Alignment\/} then linearly transforms each computation space independently. CDA is a general framework in that linear transformations and their recent extensions are just special cases of CDA. CDA's fine grained loop restructuring can incur considerable computational effort, but can exploit optimization opportunities that earlier frameworks cannot. We present four optimization contexts in which CDA can be useful. Our initial experiments demonstrate that CDA adds a new dimension to performance optimization.
<!--------------------------------------------------------------------->
<HR>
<A NAME="Kulkarni_Stumm_EuroPar95">.</A>
<HR>
<B>Title:</B> <A HREF="ftp://ftp.cs.toronto.edu/pub/parallel/Kulkarni_Stumm_EuroPar95.ps.Z">Implementing Flexible Computation Rules with Subexpression-level Loop Transformations</A>
<P>
<B>Authors:</B> Dattatraya Kulkarni, Michael Stumm and Ronald C. Unrau
<P>
<B>Where:</B> Proceedings of the Euro-Par95, Stockholm, Aug 28-31, 1995.
<P>
<B>Abstract:</B>
<P>
Computation Decomposition and Alignment (CDA) is a new loop transformation framework that extends the linear loop transformation framework and the more recently proposed Computation Alignment frameworks by linearly transforming computations at the granularity of subexpressions. It can be applied to achieve a number of optimization objectives, including the removal of data alignment constraints, the elimination of ownership tests, the reduction of cache conflicts, and improvements in data access locality. In this paper we show how CDA can be used to effectively implement flexible computation rules with the objective of minimizing communication and, whenever possible, eliminating intrinsics that test whether computations need to be executed or not. We describe CDA, show how it can be used to implement flexible computation rules, and present an algorithm for deriving appropriate CDA transformations.
<!--------------------------------------------------------------------->
<HR>
<A NAME="Unrau_etal_EuroPar95">.</A>
<HR>
<B>Title:</B> <A HREF="ftp://ftp.cs.toronto.edu/pub/parallel/Unrau_etal_EuroPar95.ps.Z">On the Scalability of Demand-Driven Parallel Systems</A>
<P>
<B>Authors:</B> Ronald C. Unrau, Michael Stumm and Orran Krieger
<P>
<B>Where:</B> Proceedings of the Euro-Par95, Stockholm, Aug 28-31, 1995.
<P>
<B>Abstract:</B>
<P>
Demand-driven systems follow the model where customers enter the system, request some service, and then depart.
Examples are databases, transaction processing systems and operating systems, which form the system software layer between the applications and the hardware. Achieving scalability at the system software layer is critical for the scalability of the system as a whole, and yet this layer has largely been ignored. In this paper, we characterize the scalability of the system software layer of demand-driven parallel systems based on fundamental metrics of quantitative system performance analysis. We develop a set of sufficient conditions so that if a system satisfies these conditions, then the system is scalable. We further argue that in practice these conditions are also necessary. In the remainder of the paper, we use the necessary and sufficient conditions to develop a set of practical design guidelines, to study the effect of application workloads, and to examine the scalability behavior of a system with only a limited number of processors.
<!--------------------------------------------------------------------->
<HR>
<A NAME="Parsons_etal_IWOOS95">.</A>
<HR>
<B>Title:</B> <A HREF="ftp://ftp.cs.toronto.edu/pub/parallel/Parsons_etal_IWOOS95.ps.Z">(De-)Clustering Objects for Multiprocessor System Software</A>
<P>
<B>Authors:</B> Eric Parsons, Ben Gamsa, Orran Krieger, Michael Stumm
<P>
<B>Where:</B> IWOOS95 (Fourth International Workshop on Object Orientation in Operating Systems 95)
<P>
<B>Abstract:</B>
<P>
Designing system software for large-scale shared-memory multiprocessors is challenging because of the level of performance demanded by the application workload and the distributed nature of the system. Adopting an object-oriented approach for our system, we have developed a framework for de-clustering objects, where each object may migrate, replicate, and distribute all or part of its data across the system memory using the policies that will best meet the locality requirements for that data.
The mechanism for object invocation hides the internal structure of an object, allowing a request to be made directly to the most suitable part of the object on a per-processor basis without any knowledge of how the object is de-clustered. Method invocation is very efficient, both within and across address spaces, involving no remote memory accesses in the common case. We describe the design and implementation of this framework in Tornado, our multiprocessor operating system.
<!--------------------------------------------------------------------->
<HR>
<A NAME="Ben_etal_OOPSLAW94">.</A>
<HR>
<B>Title:</B> <A HREF="ftp://ftp.cs.toronto.edu/pub/parallel/Ben_etal_OOPSLAW94.ps.Z">The Importance of Performance-Oriented Flexibility in System Software for Large-Scale Shared-Memory Multiprocessors</A>
<P>
<B>Authors:</B> Orran Krieger, Ben Gamsa, Karen Reid, Paul Lu, Eric Parsons and Michael Stumm
<P>
<B>Where:</B> OOPSLA Workshop on Flexible System Software. October 1994.
<P>
<B>Abstract:</B>
<P>
See paper for abstract.
<!--------------------------------------------------------------------->
<HR>
<A NAME="Orran_etal_SPDPW95">.</A>
<HR>
<B>Title:</B> <A HREF="ftp://ftp.cs.toronto.edu/pub/parallel/Orran_etal_SPDPW95.ps.Z">Exploiting Mapped Files for Parallel I/O</A>
<P>
<B>Authors:</B> Orran Krieger, Karen Reid and Michael Stumm
<P>
<B>Where:</B> SPDP Workshop on Modeling and Specification of I/O (MSIO), October 1995
<P>
<B>Abstract:</B>
<P>
Harnessing the full I/O capabilities of a large-scale multiprocessor is difficult and requires a great deal of cooperation between the application programmer, the compiler and the operating (/file) system. Hence, the parallel I/O interface used by the application to communicate with the system is crucial in achieving good performance. We present a set of properties we believe that a good I/O interface should have and consider current parallel I/O interfaces from the perspective of these properties.
We describe the advantages and disadvantages of mapped-file I/O and argue that, if properly implemented, it can be a good basis for a parallel I/O interface that can fulfill the suggested properties. To demonstrate that such an implementation is feasible, we describe the methodology used in our previous work on the Hurricane operating system and in our current work on the Tornado operating system to implement mapped files.
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Baru_Zilio_PADS93.ps.Z version [562e569ca3].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Ben_etal_OOPSLAW94.ps.Z version [62f95e96a8].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Brecht_PhD_303.ps.Z version [c786620163].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Brecht_SEDMS93.ps.Z version [a46f67e4c9].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Curran_Stumm_CS.ps.Z version [1e41b84548].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Gamsa_MASc.ps.Z version [384073afb3].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Gamsa_etal_ICPP94.ps.Z version [5c90cfb8a3].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Harz_Sevcik_SC93.ps.Z version [e8dfae8a65].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Holliday_Stumm_IEEETC.ps.Z version [5500d6f174].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Krieger_Stumm_DAGS93.ps.Z version [fec8d531f0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Krieger_Stumm_Unrau_USENIX92.ps.Z version [a50dc80555].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Krieger_etal_ICPP93.ps.Z version [be37bd6a4b].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Krieger_etal_IEEEComp94.ps.Z version [5727cfd3f1].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_Stumm_292.ps.Z version [12fc51f1ea].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_Stumm_ACJ95.ps.Z version [60d0400ff3].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_Stumm_CDA.ps.Z version [e3cb470a06].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_Stumm_LCR95.ps.Z version [b25e953c9e].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_Stumm_Tutorial.ps version [0f96759886].
more than 10,000 changes
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_Stumm_Tutorial.ps.Z version [00c2672e37].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_Stumm_Unrau_EuroPar95.ps.Z version [333b693b11].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_etal_317.ps.Z version [43337cf4e0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kumar_Kulkarni_ICPP91.ps.Z version [1044ec917d].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kumar_Kulkarni_ICS92.ps.Z version [d380050e5e].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Li_Tandri_et.ps.Z version [048f032c7c].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Manjikian_Abdelrahaman_315.ps.Z version [cd2fdcd627].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/New_Kulkarni_Stumm_Tutorial.ps.Z version [bb636c1af0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/ABSTRACTS.Z version [83ff0aee29].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Baru_Zilio_PADS93.ps.Z version [ef4d38005d].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Ben_etal_OOPSLAW94.ps.Z version [071bfc4652].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Brecht_PhD_303.ps.Z version [26f316a9a8].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Brecht_SEDMS93.ps.Z version [1fb4aa619c].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Curran_Stumm_CS.ps.Z version [1126490152].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Gamsa_MASc.ps.Z version [ecab87ac53].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Gamsa_etal_ICPP94.ps.Z version [37e7a07dbe].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Harz_Sevcik_SC93.ps.Z version [cdaa8a7875].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Holliday_Stumm_IEEETC.ps.Z version [9eac90ad8c].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Krieger_Stumm_DAGS93.ps.Z version [a8661224c1].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Krieger_Stumm_Unrau_USENIX92.ps.Z version [a50dc80555].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Krieger_etal_ICPP93.ps.Z version [a93baad145].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Krieger_etal_IEEEComp94.ps.Z version [ce1ab5a5a0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kulkarni_Stumm_292.ps.Z version [d4c59465ba].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kulkarni_Stumm_ACJ95.ps.Z version [6c3b324030].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kulkarni_Stumm_CDA.ps.Z version [e3cb470a06].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kulkarni_Stumm_LCR95.ps.Z version [162ba80033].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kulkarni_Stumm_Tutorial.ps.Z version [6828d2a140].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kulkarni_Stumm_Unrau_EuroPar95.ps.Z version [9dac4bd560].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kulkarni_etal_317.ps.Z version [34eccfe707].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kumar_Kulkarni_ICPP91.ps.Z version [3940104ad0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Kumar_Kulkarni_ICS92.ps.Z version [53a23b5e0b].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Li_Tandri_et.ps.Z version [0386572be6].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Manjikian_Abdelrahaman_315.ps.Z version [78d886e478].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/New_Kulkarni_Stumm_Tutorial.ps.Z version [bb636c1af0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Okrieg_PhD.ps.Z version [09970b958c].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Orran_etal_SPDPW95.ps.Z version [19877a720d].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Parsons_Sevcik_IPPS95.ps.Z version [b4abb772ee].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Parsons_etal_IWOOS95.ps.Z version [ddc5af20a0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/README.Z version [2cf48fe915].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Sandhu_et_al_PPOPP.ps.Z version [7d2dc2e38a].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Sevcik_JPE.ps.Z version [4c05ad6c60].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Sevcik_Zhou_PERF93.ps.Z version [32a44a59ed].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Stumm_Unrau_Krieger_USENIX92.ps.Z version [0eb56da9b7].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Stumm_Vranesic_White_IPPS93.ps.Z version [3e88aab020].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Tandri_Abdel_PDPTA.ps version [2089955127].
3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 
3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 
3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 
4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 4621 
4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 
5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 5100 5101 5102 5103 5104 5105 |
%!PS-Adobe-2.0
%%Creator: dvips 5.512 Copyright 1986, 1993 Radical Eye Software
%%Title: pdpta.dvi
%%CreationDate: Thu Nov 23 17:27:55 1995
%%Pages: 10
%%PageOrder: Ascend
%%BoundingBox: 0 0 612 792
%%DocumentFonts: Times-Bold Times-Roman Times-Italic Courier
%%EndComments
%DVIPSCommandLine: dvips -o pdpta.ps pdpta.dvi
%DVIPSSource: TeX output 1995.08.11:1234
%%BeginProcSet: tex.pro
/TeXDict 250 dict def TeXDict begin /N{def}def /B{bind def}N /S{exch}N /X{S N} B /TR{translate}N /isls false N /vsize 11 72 mul N /@rigin{isls{[0 -1 1 0 0 0] concat}if 72 Resolution div 72 VResolution div neg scale isls{Resolution hsize -72 div mul 0 TR}if Resolution VResolution vsize -72 div 1 add mul TR matrix currentmatrix dup dup 4 get round 4 exch put dup dup 5 get round 5 exch put setmatrix}N /@landscape{/isls true N}B /@manualfeed{statusdict /manualfeed true put}B /@copies{/#copies X}B /FMat[1 0 0 -1 0 0]N /FBB[0 0 0 0]N /nn 0 N /IE 0 N /ctr 0 N /df-tail{/nn 8 dict N nn begin /FontType 3 N /FontMatrix fntrx N /FontBBox FBB N string /base X array /BitMaps X /BuildChar{ CharBuilder}N /Encoding IE N end dup{/foo setfont}2 array copy cvx N load 0 nn put /ctr 0 N[}B /df{/sf 1 N /fntrx FMat N df-tail}B /dfs{div /sf X /fntrx[sf 0 0 sf neg 0 0]N df-tail}B /E{pop nn dup definefont setfont}B /ch-width{ch-data dup length 5 sub get}B /ch-height{ch-data dup length 4 sub get}B /ch-xoff{128 ch-data dup length 3 sub get sub}B /ch-yoff{ch-data dup length 2 sub get 127 sub}B /ch-dx{ch-data dup length 1 sub get}B /ch-image{ch-data dup type /stringtype ne{ctr get /ctr ctr 1 add N}if}B /id 0 N /rw 0 N /rc 0 N /gp 0 N /cp 0 N /G 0 N /sf 0 N /CharBuilder{save 3 1 roll S dup /base get 2 index get S /BitMaps get S get /ch-data X pop /ctr 0 N ch-dx 0 ch-xoff ch-yoff ch-height sub ch-xoff ch-width add ch-yoff setcachedevice ch-width ch-height true[1 0 0 -1 -.1 ch-xoff sub ch-yoff .1 add]{ch-image}imagemask restore}B /D{/cc X dup type /stringtype ne{]}if nn /base get cc ctr put nn /BitMaps get S ctr S sf 1
ne{dup dup length 1 sub dup 2 index S get sf div put}if put /ctr ctr 1 add N} B /I{cc 1 add D}B /bop{userdict /bop-hook known{bop-hook}if /SI save N @rigin 0 0 moveto /V matrix currentmatrix dup 1 get dup mul exch 0 get dup mul add .99 lt{/QV}{/RV}ifelse load def pop pop}N /eop{SI restore showpage userdict /eop-hook known{eop-hook}if}N /@start{userdict /start-hook known{start-hook} if pop /VResolution X /Resolution X 1000 div /DVImag X /IE 256 array N 0 1 255 {IE S 1 string dup 0 3 index put cvn put}for 65781.76 div /vsize X 65781.76 div /hsize X}N /p{show}N /RMat[1 0 0 -1 0 0]N /BDot 260 string N /rulex 0 N /ruley 0 N /v{/ruley X /rulex X V}B /V{}B /RV statusdict begin /product where{ pop product dup length 7 ge{0 7 getinterval dup(Display)eq exch 0 4 getinterval(NeXT)eq or}{pop false}ifelse}{false}ifelse end{{gsave TR -.1 -.1 TR 1 1 scale rulex ruley false RMat{BDot}imagemask grestore}}{{gsave TR -.1 -.1 TR rulex ruley scale 1 1 false RMat{BDot}imagemask grestore}}ifelse B /QV{ gsave transform round exch round exch itransform moveto rulex 0 rlineto 0 ruley neg rlineto rulex neg 0 rlineto fill grestore}B /a{moveto}B /delta 0 N /tail{dup /delta X 0 rmoveto}B /M{S p delta add tail}B /b{S p tail}B /c{-4 M} B /d{-3 M}B /e{-2 M}B /f{-1 M}B /g{0 M}B /h{1 M}B /i{2 M}B /j{3 M}B /k{4 M}B /w{0 rmoveto}B /l{p -4 w}B /m{p -3 w}B /n{p -2 w}B /o{p -1 w}B /q{p 1 w}B /r{ p 2 w}B /s{p 3 w}B /t{p 4 w}B /x{0 S rmoveto}B /y{3 2 roll p a}B /bos{/SS save N}B /eos{SS restore}B end
%%EndProcSet
%%BeginProcSet: texps.pro
TeXDict begin /rf{findfont dup length 1 add dict begin{1 index /FID ne 2 index /UniqueID ne and{def}{pop pop}ifelse}forall[1 index 0 6 -1 roll exec 0 exch 5 -1 roll VResolution Resolution div mul neg 0 0]/Metrics exch def dict begin Encoding{exch dup type /integertype ne{pop pop 1 sub dup 0 le{pop}{[}ifelse}{ FontMatrix 0 get div Metrics 0 get div def}ifelse}forall Metrics /Metrics currentdict end def[2 index currentdict end definefont 3 -1 roll makefont /setfont load]cvx
def}def /ObliqueSlant{dup sin S cos div neg}B /SlantFont{4 index mul add}def /ExtendFont{3 -1 roll mul exch}def /ReEncodeFont{/Encoding exch def}def end
%%EndProcSet
%%BeginProcSet: special.pro
TeXDict begin /SDict 200 dict N SDict begin /@SpecialDefaults{/hs 612 N /vs 792 N /ho 0 N /vo 0 N /hsc 1 N /vsc 1 N /ang 0 N /CLIP 0 N /rwiSeen false N /rhiSeen false N /letter{}N /note{}N /a4{}N /legal{}N}B /@scaleunit 100 N /@hscale{@scaleunit div /hsc X}B /@vscale{@scaleunit div /vsc X}B /@hsize{/hs X /CLIP 1 N}B /@vsize{/vs X /CLIP 1 N}B /@clip{/CLIP 2 N}B /@hoffset{/ho X}B /@voffset{/vo X}B /@angle{/ang X}B /@rwi{10 div /rwi X /rwiSeen true N}B /@rhi {10 div /rhi X /rhiSeen true N}B /@llx{/llx X}B /@lly{/lly X}B /@urx{/urx X}B /@ury{/ury X}B /magscale true def end /@MacSetUp{userdict /md known{userdict /md get type /dicttype eq{userdict begin md length 10 add md maxlength ge{/md md dup length 20 add dict copy def}if end md begin /letter{}N /note{}N /legal{ }N /od{txpose 1 0 mtx defaultmatrix dtransform S atan/pa X newpath clippath mark{transform{itransform moveto}}{transform{itransform lineto}}{6 -2 roll transform 6 -2 roll transform 6 -2 roll transform{itransform 6 2 roll itransform 6 2 roll itransform 6 2 roll curveto}}{{closepath}}pathforall newpath counttomark array astore /gc xdf pop ct 39 0 put 10 fz 0 fs 2 F/|______Courier fnt invertflag{PaintBlack}if}N /txpose{pxs pys scale ppr aload pop por{noflips{pop S neg S TR pop 1 -1 scale}if xflip yflip and{pop S neg S TR 180 rotate 1 -1 scale ppr 3 get ppr 1 get neg sub neg ppr 2 get ppr 0 get neg sub neg TR}if xflip yflip not and{pop S neg S TR pop 180 rotate ppr 3 get ppr 1 get neg sub neg 0 TR}if yflip xflip not and{ppr 1 get neg ppr 0 get neg TR}if}{noflips{TR pop pop 270 rotate 1 -1 scale}if xflip yflip and{TR pop pop 90 rotate 1 -1 scale ppr 3 get ppr 1 get neg sub neg ppr 2 get ppr 0 get neg sub neg TR}if xflip yflip not and{TR pop pop 90 rotate ppr 3 get ppr 1 get neg sub neg 0 TR}if yflip xflip not and{TR pop
pop 270 rotate ppr 2 get ppr 0 get neg sub neg 0 S TR}if}ifelse scaleby96{ppr aload pop 4 -1 roll add 2 div 3 1 roll add 2 div 2 copy TR .96 dup scale neg S neg S TR}if}N /cp{pop pop showpage pm restore}N end}if}if}N /normalscale{Resolution 72 div VResolution 72 div neg scale magscale{DVImag dup scale}if 0 setgray}N /psfts{S 65781.76 div N}N /startTexFig{/psf$SavedState save N userdict maxlength dict begin /magscale false def normalscale currentpoint TR /psf$ury psfts /psf$urx psfts /psf$lly psfts /psf$llx psfts /psf$y psfts /psf$x psfts currentpoint /psf$cy X /psf$cx X /psf$sx psf$x psf$urx psf$llx sub div N /psf$sy psf$y psf$ury psf$lly sub div N psf$sx psf$sy scale psf$cx psf$sx div psf$llx sub psf$cy psf$sy div psf$ury sub TR /showpage{}N /erasepage{}N /copypage{}N /p 3 def @MacSetUp}N /doclip{psf$llx psf$lly psf$urx psf$ury currentpoint 6 2 roll newpath 4 copy 4 2 roll moveto 6 -1 roll S lineto S lineto S lineto closepath clip newpath moveto}N /endTexFig{end psf$SavedState restore}N /@beginspecial{ SDict begin /SpecialSave save N gsave normalscale currentpoint TR @SpecialDefaults count /ocount X /dcount countdictstack N}N /@setspecial{CLIP 1 eq{newpath 0 0 moveto hs 0 rlineto 0 vs rlineto hs neg 0 rlineto closepath clip}if ho vo TR hsc vsc scale ang rotate rwiSeen{rwi urx llx sub div rhiSeen{ rhi ury lly sub div}{dup}ifelse scale llx neg lly neg TR}{rhiSeen{rhi ury lly sub div dup scale llx neg lly neg TR}if}ifelse CLIP 2 eq{newpath llx lly moveto urx lly lineto urx ury lineto llx ury lineto closepath clip}if /showpage{}N /erasepage{}N /copypage{}N newpath}N /@endspecial{count ocount sub{pop}repeat countdictstack dcount sub{end}repeat grestore SpecialSave restore end}N /@defspecial{SDict begin}N /@fedspecial{end}B /li{lineto}B /rl{ rlineto}B /rc{rcurveto}B /np{/SaveX currentpoint /SaveY X N 1 setlinecap newpath}N /st{stroke SaveX SaveY moveto}N /fil{fill SaveX SaveY moveto}N /ellipse{/endangle X /startangle X /yrad X /xrad X /savematrix matrix currentmatrix N 
TR xrad yrad scale 0 0 1 startangle endangle arc savematrix setmatrix}N end
%%EndProcSet
TeXDict begin 40258431 52099146 1000 300 300 (/stumm/a0/tandri/pdpta/pdpta.dvi) @start /Fa 175[27 7[27 1[27 70[{}3 45.833332 /Courier rf /Fb 80[25 25 51[20 23 23 33 23 23 13 18 15 23 23 23 23 36 13 23 1[13 23 23 15 20 23 20 23 20 3[15 1[15 2[33 2[33 28 25 30 1[25 33 33 41 28 33 1[15 33 1[25 28 33 30 30 33 5[13 3[23 23 4[23 2[11 15 11 1[23 15 15 3[23 2[15 33[{}60 45.833332 /Times-Roman rf /Fc 81[29 51[23 26 2[26 29 16 23 23 2[29 29 42 16 2[16 29 29 16 26 29 26 29 29 13[29 2[36 42 1[48 6[36 1[42 39 1[36 11[29 29 29 29 29 2[15 19 45[{}36 58.333336 /Times-Italic rf /Fd 134[30 2[30 30 30 30 30 1[30 30 30 30 30 30 1[30 30 30 30 30 30 30 30 30 12[30 6[30 3[30 2[30 30 30 30 30 30 14[30 4[30 30 1[30 30 30 40[{}36 50.000000 /Courier rf /Fe 134[22 22 33 1[25 14 19 19 25 25 25 25 36 14 22 1[14 25 25 14 22 25 22 25 25 9[41 2[28 25 30 1[30 36 1[41 28 33 22 17 36 2[30 36 33 1[30 7[25 4[25 25 25 25 2[12 17 5[17 39[{}47 50.000000 /Times-Italic rf /Ff 1 1 df<FFFFF0FFFFF014027D881B>0 D E /Fg 4 117 df<1F0006000600060006000C000C000C00 0C0018F01B181C08180838183018301830306030603160616062C022C03C10177E9614>104 D<0300038003000000000000000000000000001C002400460046008C000C001800180018003100 3100320032001C0009177F960C>I<383C0044C6004702004602008E06000C06000C06000C0C00 180C00180C40181840181880300880300F00120E7F8D15>110 D<030003000600060006000600 FFC00C000C000C001800180018001800300030803080310031001E000A147F930D>116 D E /Fh 3 3 df<FFFFFFFCFFFFFFFC1E027C8C27>0 D<70F8F8F87005057C8E0E>I<C00003E0 000770000E38001C1C00380E00700700E00381C001C38000E700007E00003C00003C00007E0000 E70001C3800381C00700E00E00701C003838001C70000EE00007C000031818799727>I E /Fi 4 62 df<00200040008001000300060004000C000C001800180030003000300070006000 60006000E000E000E000E000E000E000E000E000E000E000E000E000E000E00060006000600070 00300030003000180018000C000C0004000600030001000080004000200B327CA413>40
D<800040002000100018000C000400060006000300030001800180018001C000C000C000C000E0 00E000E000E000E000E000E000E000E000E000E000E000E000E000C000C000C001C00180018001 80030003000600060004000C00180010002000400080000B327DA413>I<000180000001800000 018000000180000001800000018000000180000001800000018000000180000001800000018000 00018000000180000001800000018000FFFFFFFEFFFFFFFE000180000001800000018000000180 000001800000018000000180000001800000018000000180000001800000018000000180000001 800000018000000180001F227D9C26>43 D<FFFFFFFEFFFFFFFE00000000000000000000000000 00000000000000000000000000000000000000FFFFFFFEFFFFFFFE1F0C7D9126>61 D E /Fj 16 111 df<003F000000E180000380C020070060400E0070401C0070403C0070803C00 3880780039007800390078003A00F0003A00F0003C00F0003800F0003800700038007000780030 00B800380338401C1C188007E00F001B157E941F>11 D<70F8FCFC740404040408081010204006 0F7C840E>59 D<00000080000000018000000001C000000003C000000003C000000007C0000000 0BC00000000BC000000013C000000033C000000023C000000043C000000043E000000081E00000 0181E000000101E000000201E000000201E000000401E000000C01E000000801E000001001E000 001FFFF000002000F000006000F000004000F000008000F000008000F000010000F000030000F0 00020000F000040000F8000C0000F8001E0000F800FF800FFF8021237EA225>65 D<007FFFF8000007800F00000780078000078003C0000F0001C0000F0001C0000F0001E0000F00 01E0001E0001C0001E0003C0001E0003C0001E000780003C000F00003C001E00003C003C00003C 01F000007FFFE00000780078000078003C000078001E0000F0001E0000F0000E0000F0000F0000 F0000F0001E0001E0001E0001E0001E0001E0001E0003C0003C0003C0003C000780003C000F000 03C001C00007C00F8000FFFFFC000023227EA125>I<007FFFF0000007801C000007800F000007 800700000F000380000F000380000F000380000F000380001E000780001E000780001E00078000 1E000F00003C000F00003C001E00003C003C00003C007000007801E000007FFF00000078000000 007800000000F000000000F000000000F000000000F000000001E000000001E000000001E00000 0001E000000003C000000003C000000003C000000003C000000007C0000000FFFC00000021227E A11F>80 
D<007FFFE0000007803C000007800E000007800700000F000780000F000380000F0003 C0000F0003C0001E000780001E000780001E000780001E000F00003C001E00003C003C00003C00 7000003C01C000007FFE00000078078000007801C000007801E00000F000F00000F000F00000F0 00F00000F000F00001E001E00001E001E00001E001E00001E001E00003C003C00003C003C04003 C003C04003C001C08007C001C080FFFC00E3000000003C0022237EA125>82 D<3FFE01FF8003C0003C0003C000300003C0001000078000200007800020000780002000078000 20000F000040000F000040000F000040000F000040001E000080001E000080001E000080001E00 0080003C000100003C000100003C000100003C0001000078000200007800020000780002000078 000200007000040000F000040000F0000800007000080000700010000070002000003800400000 38008000001C01000000060600000001F800000021237DA121>85 D<FFF8007FC00F80000F000F 00000C000F000008000F000010000F800010000780002000078000600007800040000780008000 07800080000780010000078002000007C002000003C004000003C00C000003C008000003C01000 0003C010000003C020000003E040000003E040000001E080000001E180000001E100000001E200 000001E200000001E400000001F800000000F800000000F000000000E000000000E000000000C0 00000000C000000022237DA11C>I<007FFC03FF0007E000F80007C000E00003C000800003E001 000001E002000001F006000001F00C000000F018000000F81000000078200000007C400000007C 800000003D000000003E000000001E000000001F000000001F000000002F000000006F80000000 C78000000187C000000103C000000203C000000403E000000801E000001001F000002000F00000 4000F800008000F80001800078000300007C000F8000FC00FFE007FFC028227FA128>88 D<00001E00000063800000C7800001C7800001C300000180000003800000038000000380000003 80000007000000070000000700000007000000FFF800000E0000000E0000000E0000000E000000 0E0000000E0000001C0000001C0000001C0000001C0000001C0000003800000038000000380000 0038000000380000007000000070000000700000007000000060000000E0000000E0000000E000 0000C0000070C00000F1800000F1000000620000003C000000192D7EA218>102 D<000F0C00389C00605C00C03801C0380380380780380700700F00700F00700F00701E00E01E00 
E01E00E01E00E01E01C00E01C00E03C00605C0031B8001E3800003800003800007000007000007 00700E00F00C00F018006070003FC000161F809417>I<00F0000FE00000E00000E00000E00001 C00001C00001C00001C000038000038000038000038000070000071F0007218007C0C00F00E00F 00E00E00E00E00E01C01C01C01C01C01C01C01C0380380380380380700380704700708700E0870 0E08700610E006206003C016237DA21C>I<00E000E001E000C000000000000000000000000000 00000000001E0023004380438083808380870007000E000E000E001C001C003800382038407040 7040308031001E000B227EA111>I<0000E00001E00001E00000C0000000000000000000000000 000000000000000000000000000000001E00002300004380008380010380010380010380000700 000700000700000700000E00000E00000E00000E00001C00001C00001C00001C00003800003800 00380000380000700000700000700070E000F0C000F180006300003E0000132C81A114>I<00F0 000FE00000E00000E00000E00001C00001C00001C00001C0000380000380000380000380000700 000700F00703080704380E08780E10780E20300E40001C80001F00001FC0001C70003838003838 00381C00381C10703820703820703820701840E00C8060070015237DA219>I<3C07C046186047 20308740388780388700388700380E00700E00700E00700E00701C00E01C00E01C01C01C01C138 01C23803823803823801847001883000F018157E941D>110 D E /Fk 134[21 1[30 1[21 12 16 14 1[21 21 21 32 12 2[12 21 21 14 18 21 18 21 18 3[14 1[14 17[14 5[28 8[12 21 21 5[21 21 1[12 10 14 45[{}32 41.666668 /Times-Roman rf /Fl 203[15 15 15 15 49[{}4 29.166668 /Times-Roman rf /Fm 203[17 17 17 17 17 48[{}5 33.333332 /Times-Roman rf /Fn 138[39 23 27 31 1[39 35 39 59 20 39 1[20 1[35 23 31 39 31 39 35 9[71 4[51 1[43 6[27 2[43 1[51 51 11[35 35 35 35 35 35 35 49[{}32 70.833336 /Times-Bold rf /Fo 69[22 8[25 1[28 28 3[22 47[22 25 25 36 25 25 14 19 17 25 25 25 25 39 14 25 14 14 25 25 17 22 25 22 25 22 3[17 1[17 30 2[47 36 36 30 28 33 1[28 36 36 44 30 36 19 17 36 36 28 30 36 33 33 36 3[28 1[14 14 25 25 25 25 25 25 25 25 25 25 1[12 17 12 2[17 17 17 39[{}75 50.000000 /Times-Roman rf /Fp 139[17 19 22 14[22 28 25 31[36 65[{}7 50.000000 /Times-Bold rf /Fq 2 104 df<0000F80003C0000F00001E00003C0000 
780000780000780000780000780000780000780000780000780000780000780000780000780000 780000780000780000780000780000780000780000F00000F00001E000078000FE0000FE000007 800001E00000F00000F00000780000780000780000780000780000780000780000780000780000 7800007800007800007800007800007800007800007800007800007800007800003C00001E0000 0F000003C00000F8153C7CAC1E>102 D<F800000F000003C00001E00000F00000780000780000 780000780000780000780000780000780000780000780000780000780000780000780000780000 7800007800007800007800007800003C00003C00001E000007000001F80001F8000700001E0000 3C00003C0000780000780000780000780000780000780000780000780000780000780000780000 780000780000780000780000780000780000780000780000780000F00001E00003C0000F0000F8 0000153C7CAC1E>I E /Fr 134[29 2[29 29 16 23 19 1[29 29 29 45 16 29 1[16 29 29 19 26 29 26 29 26 11[42 36 32 5[52 7[36 42 39 1[42 54 5[16 4[29 29 2[29 2[15 19 15 44[{}37 58.333336 /Times-Roman rf /Fs 134[42 3[46 28 32 37 1[46 42 46 69 23 2[23 46 42 1[37 46 37 46 42 13[46 2[51 2[78 8[60 60 67[{}23 83.333336 /Times-Bold rf end
%%EndProlog
%%BeginSetup
%%Feature: *Resolution 300dpi
TeXDict begin
%%EndSetup
%%Page: 1 1
1 0 bop 80 177 a Fs(Computation)19 b(and)i(Data)e(Partitioning)g(on)h (Scalable)341 280 y(Shar)o(ed)g(Memory)g(Multipr)o(ocessors)403 451 y Fr(Sudarsan)15 b(T)l(andri)29 b(and)g(T)l(arek)14 b(S.)g(Abdelrahman) 316 526 y(Department)h(of)g(Electrical)f(and)h(Computer)g(Engineering)284 601 y(The)f(University)h(of)g(T)l(oronto,)f(T)l(oronto,)g(Canada,)f(M5S)i (1A4)478 675 y(e-mail:)g Fq(f)p Fr(tandri,tsa)p Fq(g)p Fr(@eecg.toronto.edu) 833 865 y Fp(Abstract)217 945 y Fo(In)g(this)h(paper)f(we)h(identify)f(the)h (factors)f(that)h(af)o(fect)f(the)h(derivation)e(of)i(com-)217 999 y(putation)10 b(and)h(data)g(partitions)g(on)g(scalable)g(shared)g (memory)g(multiprocessors)217 1053 y(\(SSMMs\).)18 b(W)l(e)12 b(show)h(that)f(these)h(factors)f(necessitate)i(an)e(SSMM-conscious)217 1107 y(approach.)17 b(In)10 b(addition)g(to)g(remote)g(memory)f(access,)k
(which)d(is)h(the)f(sole)h(factor)217 1161 y(on)19 b(distributed)g(memory)f (multiprocessors,)k(cache)d(af)o(\256nity)m(,)i(memory)e(con-)217 1216 y(tention)12 b(and)h(false)g(sharing)f(are)h(important)f(factors)g(that) h(must)g(be)g(considered.)217 1270 y(Experimental)g(evidence)h(is)g (presented)g(to)g(demonstrate)f(the)h(impact)f(of)h(these)217 1324 y(factors)i(on)g(performance)g(using)g(three)h(applications)f(on)h(the)f (KSR1)h(and)f(the)217 1378 y(Hector)c(multiprocessors.)4 1540 y Fn(1)71 b(Intr)o(oduction)4 1667 y Fo(Scalable)12 b(shared)g(memory)f (multiprocessors)g(\(SSMMs\))g(are)h(becoming)f(increasingly)h(popular)f(and) h(a)4 1721 y(viable)e(alternative)f(to)h(distributed)f(memory)g (multiprocessors)h(\(DMMs\).)17 b(The)11 b(Stanford)e(DASH)g([20],)4 1775 y(FLASH)i([14)o(],)h(the)f(KSR1)f([24],)h(T)m(oronto')m(s)f(Hector)h ([26)o(],)h(NUMAchine)f([1)o(],)h(and)f(the)f(Cray)h(T3D)h([23)o(])4 1830 y(are)d(some)g(SSMMs)g(currently)e(in)i(use)g(or)f(under)g(development.) 17 b(Processors)9 b(in)f(a)h(SSMM)g(share)g(a)g(single)4 1884 y(coherent)f(address)g(space.)17 b(However)n(,)9 b(shared)f(memory)g(is)g (physically)g(distributed)g(to)f(allow)h(scalability)l(,)4 1938 y(as)17 b(shown)f(in)g(Figure)f(1.)29 b(This)17 b(distribution)e(of)g (shared)i(memory)e(results)h(in)g(non-uniform)e(memory)4 1992 y(access)f(latencies,)g(depending)f(on)f(the)h(distance)h(between)f(a)g (processor)f(and)h(memory)m(.)17 b(Consequently)m(,)4 2046 y(careful)12 b(placement)g(and)g(management)g(of)g(data)h(is)g(essential)g (for)e(scaling)i(performance.)77 2122 y(W)l(e)i(believe)f(that)g(data)g (distribution)732 2104 y Fm(1)764 2122 y Fo(is)g(a)h(good)f(paradigm)f(for)h (managing)f(data)i(in)f(data-parallel)4 2176 y(applications)h(on)g(SSMMs)g ([3)o(,)h(21].)25 b(The)16 b(division)e(of)h(array)f(data)h(allows)g(a)g (compiler)f(to)h(place)g(data)4 2230 y(in)g(the)g(physical)f(memory)g(of)h (the)g(processor)f(that)h(uses)h(it)e(the)h(most,)h(and)f(also)g(allows)g (the)g(compiler)4 
2284 y(to)k(partition)f(the)h(computations)g(of)f(parallel) h(loops.)38 b(W)l(e)19 b(have)g(experimented)g(with)f(programmer)4 2339 y(speci\256ed)12 b(data)f(distributions)g(on)g(the)h(Hector)f (multiprocessor)f(and)i(have)g(found)e(them)h(to)h(be)f(ef)o(fective)4 2393 y(in)e(improving)e(performance.)16 b(However)n(,)10 b(the)e(task)h(of)g (selecting)g(a)g(good)f(data)h(distribution)f(requires)g(the)4 2447 y(programmer)i(to)h(understand)g(both)f(the)i(parallel)e(machine)h (architecture)g(and)g(the)g(data)g(access)i(patterns)4 2501 y(in)19 b(the)f(program.)37 b(Porting)17 b(programs)h(to)h(various)g (machines)g(and)f(tuning)h(them)f(for)g(performance)4 2555 y(becomes)g(a)f(tedious)g(and)g(laborious)g(process.)33 b(Consequently)m(,)19 b(it)e(is)h(desirable)f(to)g(derive)f(data)i(and)4 2609 y(computation)h (partitions)g(automatically)h(using)g(a)g(compiler)m(.)40 b(The)21 b(objective)e(of)h(this)g(paper)g(is)g(to)4 2664 y(describe)13 b(the)f(factors)g(that)g(af)o(fect)g(the)g(derivation)g(of)g(computation)f (and)i(data)f(partitions)g(on)g(SSMMs.)77 2739 y(On)19 b(DMMs,)k(the)c(main)g (factor)f(that)i(af)o(fects)f(the)g(performance)f(of)h(an)g(application)g(is) g(the)g(cost)4 2793 y(of)d(interprocessor)f(communication.)28 b(Consequently)m(,)17 b(scalable)g(performance)e(can)h(be)g(achieved)g(by)p 4 2838 737 2 v 62 2869 a Fl(1)79 2884 y Fk(In)10 b(this)f(paper)i(we)g(use)f (the)g(terms)h(data)g(distributi)o(ons)c(and)k(data)f(partitions)f (interchangeably)m(.)p eop %%Page: 2 2 2 1 bop 175 533 a @beginspecial 114 @llx 408 @lly 476 @urx 553 @ury 3600 @rwi @setspecial %%BeginDocument: numaarch1.ps /arrowHeight 10 def /arrowWidth 5 def /IdrawDict 51 dict def IdrawDict begin /reencodeISO { dup dup findfont dup length dict begin { 1 index /FID ne { def }{ pop pop } ifelse } forall /Encoding ISOLatin1Encoding def currentdict end definefont } def /ISOLatin1Encoding [ /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef 
/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /space/exclam/quotedbl/numbersign/dollar/percent/ampersand/quoteright /parenleft/parenright/asterisk/plus/comma/minus/period/slash /zero/one/two/three/four/five/six/seven/eight/nine/colon/semicolon /less/equal/greater/question/at/A/B/C/D/E/F/G/H/I/J/K/L/M/N /O/P/Q/R/S/T/U/V/W/X/Y/Z/bracketleft/backslash/bracketright /asciicircum/underscore/quoteleft/a/b/c/d/e/f/g/h/i/j/k/l/m /n/o/p/q/r/s/t/u/v/w/x/y/z/braceleft/bar/braceright/asciitilde /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/dotlessi/grave/acute/circumflex/tilde/macron/breve /dotaccent/dieresis/.notdef/ring/cedilla/.notdef/hungarumlaut /ogonek/caron/space/exclamdown/cent/sterling/currency/yen/brokenbar /section/dieresis/copyright/ordfeminine/guillemotleft/logicalnot /hyphen/registered/macron/degree/plusminus/twosuperior/threesuperior /acute/mu/paragraph/periodcentered/cedilla/onesuperior/ordmasculine /guillemotright/onequarter/onehalf/threequarters/questiondown /Agrave/Aacute/Acircumflex/Atilde/Adieresis/Aring/AE/Ccedilla /Egrave/Eacute/Ecircumflex/Edieresis/Igrave/Iacute/Icircumflex /Idieresis/Eth/Ntilde/Ograve/Oacute/Ocircumflex/Otilde/Odieresis /multiply/Oslash/Ugrave/Uacute/Ucircumflex/Udieresis/Yacute /Thorn/germandbls/agrave/aacute/acircumflex/atilde/adieresis /aring/ae/ccedilla/egrave/eacute/ecircumflex/edieresis/igrave /iacute/icircumflex/idieresis/eth/ntilde/ograve/oacute/ocircumflex /otilde/odieresis/divide/oslash/ugrave/uacute/ucircumflex/udieresis /yacute/thorn/ydieresis ] def /Times-Roman reencodeISO def /none null def /numGraphicParameters 17 def /stringLimit 65535 def /Begin { save numGraphicParameters dict begin } def /End { end restore } def /SetB { dup type /nulltype eq { pop false /brushRightArrow idef 
false /brushLeftArrow idef true /brushNone idef } { /brushDashOffset idef /brushDashArray idef 0 ne /brushRightArrow idef 0 ne /brushLeftArrow idef /brushWidth idef false /brushNone idef } ifelse } def /SetCFg { /fgblue idef /fggreen idef /fgred idef } def /SetCBg { /bgblue idef /bggreen idef /bgred idef } def /SetF { /printSize idef /printFont idef } def /SetP { dup type /nulltype eq { pop true /patternNone idef } { dup -1 eq { /patternGrayLevel idef /patternString idef } { /patternGrayLevel idef } ifelse false /patternNone idef } ifelse } def /BSpl { 0 begin storexyn newpath n 1 gt { 0 0 0 0 0 0 1 1 true subspline n 2 gt { 0 0 0 0 1 1 2 2 false subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 2 copy false subspline } if n 2 sub dup n 1 sub dup 2 copy 2 copy false subspline patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Circ { newpath 0 360 arc patternNone not { ifill } if brushNone not { istroke } if } def /CBSpl { 0 begin dup 2 gt { storexyn newpath n 1 sub dup 0 0 1 1 2 2 true subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 0 0 false subspline n 2 sub dup n 1 sub dup 0 0 1 1 false subspline patternNone not { ifill } if brushNone not { istroke } if } { Poly } ifelse end } dup 0 4 dict put def /Elli { 0 begin newpath 4 2 roll translate scale 0 0 1 0 360 arc patternNone not { ifill } if brushNone not { istroke } if end } dup 0 1 dict put def /Line { 0 begin 2 storexyn newpath x 0 get y 0 get moveto x 1 get y 1 get lineto brushNone not { istroke } if 0 0 1 1 leftarrow 0 0 1 1 rightarrow end } dup 0 4 dict put def /MLine { 0 begin storexyn newpath n 1 gt { x 0 get y 0 get moveto 1 1 n 1 sub { /i exch def x i get y i get lineto } for patternNone not 
brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Poly { 3 1 roll newpath moveto -1 add { lineto } repeat closepath patternNone not { ifill } if brushNone not { istroke } if } def /Rect { 0 begin /t exch def /r exch def /b exch def /l exch def newpath l b moveto l t lineto r t lineto r b lineto closepath patternNone not { ifill } if brushNone not { istroke } if end } dup 0 4 dict put def /Text { ishow } def /idef { dup where { pop pop pop } { exch def } ifelse } def /ifill { 0 begin gsave patternGrayLevel -1 ne { fgred bgred fgred sub patternGrayLevel mul add fggreen bggreen fggreen sub patternGrayLevel mul add fgblue bgblue fgblue sub patternGrayLevel mul add setrgbcolor eofill } { eoclip originalCTM setmatrix pathbbox /t exch def /r exch def /b exch def /l exch def /w r l sub ceiling cvi def /h t b sub ceiling cvi def /imageByteWidth w 8 div ceiling cvi def /imageHeight h def bgred bggreen bgblue setrgbcolor eofill fgred fggreen fgblue setrgbcolor w 0 gt h 0 gt and { l b translate w h scale w h true [w 0 0 h neg 0 h] { patternproc } imagemask } if } ifelse grestore end } dup 0 8 dict put def /istroke { gsave brushDashOffset -1 eq { [] 0 setdash 1 setgray } { brushDashArray brushDashOffset setdash fgred fggreen fgblue setrgbcolor } ifelse brushWidth setlinewidth originalCTM setmatrix stroke grestore } def /ishow { 0 begin gsave fgred fggreen fgblue setrgbcolor /fontDict printFont printSize scalefont dup setfont def /descender fontDict begin 0 [FontBBox] 1 get FontMatrix end transform exch pop def /vertoffset 1 printSize sub descender sub def { 0 vertoffset moveto show /vertoffset vertoffset printSize sub def } forall grestore end } dup 0 3 dict put def /patternproc { 0 begin /patternByteLength patternString length def /patternHeight patternByteLength 8 mul sqrt cvi def /patternWidth patternHeight def /patternByteWidth patternWidth 8 idiv def 
/imageByteMaxLength imageByteWidth imageHeight mul stringLimit patternByteWidth sub min def /imageMaxHeight imageByteMaxLength imageByteWidth idiv patternHeight idiv patternHeight mul patternHeight max def /imageHeight imageHeight imageMaxHeight sub store /imageString imageByteWidth imageMaxHeight mul patternByteWidth add string def 0 1 imageMaxHeight 1 sub { /y exch def /patternRow y patternByteWidth mul patternByteLength mod def /patternRowString patternString patternRow patternByteWidth getinterval def /imageRow y imageByteWidth mul def 0 patternByteWidth imageByteWidth 1 sub { /x exch def imageString imageRow x add patternRowString putinterval } for } for imageString end } dup 0 12 dict put def /min { dup 3 2 roll dup 4 3 roll lt { exch } if pop } def /max { dup 3 2 roll dup 4 3 roll gt { exch } if pop } def /midpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 x1 add 2 div y0 y1 add 2 div end } dup 0 4 dict put def /thirdpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 2 mul x1 add 3 div y0 2 mul y1 add 3 div end } dup 0 4 dict put def /subspline { 0 begin /movetoNeeded exch def y exch get /y3 exch def x exch get /x3 exch def y exch get /y2 exch def x exch get /x2 exch def y exch get /y1 exch def x exch get /x1 exch def y exch get /y0 exch def x exch get /x0 exch def x1 y1 x2 y2 thirdpoint /p1y exch def /p1x exch def x2 y2 x1 y1 thirdpoint /p2y exch def /p2x exch def x1 y1 x0 y0 thirdpoint p1x p1y midpoint /p0y exch def /p0x exch def x2 y2 x3 y3 thirdpoint p2x p2y midpoint /p3y exch def /p3x exch def movetoNeeded { p0x p0y moveto } if p1x p1y p2x p2y p3x p3y curveto end } dup 0 17 dict put def /storexyn { /n exch def /y n array def /x n array def n 1 sub -1 0 { /i exch def y i 3 2 roll put x i 3 2 roll put } for } def /SSten { fgred fggreen fgblue setrgbcolor dup true exch 1 0 0 -1 0 6 -1 roll matrix astore } def /FSten { dup 3 -1 roll dup 4 1 roll exch newpath 0 0 moveto dup 0 exch lineto exch dup 3 1 roll exch 
lineto 0 lineto closepath bgred bggreen bgblue setrgbcolor eofill SSten } def /Rast { exch dup 3 1 roll 1 0 0 -1 0 6 -1 roll matrix astore } def /arrowhead { 0 begin transform originalCTM itransform /taily exch def /tailx exch def transform originalCTM itransform /tipy exch def /tipx exch def /dy tipy taily sub def /dx tipx tailx sub def /angle dx 0 ne dy 0 ne or { dy dx atan } { 90 } ifelse def gsave originalCTM setmatrix tipx tipy translate angle rotate newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto patternNone not { originalCTM setmatrix /padtip arrowHeight 2 exp 0.25 arrowWidth 2 exp mul add sqrt brushWidth mul arrowWidth div def /padtail brushWidth 2 div def tipx tipy translate angle rotate padtip 0 translate arrowHeight padtip add padtail add arrowHeight div dup scale arrowheadpath ifill } if brushNone not { originalCTM setmatrix tipx tipy translate angle rotate arrowheadpath istroke } if grestore end } dup 0 9 dict put def /arrowheadpath { newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto } def /leftarrow { 0 begin y exch get /taily exch def x exch get /tailx exch def y exch get /tipy exch def x exch get /tipx exch def brushLeftArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def /rightarrow { 0 begin y exch get /tipy exch def x exch get /tipx exch def y exch get /taily exch def x exch get /tailx exch def brushRightArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def Begin [ 0.799705 0 0 0.799705 0 0 ] concat /originalCTM matrix currentmatrix def Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 433.125 504.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 433.125 552.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 
143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 265.125 504.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 265.125 552.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 137.125 504.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 137.125 552.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 321.125 600.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 489.125 600.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 193.125 600.625 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I Elli 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.5 -0 -0 0.5 144 410 ] concat 453 529 448 32 Elli End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0.75 SetP [ 0.125 -0 -0 0.125 486.625 598.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl none SetB %I b n 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.084375 -0 -0 0.084375 505.272 606.534 ] 
concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0.75 SetP [ 0.125 -0 -0 0.125 318.625 598.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl none SetB %I b n 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.084375 -0 -0 0.084375 337.272 606.534 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0.75 SetP [ 0.125 -0 -0 0.125 190.625 598.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl none SetB %I b n 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.084375 -0 -0 0.084375 209.272 606.534 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0.75 SetP [ 0.125 -0 -0 0.125 134.625 502.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl none SetB %I b n 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.084375 -0 -0 0.084375 153.272 510.534 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.125 -0 -0 0.125 262.625 502.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.125 -0 -0 0.125 430.625 502.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I Elli 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 362.875 611.625 ] concat 617 99 16 16 Elli End Begin %I Elli 1 
0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 378.875 611.625 ] concat 617 99 16 16 Elli End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.5 -0 -0 0.5 133.5 439.5 ] concat 117 369 181 369 Line End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.5 -0 -0 0.5 261.5 439.5 ] concat 117 369 181 369 Line End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.5 -0 -0 0.5 429.5 439.5 ] concat 117 369 181 369 Line End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 179.5 532 ] concat [ (Procr) ] Text End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 307.5 532 ] concat [ (Procr) ] Text End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 475.5 532 ] concat [ (Procr) ] Text End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 235.5 628 ] concat [ (Mem) ] Text End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 363.5 628 ] concat [ (Mem) ] Text End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 531.5 628 ] concat [ (Mem) ] Text End Begin %I Elli 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.125 -0 -0 0.125 346.875 611.625 ] concat 617 99 16 16 Elli End Begin %I BSpl 1 0 1 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.5 -0 -0 0.5 19.5 393.5 ] concat 457 333 473 381 441 365 457 413 4 BSpl End Begin %I BSpl 1 0 1 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.5 -0 -0 0.5 147.5 393.5 ] concat 457 333 473 381 441 365 457 413 4 BSpl End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 360 555 ] concat [ (Remote) (memory) ] Text End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 232 555 ] concat [ (Local) (memory) ] Text End Begin %I BSpl 1 0 1 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 0.5 -0 -0 0.5 315.5 393.5 ] concat 457 333 473 381 441 365 457 413 4 BSpl End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 528 555 ] concat [ (Remote) (memory) ] Text End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0.75 
SetP [ 0.125 -0 -0 0.125 134.625 550.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl none SetB %I b n 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.084375 -0 -0 0.084375 153.272 558.534 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.125 -0 -0 0.125 262.625 550.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I CBSpl 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.125 -0 -0 0.125 430.625 550.125 ] concat 267 271 267 335 331 335 587 335 651 335 651 271 651 143 651 79 587 79 331 79 267 79 267 143 12 CBSpl End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 177 580 ] concat [ (Cache) ] Text End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 305 580 ] concat [ (Cache) ] Text End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 473 580 ] concat [ (Cache) ] Text End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 60 364 ] concat 132 180 132 196 Line End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 188 364 ] concat 132 180 132 196 Line End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 356 364 ] concat 132 180 132 196 Line End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 49 237 ] concat 143 355 143 435 Line End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 49 237 ] concat 439 355 439 427 Line End Begin %I Line 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 49 237 ] concat 271 355 271 427 Line End Begin %I Elli 1 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 1 SetP [ 0.5 -0 -0 0.5 141.5 407.5 ] concat 453 529 448 32 Elli End Begin %I Text 0 0 0 SetCFg Times-Roman 12 SetF [ 1 0 0 1 306.5 676 ] concat [ 
(Interconnection Network) ] Text End End %I eop showpage end %%EndDocument @endspecial 295 598 a Fo(Figure)11 b(1:)18 b(Scalable)12 b(shared-memory)f (multiprocessor)h(architecture.)4 719 y(partitioning)g(data)i(and)g (computations)f(in)g(a)h(way)f(that)h(minimizes)f(interprocessor)g (communications.)4 773 y(On)f(SSMMs,)h(processors)f(communicate)f(through)g (shared)g(memory)m(,)h(and)f(the)h(cost)g(of)f(interprocessor)4 827 y(communications)h(\(i.e.,)i(remote)e(memory)f(access\))j(is)f (relatively)e(inexpensive.)19 b(W)l(e)13 b(show)g(that)f(cache)4 881 y(af)o(\256nity)m(,)i(memory)f(contention)h(and)g(false)g(sharing)g(are)g (additional)g(factors)g(that)f(must)i(be)f(considered)4 935 y(in)i(the)g(selection)g(of)f(data)h(distributions.)28 b(Furthermore,)16 b(the)g(presence)g(of)f(a)h(single)g(shared)g(address)4 989 y(space)i(allows)g(\257exibility)f(in)g(the)h(selection)g(of)f(a)h (computation)e(partition.)33 b(Speci\256cally)m(,)19 b(we)f(show)4 1044 y(that)h(relaxing)g(the)h(commonly)e(used)i(owner)o(-computes)f(rule)g ([15)o(])h(has)g(performance)e(advantages.)4 1098 y(W)l(e)d(present)g (experimental)f(results)i(to)e(support)h(our)f(conclusions)i(using)f(three)f (applications)h(on)g(two)4 1152 y(SSMMs,)e(the)g(Hector)f(and)g(the)g(KSR1)h (multiprocessors.)77 1228 y(The)g(remainder)f(of)g(this)g(paper)g(is)h(or)o (ganized)f(as)h(follows.)18 b(Section)12 b(2)h(presents)f(an)h(overview)f (data)4 1282 y(distributions.)35 b(Section)18 b(3)g(describes)h(the)f (factors)f(that)h(impact)g(on)g(the)h(selection)f(of)g(computation)4 1336 y(and)i(data)g(partitions)f(on)g(SSMMs.)41 b(Section)19 b(4)h(gives)g(experimental)f(evidence)h(of)f(the)h(impact)f(of)4 1390 y(cache)e(af)o(\256nity)e(and)h(false)g(sharing)g(on)g(the)g(choice)h (of)e(data)h(partitions.)29 b(Section)16 b(5)g(presents)h(results)4 1444 y(to)d(show)g(that)g(the)g(\257exibility)f(in)h(selecting)h(the)f (computation)f(partitioning)g(can)h(be)g(used)h(to)f(improve)4 1498 y(performance.)j(Section)9 
b(6)i(reviews)f(related)g(work.)17 b(Finally)m(,)11 b(Section)e(7)i(presents)f(concluding)g(remarks)4 1553 y(and)j(directions)e(for)h(future)f(work.)4 1734 y Fn(2)71 b(Data)19 b(Distributions)4 1861 y Fo(Data)10 b(distribution)f([15)o(,)i(16]) e(is)i(achieved)f(by)g(specifying)f(a)h(partitioning)f(scheme)h(for)f(each)i (array)e(in)h(the)4 1915 y(program)h(and)i(by)f(specifying)g(a)g(processor)h (geometry)e(to)i(which)f(array)g(partitions)f(map.)18 b(A)13 b(processor)4 1970 y(geometry)g(is)i(an)f Fj(n)p Fo(-dimensional)f(Cartesian) h(grid)f(of)h(virtual)f(processors)h Fi(\()p Fj(V)1385 1977 y Fm(0)1404 1970 y Fj(;)8 b(V)1454 1977 y Fm(1)1473 1970 y Fj(;)g Fh(\001)g(\001)g(\001)g Fj(;)g(V)1611 1977 y Fg(n)p Ff(\000)p Fm(1)1679 1970 y Fi(\))p Fo(,)15 b(where)4 2024 y Fj(V)32 2031 y Fg(i)63 2024 y Fo(is)i(the)f(number)g(of)g(processors)h(in)f (the)g Fj(i)793 2006 y Fg(th)844 2024 y Fo(dimension)g(of)g(the)h(grid,)g (and)f Fj(V)1430 2031 y Fm(0)1463 2024 y Fh(\002)d Fj(V)1543 2031 y Fm(1)1575 2024 y Fh(\002)g(\001)8 b(\001)g(\001)14 b(\002)f Fj(V)1779 2031 y Fg(n)p Ff(\000)p Fm(1)4 2078 y Fo(=)i Fj(P)7 b Fo(,)16 b(the)f(total)f(number)h(of)f(processors.)26 b(A)15 b(partitioning)e(scheme)i(assigns)h(a)f Fe(partitioning)f(attribute)4 2132 y Fo(to)k(each)g(dimension)g(the)g(array)m(.)34 b(There)18 b(are)g(four)f(partitioning)f(attributes.)35 b(The)18 b Fd(Block)g Fo(attribute)4 2186 y(divides)f(the)g(corresponding)g(dimension)g(of)f(the)h (array)g(in)g(approximately)f(equal)h(size)h(blocks)f(such)4 2240 y(that)j(a)g(processor)g(owns)h(a)f(contiguous)g(range)g(of)f(that)h (dimension)g(of)g(the)g(array)m(.)41 b(The)20 b Fd(Cyclic)4 2295 y Fo(attribute)11 b(divides)h(the)h(corresponding)e(array)g(dimension)h (by)g(distributing)f(the)h(array)f(elements)i(in)f(this)4 2349 y(dimension)g(to)g(processors)h(in)f(a)h(round-robin)d(fashion.)18 b(The)13 b Fd(BlockCyclic)f Fo(attribute)f(\256rst)h(groups)4 2403 y(array)f(elements)h(in)f(the)g(corresponding)g(dimension)g(in)g 
(contiguous)g(blocks)h(of)f(a)h(given)f(size,)h(and)g(then)4 2457 y(assigns)k(the)f(blocks)f(to)h(processors)g(in)g(a)g(round-robin)d (fashion.)26 b(The)15 b(block)f(size,)j(called)d(the)h Fe(block-)4 2511 y(cyclic)10 b(factor)p Fo(,)h(is)e(supplied)h(by)f(the)h(programmer)m(.) 16 b(Finally)m(,)9 b(the)h Fd(*)f Fo(attribute)g(is)h(used)g(to)f(indicate)g (that)h(the)4 2565 y(corresponding)f(dimension)g(of)g(the)h(array)f(is)h(not) f(distributed.)17 b(The)10 b(processor)g(geometry)f(on)g(which)h(the)4 2620 y(array)h(is)i(mapped)e(determines)h(the)g(number)f(of)g(processors)i (assigned)f(to)g(each)g(distributed)f(dimension)4 2674 y(of)h(the)f(array)m (.)18 b(For)11 b(example,)h(distributing)e(a)i(two)g(dimensional)f(array)h (using)f(the)h Fd(\(Block,Block\))4 2728 y Fo(attributes)g(onto)h(a)g(two)f (dimensional)h(processor)f(geometry)g(of)h(\(2,4\),)f(distributes)h(the)f (array)h(on)f(to)h(the)4 2782 y(8)k(processors,)i(assigning)f(2)f(processors) g(to)g(the)g(\256rst)g(dimension)g(and)g(4)g(processors)g(to)g(the)g(second)4 2836 y(dimension.)p eop %%Page: 3 3 3 2 bop 4 -21 a Fn(3)71 b(Performance)21 b(Factors)4 106 y Fo(The)15 b(main)g(factor)f(that)g(af)o(fects)h(the)f(performance)g(of)g(a)h (parallel)f(application)g(on)h(a)g(DMM)g(is)g(the)g(rel-)4 160 y(atively)i(high)f(cost)h(of)g(interprocessor)f(communication.)30 b(For)17 b(example,)h(the)f(latency)f(for)g(a)h(remote)4 215 y(memory)e(access)i(on)e(the)h(CM5)g(multiprocessor)f(is)h(approximately)e (2560)h(processor)h(cycles)1699 196 y Fm(2)1718 215 y Fo(.)28 b(This)4 269 y(necessitates)16 b(the)e(selection)h(of)f(computation)f(and)h (data)h(partitions)f(that)g(minimize)f(the)i(cost)f(of)g(com-)4 323 y(munication.)27 b(In)15 b(contrast,)h(on)f(SSMMs,)i(processors)f (communicate)e(through)h(shared)g(memory)g(and)4 377 y(the)j(cost)h(of)f (remote)f(memory)h(access)h(is)g(relatively)e(small.)36 b(For)17 b(example,)j(the)f(cost)f(of)g(a)g(remote)4 431 y(read)11 b(operation)g(on)g 
(the)h(KSR1)f(is)h(approximately)e(170)h(processor)g(cycles)h([24].)17 b(Consequently)m(,)12 b(other)4 485 y(factors)h(come)f(into)h(play)g(in)f (the)h(selection)g(of)f(computation)g(and)h(data)g(partitions.)19 b(In)13 b(this)g(section)g(we)4 540 y(elaborate)h(on)g(these)g(factors)g(and) g(on)g(how)g(they)g(af)o(fect)g(performance,)f(and)h(consequently)m(,)h(af)o (fect)f(the)4 594 y(choice)f(of)f(data)g(and)g(computation)g(partitions.)4 755 y Fc(3.1)58 b(Cache)14 b(Af\256nity)4 853 y Fo(Caches)j(are)e(used)h(in)f (SSMMs)h(to)g(reduce)f(ef)o(fective)g(memory)f(access)j(time)e(and)h(reduce)f (contention)4 907 y(in)e(the)h(interconnection)e(network.)21 b(Data)14 b(is)g(transferred)e(between)i(cache)g(and)f(memory)g(in)g(units)g (of)h(a)4 961 y Fe(cache)g(line)p Fo(,)h(typically)e(a)h(multiple)f(of)g(the) h(processor)g(word)f(size.)24 b Fe(Spatial)13 b(r)n(euse)i Fo(occurs)e(when)h(other)4 1015 y(words)h(on)g(the)g(same)g(line)g(are)g (used)g(by)g(the)g(processor)g(before)f(the)h(line)g(is)g(\257ushed)g(from)f (the)h(cache.)4 1070 y(Analogously)m(,)g Fe(temporal)f(r)n(euse)i Fo(occurs)e(when)g(data)h(on)f(a)g(cache)h(line)f(is)h(used)g(again)f(before) g(the)g(line)4 1124 y(is)i(evicted)g(from)e(the)i(cache.)29 b(The)16 b(performance)e(of)i(an)f(application)h(depends)g(to)f(a)h(lar)o(ge) f(extent)h(on)4 1178 y(the)g(ability)g(of)g(the)g(caches)h(to)f(exploit)g (spatial)h(and)f(temporal)f(reuse.)31 b(In)16 b(some)g(cases,)j(this)d(may)h (be)4 1232 y(dif)o(\256cult)9 b(because)i(of)f(the)g(limited)f(capacity)h (and)g(associativity)g(of)g(caches.)18 b(Data)10 b(brought)f(into)h(a)g (cache)4 1286 y(by)16 b(a)g(reference)f(or)h(a)g(prefetch)f(may)h(be)g (evicted)f(before)h(being)f(used)h(or)g(reused,)h(because)g(of)e(either)4 1340 y(a)i(capacity)g(or)g(a)g(con\257ict)f(miss)i(caused)f(by)g(a)g (subsequent)h(reference.)31 b(Cache)18 b(misses)f(on)g(SSMMs)4 1395 y(adversely)f(af)o(fect)g(performance,)g(since)h(evicted)f(data)g(must)g 
(be)g(retrieved)f(from)g(its)i(home)e(memory)m(,)4 1449 y(which)k(may)g(be)g (remote)f(to)h(the)f(processor)m(.)38 b(Caches)20 b(play)f(less)h(of)e(an)h (important)f(role)g(in)h(DMMs)4 1503 y(because)g(cache)f(misses)h(result)e (exclusively)h(in)f(local)h(memory)f(accesses,)k(which)d(are)g(inexpensive)4 1557 y(relative)12 b(to)g(interprocessor)g(communications.)4 1718 y Fc(3.2)58 b(False)14 b(Sharing)4 1816 y Fo(In)g(SSMMs)h(data)f(on)h (the)f(same)h(cache)g(line)f(may)g(be)h(shared)f(by)h(more)e(than)i(one)f (processor)n(,)h(and)g(the)4 1870 y(line)j(may)g(exit)g(in)g(more)g(than)g (one)g(processor)r(')m(s)g(cache)h(at)f(the)g(same)h(time.)35 b(Hardware)18 b(is)g(used)h(to)4 1925 y(maintain)13 b(the)f(consistency)i(of) e(the)h(multiple)g(copies)g(of)f(the)h(line,)h(typically)e(using)h(a)g (write-invalidate)4 1979 y(protocol)e([24,)h(14].)18 b Fe(T)m(rue)12 b(sharing)g Fo(occurs)g(when)g(two)g(or)f(more)g(processors)i(access)g(the)f (same)g(data)g(on)4 2033 y(a)k(cache)f(line,)i(and)e(it)g(re\257ects)g (necessary)h(data)f(communications)g(in)g(an)g(application.)27 b(On)15 b(the)g(other)4 2087 y(hand,)h Fe(false)e(sharing)h Fo(occurs)f(when)h(two)f(processors)h(access)h(dif)o(ferent)d(pieces)i(of)f (data)h(on)f(the)g(same)4 2141 y(cache)e(line.)18 b(If)11 b(processors)h (write)g(to)f(the)h(same)g(cache)g(line,)g(the)g(cache)g(consistency)h (hardware)e(causes)4 2195 y(the)j(cache)g(line)g(to)g(be)g(transferred)f (back)h(and)g(forth)f(between)h(processors)g(leading)g(to)g(a)g (\252ping-pong\272)4 2250 y(ef)o(fect)h([8)o(].)27 b(False)16 b(sharing)f(causes)h(extensive)g(invalidation)e(traf)o(\256c)g(and)i(can)f (considerably)g(degrade)4 2304 y(performance.)i(False)c(sharing)f(is)h (non-existent)e(on)i(DMMs.)4 2465 y Fc(3.3)58 b(Memory)14 b(Contention)4 2563 y Fo(Memory)i(contention)g(occurs)g(when)g(many)g(processors)h(access)h (data)e(in)g(a)g(single)h(memory)e(module)4 2617 y(at)j(the)g(same)h(time.)35 b(Since)18 
the communication protocol in SSMMs is receiver-initiated, and transfers data in units of relatively small cache lines, a large number of requests to the same memory can overflow memory buffers and cause excessive delays in memory response time [13]. Contention has been considered less of a performance bottleneck on DMMs because a sender-initiated communication protocol is employed, and because programmers typically communicate data in large infrequent messages. Applications on DMMs also use collective communications [15] that further reduce contention.

Footnote 2: Calculated based on the elapsed time for a send-reply message of 128 bytes [19].

3.4 Overhead of Parallelism

On DMMs, synchronization is achieved through data communication. On SSMMs, however, synchronization is explicit and is independent of data communication. The resulting overhead can become a performance bottleneck [27], and must be minimized. The performance of an application is also affected by the overhead involved in parallelizing loops, manifested in the form of computation partitioning tests [25]. These tests can be eliminated in some cases by compiler analysis, but when they cannot be eliminated, they can degrade performance. This overhead, though also present in the case of DMMs, is not considered significant there because of the predominantly high cost of remote memory access.

4 Impact on Data Distribution

In this section we use two applications, Multigrid and Tred2, to illustrate the impact of cache affinity and false sharing on the choice of a data distribution. The KSR1 system is used because of its large cache size, and because of the presence of monitoring hardware that enables the measurement of the number of non-local memory accesses and the number of cache misses for a processor.

The KSR1 is a Cache Only Memory Architecture (COMA) configured as a hierarchy of slotted rings with processing cells on the leaf-level rings. The local portion of shared memory associated with a processor is organized as a cache. There is no home location for data; rather, data may exist in more than one local memory. The hardware maintains the consistency of possible multiple copies of the data.

The KSR1 implicitly implements the owner-computes rule, since data written by a processor must reside exclusively in the processor's local portion of the shared memory. Hardware automatically migrates data to the processor that requests the data in units of subpages. Hence, the computation partitioning of a loop dictates the residence of a data item and hence the distribution of the arrays in the loop. Data which is read by the processors may exist in multiple local memories, and read requests to this data from different processors may be satisfied from different portions of the shared memory.

4.1 Cache-Conscious Data Distribution

The Multigrid application from the NAS suite of benchmarks illustrates how data distributions must be cache-conscious. Multigrid is a three dimensional solver calculating the potential field on a cubical grid. We focus on the subroutine psinv which uses two 3-dimensional arrays U and R. The subroutine mainly performs the following computation inside a triply nested loop: $U(i,j,k) \mathrel{+}= \alpha(R(f(i), g(j), h(k)))$, where $f(i) = i-1$, $i$ or $i+1$, as are the functions $g$ and $h$. The loop nest is fully parallel. The application has nearest neighbor communications along all three dimensions, which is typical of many scientific applications.

In this application, we choose not to parallelize the innermost loop to avoid cache line false sharing and cache interference; successive iterations of this loop access successive elements on the same cache line. Hence we use a two dimensional grid for the processor geometry. Since the application has
nearest neighbor communications, Block distribution performs the best. The restriction of the innermost loop to be sequential requires the arrays to be distributed with (*,Block,Block) since the arrays are assumed to be stored using column major ordering. With 16 processors, it is possible to choose one of the (16,1), (8,2), (4,4), (2,8) and (1,16) processor geometries. The choice of the processor geometry affects the number of processors that execute each parallel loop. For example, a processor geometry of (8,2) implies 8 processors assigned to the inner parallel loop and 2 processors assigned to the outer parallel loop.

Figure 2: Normalized Execution time of Multigrid.
[Plot: normalized execution time (w.r.t. (16,1)) versus processor geometry on 16 processors, for data sizes 160x160x160, 144x144x144 and 64x64x64.]

Figure 2 shows the execution time of the application for various processor geometries with the (*,Block,Block) distribution for the arrays on the KSR1 with 16 processors, normalized with respect to the (16,1) processor geometry. For a small data size (64x64x64), execution time is minimized by a distribution with an equal number of processors in each dimension, i.e., (4,4). This is the same distribution scheme suggested in the Syracuse High Performance Fortran applications suite [footnote 3] for DMMs. However, when the data size is large, the processor geometry (4,4) no longer performs the best. The execution time is minimized with a processor geometry of (8,2).

The impact of processor geometry on performance is due to cache affinity, as can be deduced from Figures 3 and 4. Figure 3 shows the measured number of cache lines accessed from remote memory modules, normalized with respect to the processor geometry (16,1). The number of remote memory accesses is minimal when the processor geometry is (4,4) for all data sizes. Figure 4 shows the average measured number of cache misses from a processor cache, again normalized with respect to the processor geometry (16,1). When the data size is small (64x64x64), the data used by a processor fits into the 256k processor cache and the misses from the cache in this case reflect remote memory accesses that occur in the parallel program. Hence, the predominant factor affecting performance is interprocessor communication, and the best performance is attained using the (4,4) geometry.

However, when the arrays are relatively large (144x144x144), the cache capacity is no longer sufficient to hold data from successive iterations of the outer parallel loop, and the number of cache misses increases. When the number of processors assigned to the outer loop increases, the number of misses from the cache also increases. The (4,4) processor geometry minimizes the amount of remote memory access, but the (16,1) processor geometry minimizes the number of cache misses. The distribution with (8,2) processor geometry strikes a balance between the cost of remote memory access and the cost of cache misses, resulting in the best overall performance, in spite of higher interprocessor communication cost.

4.2 False Sharing Conscious Data Distribution

The programs Tred2 (which is part of Eispack), mdg, and trfd (which are both part of the Perfect Club Benchmark Suite) exhibit parallelism that results in considerable false sharing.

Footnote 3: http://www.npac.syr.edu/hpfa/ .

Figure 3. Remote Memory Access.
[Plot: normalized subpage misses (w.r.t. (16,1)) versus processor geometry on 16 processors, for data sizes 160x160x160, 144x144x144 and 64x64x64.]
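The processor geometries discussed in Section 4.1 can be made concrete with a small sketch. The helper names below (`geometries`, `block_shape`, `boundary_elements`) are hypothetical and not from the paper, and the boundary count is only a rough proxy for nearest-neighbor communication. It assumes that the first geometry entry partitions the second array dimension, that the second entry partitions the third, and that every partitioned dimension exposes two block faces to neighboring processors.

```python
import math


def geometries(p):
    """Return every ordered 2-D geometry (inner, outer) with inner * outer == p."""
    return [(a, p // a) for a in range(p, 0, -1) if p % a == 0]


def block_shape(n, geom):
    """Per-processor block of an n x n x n array under (*, Block, Block).

    Assumption: the first dimension is never partitioned, the first geometry
    entry splits the second dimension and the second entry splits the third.
    """
    inner, outer = geom
    return (n, math.ceil(n / inner), math.ceil(n / outer))


def boundary_elements(n, geom):
    """Upper bound on block-face elements a processor shares with neighbours.

    A partitioned dimension contributes two faces of the block; an
    unpartitioned dimension contributes none.
    """
    inner, outer = geom
    _, bj, bk = block_shape(n, geom)
    faces_j = 2 * n * bk if inner > 1 else 0  # faces across dimension 2
    faces_k = 2 * n * bj if outer > 1 else 0  # faces across dimension 3
    return faces_j + faces_k


if __name__ == "__main__":
    n = 64
    for geom in geometries(16):
        print(geom, block_shape(n, geom), boundary_elements(n, geom))
```

Under these assumptions the (4,4) geometry yields the smallest boundary, which is in line with the remote memory access measurements reported for Figure 3. The sketch does not model cache capacity, which is the effect the text credits for (8,2) performing best on the larger arrays.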
Figure 4. Cache Misses.
[Plot: normalized cache misses (w.r.t. (16,1)) versus processor geometry on 16 processors, for data sizes 160x160x160, 144x144x144 and 64x64x64.]

Figure 5. Effect of False Sharing.
[Plot: execution time (microseconds) versus number of processors, for the "Cyclic" and "BlockCyclic" distributions.]

Figure 6. Cache Misses.
[Plot: cache misses versus number of processors, for the "Cyclic" and "BlockCyclic" distributions.]

These programs have triangular iteration spaces which necessitate cyclical distribution for load balancing. The choice of this distribution, combined with the storage order of the arrays, causes more than one processor to share the same cache line, leading to false sharing. The impact of this false sharing is shown in Figure 5 for the Tred2 application on the KSR1 multiprocessor. The figure shows the execution time of the application for Cyclic and BlockCyclic distributions using 1 to 16 processors. The use of the Cyclic distribution results in a large number of cache misses, as can be seen in Figure 6. The resulting overhead causes execution time to increase as the number of processors increases. The arrays are distributed using a BlockCyclic distribution, where the size of the block is equal to the size of the cache line, which effectively eliminates false sharing. When the number of processors
is small, the load is relatively well-balanced, and the elimination of false sharing improves performance. However, as the number of processors increases, the load becomes increasingly imbalanced, and the negative impact of this load imbalance begins to outweigh the benefits of eliminating false sharing. A compiler for SSMMs must consider this tradeoff between load imbalance and false sharing when determining data distributions.

5 Impact on Computation Partitioning

The owner-computes rule has been the computation partitioner of choice for compiling HPF-type languages on DMMs [16]. The owner-computes rule maps a statement such that the computation is executed on the processor on which the data element that is written is local. All the data elements that are required to compute the result (which may be remote) are communicated to the processor. A strict rule such as owner-computes is not necessary on an SSMM because message passing code is not generated at compile time [3]. In some

eop %%Page: 7 7 7 6 bop 482 311 a @beginspecial 127 @llx 520 @lly 393 @urx 632 @ury 2160 @rwi @setspecial %%BeginDocument: adi.idraw /arrowhead { 0 begin transform originalCTM itransform /taily exch def /tailx exch def transform originalCTM itransform /tipy exch def /tipx exch def /dy tipy taily sub def /dx tipx tailx sub def /angle dx 0 ne dy 0 ne or { dy dx atan } { 90 } ifelse def gsave originalCTM setmatrix tipx tipy translate angle rotate newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg
arrowWidth 2 div neg lineto patternNone not { originalCTM setmatrix /padtip arrowHeight 2 exp 0.25 arrowWidth 2 exp mul add sqrt brushWidth mul arrowWidth div def /padtail brushWidth 2 div def tipx tipy translate angle rotate padtip 0 translate arrowHeight padtip add padtail add arrowHeight div dup scale arrowheadpath ifill } if brushNone not { originalCTM setmatrix tipx tipy translate angle rotate arrowheadpath istroke } if grestore end } dup 0 9 dict put def /arrowheadpath { newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto } def /leftarrow { 0 begin y exch get /taily exch def x exch get /tailx exch def y exch get /tipy exch def x exch get /tipx exch def brushLeftArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def /rightarrow { 0 begin y exch get /tipy exch def x exch get /tipx exch def y exch get /taily exch def x exch get /tailx exch def brushRightArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def /arrowHeight 10 def /arrowWidth 5 def /IdrawDict 51 dict def IdrawDict begin /reencodeISO { dup dup findfont dup length dict begin { 1 index /FID ne { def }{ pop pop } ifelse } forall /Encoding ISOLatin1Encoding def currentdict end definefont } def /ISOLatin1Encoding [ /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /space/exclam/quotedbl/numbersign/dollar/percent/ampersand/quoteright /parenleft/parenright/asterisk/plus/comma/minus/period/slash /zero/one/two/three/four/five/six/seven/eight/nine/colon/semicolon /less/equal/greater/question/at/A/B/C/D/E/F/G/H/I/J/K/L/M/N /O/P/Q/R/S/T/U/V/W/X/Y/Z/bracketleft/backslash/bracketright /asciicircum/underscore/quoteleft/a/b/c/d/e/f/g/h/i/j/k/l/m /n/o/p/q/r/s/t/u/v/w/x/y/z/braceleft/bar/braceright/asciitilde 
/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/dotlessi/grave/acute/circumflex/tilde/macron/breve /dotaccent/dieresis/.notdef/ring/cedilla/.notdef/hungarumlaut /ogonek/caron/space/exclamdown/cent/sterling/currency/yen/brokenbar /section/dieresis/copyright/ordfeminine/guillemotleft/logicalnot /hyphen/registered/macron/degree/plusminus/twosuperior/threesuperior /acute/mu/paragraph/periodcentered/cedilla/onesuperior/ordmasculine /guillemotright/onequarter/onehalf/threequarters/questiondown /Agrave/Aacute/Acircumflex/Atilde/Adieresis/Aring/AE/Ccedilla /Egrave/Eacute/Ecircumflex/Edieresis/Igrave/Iacute/Icircumflex /Idieresis/Eth/Ntilde/Ograve/Oacute/Ocircumflex/Otilde/Odieresis /multiply/Oslash/Ugrave/Uacute/Ucircumflex/Udieresis/Yacute /Thorn/germandbls/agrave/aacute/acircumflex/atilde/adieresis /aring/ae/ccedilla/egrave/eacute/ecircumflex/edieresis/igrave /iacute/icircumflex/idieresis/eth/ntilde/ograve/oacute/ocircumflex /otilde/odieresis/divide/oslash/ugrave/uacute/ucircumflex/udieresis /yacute/thorn/ydieresis ] def /Helvetica reencodeISO def /none null def /numGraphicParameters 17 def /stringLimit 65535 def /Begin { save numGraphicParameters dict begin } def /End { end restore } def /SetB { dup type /nulltype eq { pop false /brushRightArrow idef false /brushLeftArrow idef true /brushNone idef } { /brushDashOffset idef /brushDashArray idef 0 ne /brushRightArrow idef 0 ne /brushLeftArrow idef /brushWidth idef false /brushNone idef } ifelse } def /SetCFg { /fgblue idef /fggreen idef /fgred idef } def /SetCBg { /bgblue idef /bggreen idef /bgred idef } def /SetF { /printSize idef /printFont idef } def /SetP { dup type /nulltype eq { pop true /patternNone idef } { dup -1 eq { /patternGrayLevel idef /patternString idef } { /patternGrayLevel idef } ifelse false /patternNone idef } ifelse } def /BSpl { 0 begin storexyn newpath n 1 gt { 0 0 0 0 0 0 1 1 true subspline n 2 gt { 0 0 0 0 1 
1 2 2 false subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 2 copy false subspline } if n 2 sub dup n 1 sub dup 2 copy 2 copy false subspline patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Circ { newpath 0 360 arc patternNone not { ifill } if brushNone not { istroke } if } def /CBSpl { 0 begin dup 2 gt { storexyn newpath n 1 sub dup 0 0 1 1 2 2 true subspline 1 1 n 3 sub { /i exch def i 1 sub dup i dup i 1 add dup i 2 add dup false subspline } for n 3 sub dup n 2 sub dup n 1 sub dup 0 0 false subspline n 2 sub dup n 1 sub dup 0 0 1 1 false subspline patternNone not { ifill } if brushNone not { istroke } if } { Poly } ifelse end } dup 0 4 dict put def /Elli { 0 begin newpath 4 2 roll translate scale 0 0 1 0 360 arc patternNone not { ifill } if brushNone not { istroke } if end } dup 0 1 dict put def /Line { 0 begin 2 storexyn newpath x 0 get y 0 get moveto x 1 get y 1 get lineto brushNone not { istroke } if 0 0 1 1 leftarrow 0 0 1 1 rightarrow end } dup 0 4 dict put def /MLine { 0 begin storexyn newpath n 1 gt { x 0 get y 0 get moveto 1 1 n 1 sub { /i exch def x i get y i get lineto } for patternNone not brushLeftArrow not brushRightArrow not and and { ifill } if brushNone not { istroke } if 0 0 1 1 leftarrow n 2 sub dup n 1 sub dup rightarrow } if end } dup 0 4 dict put def /Poly { 3 1 roll newpath moveto -1 add { lineto } repeat closepath patternNone not { ifill } if brushNone not { istroke } if } def /Rect { 0 begin /t exch def /r exch def /b exch def /l exch def newpath l b moveto l t lineto r t lineto r b lineto closepath patternNone not { ifill } if brushNone not { istroke } if end } dup 0 4 dict put def /Text { ishow } def /idef { dup where { pop pop pop } { exch def } ifelse } def /ifill { 0 begin gsave patternGrayLevel -1 ne { 
fgred bgred fgred sub patternGrayLevel mul add fggreen bggreen fggreen sub patternGrayLevel mul add fgblue bgblue fgblue sub patternGrayLevel mul add setrgbcolor eofill } { eoclip originalCTM setmatrix pathbbox /t exch def /r exch def /b exch def /l exch def /w r l sub ceiling cvi def /h t b sub ceiling cvi def /imageByteWidth w 8 div ceiling cvi def /imageHeight h def bgred bggreen bgblue setrgbcolor eofill fgred fggreen fgblue setrgbcolor w 0 gt h 0 gt and { l w add b translate w neg h scale w h true [w 0 0 h neg 0 h] { patternproc } imagemask } if } ifelse grestore end } dup 0 8 dict put def /istroke { gsave brushDashOffset -1 eq { [] 0 setdash 1 setgray } { brushDashArray brushDashOffset setdash fgred fggreen fgblue setrgbcolor } ifelse brushWidth setlinewidth originalCTM setmatrix stroke grestore } def /ishow { 0 begin gsave fgred fggreen fgblue setrgbcolor /fontDict printFont printSize scalefont dup setfont def /descender fontDict begin 0 [FontBBox] 1 get FontMatrix end transform exch pop def /vertoffset 1 printSize sub descender sub def { 0 vertoffset moveto show /vertoffset vertoffset printSize sub def } forall grestore end } dup 0 3 dict put def /patternproc { 0 begin /patternByteLength patternString length def /patternHeight patternByteLength 8 mul sqrt cvi def /patternWidth patternHeight def /patternByteWidth patternWidth 8 idiv def /imageByteMaxLength imageByteWidth imageHeight mul stringLimit patternByteWidth sub min def /imageMaxHeight imageByteMaxLength imageByteWidth idiv patternHeight idiv patternHeight mul patternHeight max def /imageHeight imageHeight imageMaxHeight sub store /imageString imageByteWidth imageMaxHeight mul patternByteWidth add string def 0 1 imageMaxHeight 1 sub { /y exch def /patternRow y patternByteWidth mul patternByteLength mod def /patternRowString patternString patternRow patternByteWidth getinterval def /imageRow y imageByteWidth mul def 0 patternByteWidth imageByteWidth 1 sub { /x exch def imageString imageRow x add 
patternRowString putinterval } for } for imageString end } dup 0 12 dict put def /min { dup 3 2 roll dup 4 3 roll lt { exch } if pop } def /max { dup 3 2 roll dup 4 3 roll gt { exch } if pop } def /midpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 x1 add 2 div y0 y1 add 2 div end } dup 0 4 dict put def /thirdpoint { 0 begin /y1 exch def /x1 exch def /y0 exch def /x0 exch def x0 2 mul x1 add 3 div y0 2 mul y1 add 3 div end } dup 0 4 dict put def /subspline { 0 begin /movetoNeeded exch def y exch get /y3 exch def x exch get /x3 exch def y exch get /y2 exch def x exch get /x2 exch def y exch get /y1 exch def x exch get /x1 exch def y exch get /y0 exch def x exch get /x0 exch def x1 y1 x2 y2 thirdpoint /p1y exch def /p1x exch def x2 y2 x1 y1 thirdpoint /p2y exch def /p2x exch def x1 y1 x0 y0 thirdpoint p1x p1y midpoint /p0y exch def /p0x exch def x2 y2 x3 y3 thirdpoint p2x p2y midpoint /p3y exch def /p3x exch def movetoNeeded { p0x p0y moveto } if p1x p1y p2x p2y p3x p3y curveto end } dup 0 17 dict put def /storexyn { /n exch def /y n array def /x n array def n 1 sub -1 0 { /i exch def y i 3 2 roll put x i 3 2 roll put } for } def /SSten { fgred fggreen fgblue setrgbcolor dup true exch 1 0 0 -1 0 6 -1 roll matrix astore } def /FSten { dup 3 -1 roll dup 4 1 roll exch newpath 0 0 moveto dup 0 exch lineto exch dup 3 1 roll exch lineto 0 lineto closepath bgred bggreen bgblue setrgbcolor eofill SSten } def /Rast { exch dup 3 1 roll 1 0 0 -1 0 6 -1 roll matrix astore } def Begin [ 0.799705 0 0 0.799705 0 0 ] concat /originalCTM matrix currentmatrix def Begin %I Pict Begin %I Pict Begin %I Line 0 0 1 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 97 141 ] concat 87 643 119 643 Line End Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 224 787 ] concat [ (Phase 1) ] Text End End %I eop Begin %I Pict Begin %I Line 0 0 1 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 97 141 ] concat 87 643 87 619 Line End Begin %I Text 0 0 0 SetCFg 
Helvetica 12 SetF [ 6.12303e-17 1 -1 6.12303e-17 176.5 714.5 ] concat [ (Phase2) ] Text End End %I eop Begin %I Pict [ 1 0 0 1 -96 48 ] concat Begin %I Pict [ 1 0 0 1 -8 0 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 344 715 ] concat [ (P0) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 224 280 ] concat 79 419 175 443 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -8 8 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 344 683 ] concat [ (P1) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 224 248 ] concat 79 419 175 443 Rect End End %I eop Begin %I Pict Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 336 667 ] concat [ (P2) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 216 232 ] concat 79 419 175 443 Rect End End %I eop Begin %I Pict [ 1 0 0 1 0 8 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 336 635 ] concat [ (P3) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 216 200 ] concat 79 419 175 443 Rect End End %I eop End %I eop End %I eop Begin %I Pict [ 1 0 0 1 15 -1 ] concat Begin %I Pict [ 1 0 0 1 176 192 ] concat Begin %I Pict Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 184 571 ] concat [ (P0) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 92 144 ] concat 87 411 111 435 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -168 -72 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 376 643 ] concat [ (P1) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 271 483 295 507 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -200 8 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 432 563 ] concat [ (P2) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 
-0 1 100 144 ] concat 327 403 351 427 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -280 104 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 536 467 ] concat [ (P3) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 431 307 455 331 Rect End End %I eop Begin %I Pict [ 1 0 0 1 72 -24 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 184 571 ] concat [ (P0) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 92 144 ] concat 87 411 111 435 Rect End End %I eop Begin %I Pict [ 1 0 0 1 48 -48 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 184 571 ] concat [ (P0) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 92 144 ] concat 87 411 111 435 Rect End End %I eop Begin %I Pict [ 1 0 0 1 24 -72 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 184 571 ] concat [ (P0) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 92 144 ] concat 87 411 111 435 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -192 -96 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 376 643 ] concat [ (P1) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 271 483 295 507 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -120 -120 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 376 643 ] concat [ (P1) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 271 483 295 507 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -144 -144 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 376 643 ] concat [ (P1) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 271 483 295 507 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -224 -16 ] concat Begin %I Text 0 0 0 
SetCFg Helvetica 12 SetF [ 1 0 0 1 432 563 ] concat [ (P2) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 327 403 351 427 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -248 -40 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 432 563 ] concat [ (P2) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 327 403 351 427 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -176 -64 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 432 563 ] concat [ (P2) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 327 403 351 427 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -304 80 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 536 467 ] concat [ (P3) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 431 307 455 331 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -352 32 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 536 467 ] concat [ (P3) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 431 307 455 331 Rect End End %I eop Begin %I Pict [ 1 0 0 1 -328 56 ] concat Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 536 467 ] concat [ (P3) ] Text End Begin %I Rect 0 0 0 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg none SetP %I p n [ 1 -0 -0 1 100 144 ] concat 431 307 455 331 Rect End End %I eop End %I eop Begin %I Pict [ 1 0 0 1 160 0 ] concat Begin %I Line 0 0 1 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 97 141 ] concat 87 643 87 619 Line End Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 6.12303e-17 1 -1 6.12303e-17 176.5 714.5 ] concat [ (Phase2) ] Text End End %I eop Begin %I Pict [ 1 0 0 1 160 0 ] concat Begin %I Line 0 0 1 [] 0 SetB 0 0 0 SetCFg 1 1 1 SetCBg 0 SetP [ 1 -0 -0 1 97 141 ] concat 87 643 119 643 Line 
End Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 224 787 ] concat [ (Phase 1) ] Text End End %I eop End %I eop Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 160.25 662.724 ] concat [ (\(a\) Row Block Distribution.) ] Text End Begin %I Text 0 0 0 SetCFg Helvetica 12 SetF [ 1 0 0 1 323.75 664 ] concat [ (\(b\) Our Proposed Distribution.) ] Text End End %I eop showpage end %%EndDocument @endspecial 269 391 a Fo(Figure)11 b(7:)18 b(Data)13 b(Distribution)e(used)i (to)f(alleviate)g(Memory)g(Contention.)503 1045 y @beginspecial 50 @llx 50 @lly 410 @urx 302 @ury 2057 @rwi @setspecial %%BeginDocument: 256.adi.ps /gnudict 40 dict def gnudict begin /Color false def /Solid false def /gnulinewidth 5.000 def /vshift -33 def /dl {10 mul} def /hpt 31.5 def /vpt 31.5 def /M {moveto} bind def /L {lineto} bind def /R {rmoveto} bind def /V {rlineto} bind def /vpt2 vpt 2 mul def /hpt2 hpt 2 mul def /Lshow { currentpoint stroke M 0 vshift R show } def /Rshow { currentpoint stroke M dup stringwidth pop neg vshift R show } def /Cshow { currentpoint stroke M dup stringwidth pop -2 div vshift R show } def /DL { Color {setrgbcolor Solid {pop []} if 0 setdash } {pop pop pop Solid {pop []} if 0 setdash} ifelse } def /BL { stroke gnulinewidth 2 mul setlinewidth } def /AL { stroke gnulinewidth 2 div setlinewidth } def /PL { stroke gnulinewidth setlinewidth } def /LTb { BL [] 0 0 0 DL } def /LTa { AL [1 dl 2 dl] 0 setdash 0 0 0 setrgbcolor } def /LT0 { PL [] 0 1 0 DL } def /LT1 { PL [4 dl 2 dl] 0 0 1 DL } def /LT2 { PL [2 dl 3 dl] 1 0 0 DL } def /LT3 { PL [1 dl 1.5 dl] 1 0 1 DL } def /LT4 { PL [5 dl 2 dl 1 dl 2 dl] 0 1 1 DL } def /LT5 { PL [4 dl 3 dl 1 dl 3 dl] 1 1 0 DL } def /LT6 { PL [2 dl 2 dl 2 dl 4 dl] 0 0 0 DL } def /LT7 { PL [2 dl 2 dl 2 dl 2 dl 2 dl 4 dl] 1 0.3 0 DL } def /LT8 { PL [2 dl 2 dl 2 dl 2 dl 2 dl 2 dl 2 dl 4 dl] 0.5 0.5 0.5 DL } def /P { stroke [] 0 setdash currentlinewidth 2 div sub M 0 currentlinewidth V stroke } def /D { stroke [] 0 setdash 2 copy vpt add 
M hpt neg vpt neg V hpt vpt neg V hpt vpt V hpt neg vpt V closepath stroke P } def /A { stroke [] 0 setdash vpt sub M 0 vpt2 V currentpoint stroke M hpt neg vpt neg R hpt2 0 V stroke } def /B { stroke [] 0 setdash 2 copy exch hpt sub exch vpt add M 0 vpt2 neg V hpt2 0 V 0 vpt2 V hpt2 neg 0 V closepath stroke P } def /C { stroke [] 0 setdash exch hpt sub exch vpt add M hpt2 vpt2 neg V currentpoint stroke M hpt2 neg 0 R hpt2 vpt2 V stroke } def /T { stroke [] 0 setdash 2 copy vpt 1.12 mul add M hpt neg vpt -1.62 mul V hpt 2 mul 0 V hpt neg vpt 1.62 mul V closepath stroke P } def /S { 2 copy A C} def end gnudict begin gsave 50 50 translate 0.100 0.100 scale 0 setgray /Times-Roman findfont 100 scalefont setfont newpath LTa 600 251 M 0 2218 V LTb LTa 600 251 M 2817 0 V LTb 600 251 M 63 0 V 2754 0 R -63 0 V 540 251 M (1000) Rshow LTa 600 585 M 2817 0 V LTb 600 585 M 31 0 V 2786 0 R -31 0 V LTa 600 780 M 2817 0 V LTb 600 780 M 31 0 V 2786 0 R -31 0 V LTa 600 919 M 2817 0 V LTb 600 919 M 31 0 V 2786 0 R -31 0 V LTa 600 1026 M 2817 0 V LTb 600 1026 M 31 0 V 2786 0 R -31 0 V LTa 600 1114 M 2817 0 V LTb 600 1114 M 31 0 V 2786 0 R -31 0 V LTa 600 1188 M 2817 0 V LTb 600 1188 M 31 0 V 2786 0 R -31 0 V LTa 600 1253 M 2817 0 V LTb 600 1253 M 31 0 V 2786 0 R -31 0 V LTa 600 1309 M 2817 0 V LTb 600 1309 M 31 0 V 2786 0 R -31 0 V LTa 600 1360 M 2817 0 V LTb 600 1360 M 63 0 V 2754 0 R -63 0 V -2814 0 R (10000) Rshow LTa 600 1694 M 2817 0 V LTb 600 1694 M 31 0 V 2786 0 R -31 0 V LTa 600 1889 M 2817 0 V LTb 600 1889 M 31 0 V 2786 0 R -31 0 V LTa 600 2028 M 2817 0 V LTb 600 2028 M 31 0 V 2786 0 R -31 0 V LTa 600 2135 M 2817 0 V LTb 600 2135 M 31 0 V 2786 0 R -31 0 V LTa 600 2223 M 2817 0 V LTb 600 2223 M 31 0 V 2786 0 R -31 0 V LTa 600 2297 M 2817 0 V LTb 600 2297 M 31 0 V 2786 0 R -31 0 V LTa 600 2362 M 2817 0 V LTb 600 2362 M 31 0 V 2786 0 R -31 0 V LTa 600 2418 M 2817 0 V LTb 600 2418 M 31 0 V 2786 0 R -31 0 V LTa 600 2469 M 2817 0 V LTb 600 2469 M 63 0 V 2754 0 R -63 0 V -2814 0 R 
(100000) Rshow LTa 600 251 M 0 2218 V LTb 600 251 M 0 63 V 0 2155 R 0 -63 V 600 151 M (0) Cshow LTa 952 251 M 0 2218 V LTb 952 251 M 0 63 V 0 2155 R 0 -63 V 952 151 M (2) Cshow LTa 1304 251 M 0 2218 V LTb 1304 251 M 0 63 V 0 2155 R 0 -63 V 0 -2255 R (4) Cshow LTa 1656 251 M 0 2218 V LTb 1656 251 M 0 63 V 0 2155 R 0 -63 V 0 -2255 R (6) Cshow LTa 2009 251 M 0 2218 V LTb 2009 251 M 0 63 V 0 2155 R 0 -63 V 0 -2255 R (8) Cshow LTa 2361 251 M 0 2218 V LTb 2361 251 M 0 63 V 0 2155 R 0 -63 V 0 -2255 R (10) Cshow LTa 2713 251 M 0 2218 V LTb 2713 251 M 0 63 V 0 2155 R 0 -63 V 0 -2255 R (12) Cshow LTa 3065 251 M 0 2218 V LTb 3065 251 M 0 63 V 0 2155 R 0 -63 V 0 -2255 R (14) Cshow LTa 3417 251 M 0 2218 V LTb 3417 251 M 0 63 V 0 2155 R 0 -63 V 0 -2255 R (16) Cshow 600 251 M 2817 0 V 0 2218 V -2817 0 V 600 251 L 220 1260 M currentpoint gsave translate 90 rotate 0 0 M (Execution Time \(milli sec\)) Cshow grestore 2008 51 M (Number of Processors) Cshow LT0 2009 1253 M (Owner Computes\(Block, Block\)) Rshow 2069 1253 M 180 0 V 952 2200 M 352 -24 V 705 86 V 1408 89 V 2129 1253 D 952 2200 D 1304 2176 D 2009 2262 D 3417 2351 D LT1 2009 1153 M (Sequential) Rshow 2069 1153 M 180 0 V 776 2146 M 2129 1153 A 776 2146 A LT2 2009 1053 M (No Distributions) Rshow 2069 1053 M 180 0 V 3417 2056 M -704 3 V -704 6 V -705 8 V -352 61 V 2129 1053 B 3417 2056 B 2713 2059 B 2009 2065 B 1304 2073 B 952 2134 B LT3 2009 953 M (\(*,Cyclic\)) Rshow 2069 953 M 180 0 V 952 2090 M 352 -33 V 705 -48 V 704 25 V 704 18 V 2129 953 C 952 2090 C 1304 2057 C 2009 2009 C 2713 2034 C 3417 2052 C LT4 2009 853 M (\(*,Block\)) Rshow 2069 853 M 180 0 V 952 2070 M 352 -53 V 705 -45 V 704 23 V 704 9 V 2129 853 T 952 2070 T 1304 2017 T 2009 1972 T 2713 1995 T 3417 2004 T LT5 2009 753 M (Owner Computes\(*,Cyclic\)) Rshow 2069 753 M 180 0 V 952 2080 M 352 -252 V 705 -78 V 352 216 V 352 20 V 704 128 V 2129 753 S 952 2080 S 1304 1828 S 2009 1750 S 2361 1966 S 2713 1986 S 3417 2114 S LT6 2009 653 M (Owner Computes\(*,Block\)) 
Rshow 2069 653 M 180 0 V 3417 1820 M 2713 1611 L -704 -55 V -705 118 V 952 1954 L 2129 653 D 3417 1820 D 2713 1611 D 2009 1556 D 1304 1674 D 952 1954 D LT7 2009 553 M (\(Block,Block\)) Rshow 2069 553 M 180 0 V 1168 547 R -704 222 V -704 85 V -705 249 V 952 1951 L 2129 553 A 3417 1100 A 2713 1322 A 2009 1407 A 1304 1656 A 952 1951 A stroke grestore end showpage %%EndDocument @endspecial 532 1111 a(Figure)f(8:)18 b(ADI)12 b(Performance)f(\(256x256\).) 4 1246 y(cases)18 b(adhering)d(to)i(owner)o(-computes)e(rule)h(can)h(incur)e (severe)i(synchronization)f(or)g(ownership)f(test)4 1300 y(overhead)c(which)f (exceeds)h(the)g(cost)g(of)f(accessing)i(remote)e(memory)m(.)17 b(W)l(e)11 b(use)g(the)g(Altering)f(Direction)4 1355 y(Integration)i(\()p Fd(ADI)p Fo(\))f(to)i(illustrate)f(that)h(the)g(shared)f(address)i(space)f (provides)g(\257exibility)e(in)i(the)g(choice)4 1409 y(of)i(computation)f (partitions,)h(reducing)f(contention)g(and)h(synchronization)f(overhead,)i (and)e(resulting)4 1463 y(in)e(signi\256cant)g(performance)g(improvements.)77 1539 y(W)l(e)k(use)f(the)g(Hector)n(,)h(a)f(Non-Uniform)e(Memory)i(Access)h (multiprocessor)n(,)f(as)g(an)h(experimental)4 1593 y(platform.)21 b(Hector)13 b(consists)h(of)f(4)h(sets)g(of)f(processor)o(-memory)g(pairs)g (connected)h(by)f(a)h(bus)g(to)f(form)g(a)4 1647 y(station;)g(4)g(stations)g (are)f(connected)h(by)g(a)g(local)f(ring)g(to)h(form)f(a)h(cluster;)f(4)h (local)g(rings)f(are)h(connected)4 1701 y(by)g(a)g(global)g(ring.)19 b(W)l(e)14 b(use)f(a)g(system)h(with)e(one)h(cluster)m(.)21 b(Each)13 b(processor)o(-memory)f(pair)g(consists)i(of)4 1755 y(a)f(Motorola)f(MC88100)h(CPU,)g(a)g(16)f(KB)h(instruction)f(cache,)i(a)f (16)f(KB)h(data)g(cache)g(and)g(4)f(MB)i(of)e(the)4 1809 y(globally)i (addressable)g(memory)m(.)23 b(The)15 b(hardware)e(provides)h(no)g(support)g (for)f(cache)i(coherence.)23 b(The)4 1864 y(coherence)12 b(of)f(data)h(is)g (maintained)f(by)h(software)f(at)h(cache)g(line)f(granularity)g([10)o(].)18 b(Data)12 
distributions are implemented using the array allocation techniques described in [21, 3].

5.1 Contention and Synchronization Conscious Distribution

The ADI program has two phases with parallelism along orthogonal dimensions in each
phase. It operates on three 2-dimensional arrays A, B and X. A single iteration of an outer
sequentially iterated loop consists of a forward and a backward sweep phase along the rows
of the three arrays, followed by another forward and backward sweep phase along the columns
of the arrays [18]. This application is typical of other programs such as 2D-FFT and
Erlebacher that have parallelism in orthogonal directions in different phases of the
program.

The best data distribution scheme for ADI remains an issue of debate [18, 4]. The two
proposed schemes partition arrays along a single dimension, either in blocks or cyclically.
These distributions, in conjunction with the owner-computes rule, result in a wavefront
type computation, leading to heavy synchronization overhead in one of the phases of the
program. Figure 7(a) shows a Block distribution of the rows of the arrays. With such a
distribution, during the first phase of the program all the processors access data that is
local and require no communication. During the second phase, however, the parallelism is
orthogonal to the direction of distribution. Strict adherence to the owner-computes rule
implies ordering of the computations by processors on the corresponding chunk of the
columns they own. Thus, processor i has to wait for processor i - 1 to finish the
computation on its chunk of the column before proceeding. A large number of
synchronizations is required to maintain the ordering involved in the wavefront
computation.

Table 1: Performance Bottlenecks for various data and computation partitioning for ADI.

Data Distribution   Compute Rule      Performance Bottleneck
-----------------   ---------------   -------------------------
None                Relaxed           Memory Contention
(*, Block)          Owner-Computes    High Synchronization
(*, Block)          Relaxed           Memory Contention
(*, Cyclic)         Owner-Computes    High Synchronization
(*, Cyclic)         Relaxed           Memory Contention
(Block, Block)      Owner-Computes    Ownership tests
(Block, Block)      Relaxed           High Remote Memory Access
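
The wavefront ordering that owner-computes imposes on the column phase can be sketched
with a small model. This is only an illustration under stated assumptions, not code from
the paper: the function names (owner, wavefront_schedule, handoffs) are hypothetical, each
processor's chunk of a column is assumed to take one time step, and the arrays are assumed
to be distributed by blocks of rows as in Figure 7(a).

```python
def owner(row, n_rows, n_procs):
    """Processor that owns `row` when rows are distributed in blocks."""
    block = -(-n_rows // n_procs)  # ceiling division: rows per processor
    return row // block


def wavefront_schedule(n_cols, n_procs):
    """Pipeline the column sweeps under the owner-computes rule.

    Assumption of this sketch: processor p may start its chunk of column j
    only after processor p - 1 has finished its chunk of the same column,
    and every chunk takes one step, so the earliest step for (p, j) is p + j.
    Returns a dict mapping step -> list of (processor, column) pairs.
    """
    schedule = {}
    for col in range(n_cols):
        for proc in range(n_procs):
            schedule.setdefault(proc + col, []).append((proc, col))
    return schedule


def handoffs(n_cols, n_procs):
    """Processor-to-processor synchronizations: one per chunk boundary
    in every column."""
    return n_cols * (n_procs - 1)


if __name__ == "__main__":
    sched = wavefront_schedule(n_cols=8, n_procs=4)
    for step in sorted(sched):
        print(step, sched[step])
    print("steps:", len(sched), "handoffs:", handoffs(8, 4))
```

In this model the number of synchronizations grows with both the number of columns and
the number of processors, and the first and last steps leave most processors idle while
the pipeline fills and drains.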

The synchronization overhead can be eliminated by relaxing the owner-computes rule in
the second phase and allowing the processor to write the results to remote memory modules.
This eliminates synchronization overhead at the expense of increased remote memory
accesses. However, the use of this relaxed compute rule with the (*,Block) distribution
results in heavy contention. Each processor is responsible for computing a column, and
hence, each processor accesses every memory module in sequence. Thus, a given memory module
is accessed by every processor at the same time, leading to contention.

The data distribution scheme depicted in Figure 7(b) (footnote 4) eliminates contention
and results in the best possible performance with the relaxed compute rule. With this
distribution, processors access data from remote memory modules in both phases of the
program. In both phases, processors start working on the columns assigned to them by
accessing data that is in different memory modules, thus avoiding contention. There is no
wavefront type parallelism, and hence no overhead involved due to synchronization.

The use of the owner-computes rule with the distribution of Figure 7(b) will not result in
good performance. Either ownership tests must be introduced in the body of the loops to
enforce the owner-computes rule, or the loops must be rewritten with additional strip-mined
controlling loops to schedule the computations on sub-blocks of the array. The former leads
to overhead and the latter introduces synchronization similar to the wavefront computation.

The result of executing the ADI application on the Hector multiprocessor for a data size
of 256x256 with various data distributions and compute rules is shown in Figure 8. The
(Block,Block) data distribution that relaxes the owner-computes rule outperforms all
data distribution schemes that adhere to the rule. The figure also indicates that the
overhead due to the ownership tests when using the owner-computes rule with a (Block,Block)
distribution degrades performance. It is also clear that the use of data distribution
improves performance over the use of operating system policies to manage data (the no
distributions curve). The performance bottlenecks of various distributions for ADI are
summarized in Table 1.

Footnote 4: This is equivalent to !HPF$ PROCESSORS PROCS(N) with !HPF$ DISTRIBUTE
B(BLOCK, BLOCK), X(BLOCK, BLOCK) ON PROCS in HPF. In the current HPF specification, this
distribution is not valid; the rank of each distributee must equal the rank of the named
processor grid [16]. Distributions in which this is not the case introduce additional
complexity on DMMs [17]. In contrast, SSMMs provide the flexibility to implement these
distributions.

6 Related Work

Several researchers have focused on the problem of deriving data distributions
automatically for DMMs. Li and Chen [22], Gupta and Banerjee [12], Zima et al. [9] and
Garcia et al. [11] follow the approach of finding the alignment constraints between
different dimensions of the arrays and derive a data distribution that minimizes
interprocessor communication. To avoid a heuristic approach, Bixby et al. [7] formulate a
0-1 integer programming problem for deriving data distributions. Their approach relies on
the assumption that a good data distribution for the entire program can be found by merging
the data distributions of smaller segments of the program. They minimize the interprocessor
communication using the "performance estimator" developed by Balasundaram et al. [6].
Anderson [5] presents an algebraic framework for determining data and computation
partitions by minimizing communication across processors. Data transformations are then
applied so that the processors access contiguous data regions to reduce false sharing. This
technique is oblivious to SSMM-specific issues such as contention and cache affinity.

7 Concluding Remarks

Although large SSMMs are built based on an architecture with distributed memory, the shared
memory paradigm introduces performance issues that are different from those encountered
in DMMs. The high cost of interprocessor communication in distributed memory
multiprocessors makes the minimization of communication the predominant issue in selecting
data distributions and in partitioning computations. On SSMMs, a methodology for selecting
data distributions must also consider cache affinity, memory contention and false sharing in
addition to the cost of interprocessor communication. Furthermore, the single shared address
space present in SSMMs provides flexibility in the selection of computation partitions. This
should be exploited in applications in which owner-computes results in poor performance.
The Jasmine compiler project [2] is investigating the issues discussed in this paper through
the development of a framework for automatically deriving data distributions on SSMMs.

References

[1] T.S. Abdelrahman et al. An overview of the NUMAchine multiprocessor project. In
    Proc. of the Canadian Supercomputing Conf., pages 283-295, 1994.

[2] T.S. Abdelrahman, N. Manjikian, and S. Tandri. The Jasmine Compiler. In preparation.

[3] T.S. Abdelrahman and T.N. Wong. Distributed array data management on NUMA
    multiprocessors. In Proc. of SHPCC, pages 551-559, 1994.

[4] S.P. Amarasinghe, J.M. Anderson, M.S. Lam, and A.W. Lim. An overview of a compiler
    for scalable parallel machines. In Languages and Compilers for Parallel Computing,
    pages 253-272. Springer-Verlag LNCS-768, 1993.

[5] J.M. Anderson. Demonstration of automatic data and computation decomposition
    techniques. In Proc. of the Workshop on Automatic Data Layout and Performance
    Prediction, 1995.

[6] V. Balasundaram, G. Fox, K. Kennedy, and U. Kremer. A static performance estimator
    to guide data partitioning decisions. In Proc. of PPOPP, pages 213-223, 1991.

[7] R. Bixby, K. Kennedy, and U. Kremer. Automatic data layout using 0-1 integer
    programming. In Proc. of the Int'l Conf. on Parallel Architectures and Compilation
    Techniques, pages 111-122, 1994.

[8] W.J. Bolosky and M.L. Scott. False sharing and its effect on shared memory
    multiprocessors. In Proc. of 4th Symp. on Experiences with Distributed and
    Multiprocessor Systems, pages 57-71, 1993.

[9] B.M. Chapman, T. Fahringer, and H. Zima. Automatic support for data distribution on
    distributed memory multiprocessor systems. In Languages and Compilers for Parallel
    Computing, pages 184-199. Springer-Verlag LNCS-768, 1993.

[10] B. Gamsa. Region-oriented main memory management in shared-memory NUMA
     multiprocessors. Master's thesis, Department of Computer Science, University of
     Toronto, Toronto, CANADA, 1992.

[11] J. Garcia, E. Ayguade, and J. Labarta. A novel approach towards automatic data
     distribution. In Proc. of the Workshop on Automatic Data Layout and Performance
     Prediction, 1995.

[12] M. Gupta and P. Banerjee. Automatic data partitioning on distributed memory
     multiprocessors. IEEE Trans. on Parallel and Distributed Systems, 3(2):179-193, 1992.

[13] K. Harzallah and K.C. Sevcik. Hot spot analysis in large scale shared memory
     multiprocessors. In Proc. of Supercomputing'93, pages 895-905. ACM, 1993.

[14] M. Heinrich et al. The Stanford FLASH Multiprocessor. In Proc. of the 21st Int'l Symp.
     on Computer Architecture, pages 302-313, 1994.

[15] S. Hiranandani, K. Kennedy, and C. Tseng. Compiler optimizations for Fortran D on
     MIMD distributed-memory machines. In Proc. of Supercomputing'91, pages 86-100,
     Albuquerque, NM, 1991.

[16] HPF. High Performance Fortran Language Specification (High Performance Fortran
     Forum). Technical report CRPC-TR92225, Rice University, 1994.

[17] C. Koelbel. HPF constraints. Personal
b(Communications,)g(1995.)4 1512 y([18])24 b(U.)14 b(Kremer)m(.)23 b(Automatic)14 b(data)g(layout)g(for)g(distributed-memory)e(multiprocessors.) 23 b(T)m(echnical)112 1559 y(report)11 b(CRPC-TR93229-S,)h(Center)h(for)e (Research)i(on)f(Parallel)g(Computation,)g(1993.)4 1647 y([19])24 b(T)l(.T)l(.)15 b(Kwan,)f(B.K.)h(T)m(otty)m(,)e(and)g(D.A.)h(Reed.)21 b(Communication)12 b(and)i(computation)e(performance)112 1694 y(of)f(the)i(CM5.)19 b(In)12 b Fe(Pr)n(oc.)h(of)g(Super)n(computing'93)p Fo(,)f(pages)h(192\261201.)f(ACM,)h(1993.)4 1782 y([20])24 b(D.)15 b(Lenoski)h(et)f(al.)26 b(The)15 b(Stanford)f(DASH)h(multiprocessor)m (.)25 b Fe(IEEE)16 b(Computer)p Fo(,)h(25\(3\):63\26179,)112 1829 y(1992.)4 1917 y([21])24 b(H.)12 b(Li)g(and)g(K.C.)h(Sevcik.)k (Numacros:)g(Data)12 b(parallel)f(programming)f(on)i(NUMA)g(multiproces-)112 1964 y(sors.)h(In)c Fe(Pr)n(oc.)i(of)e(4th)g(Symp.)h(on)g(Experiences)g(with) g(Distributed)f(and)g(Multipr)n(ocessor)j(Systems)p Fo(,)112 2011 y(pages)g(247\261263,)h(1993.)4 2099 y([22])24 b(J.)11 b(Li)h(and)f(M.)h(Chen.)k(Compiling)10 b(communication-ef)o(\256cient)f (programs)i(for)f(massively)h(parallel)112 2146 y(machines.)18 b Fe(Journal)12 b(of)h(Parallel)f(and)h(Distributed)f(Computing)p Fo(,)g(2\(3\):361\261376,)f(1991.)4 2234 y([23])24 b(Cray)14 b(Research.)25 b(The)16 b(Cray)e(Research)h(Massively)h(Parallel)e(Processor) g(System)h(-)f(Cray)h(T3D.)112 2281 y(T)m(echnical)d(report)f(80922,)i (Munchen,)g(Germany)m(,)f(1993.)4 2369 y([24])24 b(Kendall)12 b(Square)f(Research.)19 b Fe(KSR1)13 b(Principles)h(of)e(Operation)p Fo(.)18 b(W)l(altham,)13 b(MA,)g(1991.)4 2457 y([25])24 b(J.)12 b(T)m(orres,)g(E.)h(A)-5 b(yguade,)13 b(J.)g(Labarta,)f(and)g(M.)h(V)-6 b(alero.)17 b(Align)12 b(and)g(distribute-based)f(linear)g(loop)112 2504 y(transformations.)16 b(In)11 b Fe(Languages)g(and)h(Compilers)h(for)f (Parallel)g(Computing)p Fo(,)g(pages)g(321\261339.)112 2551 y(Springer)o(-V)-6 b(erlag)10 b(LNCS-768,)j(1993.)4 2639 y([26])24 b(Z.)17 
b(V)m(ranesic,)h(M.)f(Stumm,)f(R.)i(White,)f(and)f(D.)h(Lewis.)30 b(The)16 b(Hector)g(Multiprocessor.)29 b Fe(IEEE)112 2686 y(Computer)p Fo(,)13 b(24\(1\):72\26180,)e(1991.)4 2774 y([27])24 b(R.W)-5 b(.)12 b(W)n(isniewski,)g(L.I.)g(Kontothanassis,)h(and)e(M.L.)h(Scott.)k (High)11 b(performance)e(synchroniza-)112 2821 y(tion)i(algorithms)h(for)g (multiprogrammed)e(multiprocessors.)18 b(In)12 b Fe(Pr)n(oc.)h(of)g(PPOPP)p Fo(,)h(1995.)p eop %%Trailer end userdict /end-hook known{end-hook}if %%EOF |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Tandri_Abdel_PDPTA95.ps version [2089955127].
3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 
4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 4621 
4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 
5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 5100 5101 5102 5103 5104 5105 |
%!PS-Adobe-2.0 %%Creator: dvips 5.512 Copyright 1986, 1993 Radical Eye Software %%Title: pdpta.dvi %%CreationDate: Thu Nov 23 17:27:55 1995 %%Pages: 10 %%PageOrder: Ascend %%BoundingBox: 0 0 612 792 %%DocumentFonts: Times-Bold Times-Roman Times-Italic Courier %%EndComments %DVIPSCommandLine: dvips -o pdpta.ps pdpta.dvi %DVIPSSource: TeX output 1995.08.11:1234 %%BeginProcSet: tex.pro /TeXDict 250 dict def TeXDict begin /N{def}def /B{bind def}N /S{exch}N /X{S N} B /TR{translate}N /isls false N /vsize 11 72 mul N /@rigin{isls{[0 -1 1 0 0 0] concat}if 72 Resolution div 72 VResolution div neg scale isls{Resolution hsize -72 div mul 0 TR}if Resolution VResolution vsize -72 div 1 add mul TR matrix currentmatrix dup dup 4 get round 4 exch put dup dup 5 get round 5 exch put setmatrix}N /@landscape{/isls true N}B /@manualfeed{statusdict /manualfeed true put}B /@copies{/#copies X}B /FMat[1 0 0 -1 0 0]N /FBB[0 0 0 0]N /nn 0 N /IE 0 N /ctr 0 N /df-tail{/nn 8 dict N nn begin /FontType 3 N /FontMatrix fntrx N /FontBBox FBB N string /base X array /BitMaps X /BuildChar{ CharBuilder}N /Encoding IE N end dup{/foo setfont}2 array copy cvx N load 0 nn put /ctr 0 N[}B /df{/sf 1 N /fntrx FMat N df-tail}B /dfs{div /sf X /fntrx[sf 0 0 sf neg 0 0]N df-tail}B /E{pop nn dup definefont setfont}B /ch-width{ch-data dup length 5 sub get}B /ch-height{ch-data dup length 4 sub get}B /ch-xoff{128 ch-data dup length 3 sub get sub}B /ch-yoff{ch-data dup length 2 sub get 127 sub}B /ch-dx{ch-data dup length 1 sub get}B /ch-image{ch-data dup type /stringtype ne{ctr get /ctr ctr 1 add N}if}B /id 0 N /rw 0 N /rc 0 N /gp 0 N /cp 0 N /G 0 N /sf 0 N /CharBuilder{save 3 1 roll S dup /base get 2 index get S /BitMaps get S get /ch-data X pop /ctr 0 N ch-dx 0 ch-xoff ch-yoff ch-height sub ch-xoff ch-width add ch-yoff setcachedevice ch-width ch-height true[1 0 0 -1 -.1 ch-xoff sub ch-yoff .1 add]{ch-image}imagemask restore}B /D{/cc X dup type /stringtype ne{]}if nn /base get cc ctr put nn /BitMaps get S ctr S sf 1 
ne{dup dup length 1 sub dup 2 index S get sf div put}if put /ctr ctr 1 add N} B /I{cc 1 add D}B /bop{userdict /bop-hook known{bop-hook}if /SI save N @rigin 0 0 moveto /V matrix currentmatrix dup 1 get dup mul exch 0 get dup mul add .99 lt{/QV}{/RV}ifelse load def pop pop}N /eop{SI restore showpage userdict /eop-hook known{eop-hook}if}N /@start{userdict /start-hook known{start-hook} if pop /VResolution X /Resolution X 1000 div /DVImag X /IE 256 array N 0 1 255 {IE S 1 string dup 0 3 index put cvn put}for 65781.76 div /vsize X 65781.76 div /hsize X}N /p{show}N /RMat[1 0 0 -1 0 0]N /BDot 260 string N /rulex 0 N /ruley 0 N /v{/ruley X /rulex X V}B /V{}B /RV statusdict begin /product where{ pop product dup length 7 ge{0 7 getinterval dup(Display)eq exch 0 4 getinterval(NeXT)eq or}{pop false}ifelse}{false}ifelse end{{gsave TR -.1 -.1 TR 1 1 scale rulex ruley false RMat{BDot}imagemask grestore}}{{gsave TR -.1 -.1 TR rulex ruley scale 1 1 false RMat{BDot}imagemask grestore}}ifelse B /QV{ gsave transform round exch round exch itransform moveto rulex 0 rlineto 0 ruley neg rlineto rulex neg 0 rlineto fill grestore}B /a{moveto}B /delta 0 N /tail{dup /delta X 0 rmoveto}B /M{S p delta add tail}B /b{S p tail}B /c{-4 M} B /d{-3 M}B /e{-2 M}B /f{-1 M}B /g{0 M}B /h{1 M}B /i{2 M}B /j{3 M}B /k{4 M}B /w{0 rmoveto}B /l{p -4 w}B /m{p -3 w}B /n{p -2 w}B /o{p -1 w}B /q{p 1 w}B /r{ p 2 w}B /s{p 3 w}B /t{p 4 w}B /x{0 S rmoveto}B /y{3 2 roll p a}B /bos{/SS save N}B /eos{SS restore}B end %%EndProcSet %%BeginProcSet: texps.pro TeXDict begin /rf{findfont dup length 1 add dict begin{1 index /FID ne 2 index /UniqueID ne and{def}{pop pop}ifelse}forall[1 index 0 6 -1 roll exec 0 exch 5 -1 roll VResolution Resolution div mul neg 0 0]/Metrics exch def dict begin Encoding{exch dup type /integertype ne{pop pop 1 sub dup 0 le{pop}{[}ifelse}{ FontMatrix 0 get div Metrics 0 get div def}ifelse}forall Metrics /Metrics currentdict end def[2 index currentdict end definefont 3 -1 roll makefont /setfont load]cvx 
def}def /ObliqueSlant{dup sin S cos div neg}B /SlantFont{4 index mul add}def /ExtendFont{3 -1 roll mul exch}def /ReEncodeFont{/Encoding exch def}def end %%EndProcSet %%BeginProcSet: special.pro TeXDict begin /SDict 200 dict N SDict begin /@SpecialDefaults{/hs 612 N /vs 792 N /ho 0 N /vo 0 N /hsc 1 N /vsc 1 N /ang 0 N /CLIP 0 N /rwiSeen false N /rhiSeen false N /letter{}N /note{}N /a4{}N /legal{}N}B /@scaleunit 100 N /@hscale{@scaleunit div /hsc X}B /@vscale{@scaleunit div /vsc X}B /@hsize{/hs X /CLIP 1 N}B /@vsize{/vs X /CLIP 1 N}B /@clip{/CLIP 2 N}B /@hoffset{/ho X}B /@voffset{/vo X}B /@angle{/ang X}B /@rwi{10 div /rwi X /rwiSeen true N}B /@rhi {10 div /rhi X /rhiSeen true N}B /@llx{/llx X}B /@lly{/lly X}B /@urx{/urx X}B /@ury{/ury X}B /magscale true def end /@MacSetUp{userdict /md known{userdict /md get type /dicttype eq{userdict begin md length 10 add md maxlength ge{/md md dup length 20 add dict copy def}if end md begin /letter{}N /note{}N /legal{ }N /od{txpose 1 0 mtx defaultmatrix dtransform S atan/pa X newpath clippath mark{transform{itransform moveto}}{transform{itransform lineto}}{6 -2 roll transform 6 -2 roll transform 6 -2 roll transform{itransform 6 2 roll itransform 6 2 roll itransform 6 2 roll curveto}}{{closepath}}pathforall newpath counttomark array astore /gc xdf pop ct 39 0 put 10 fz 0 fs 2 F/|______Courier fnt invertflag{PaintBlack}if}N /txpose{pxs pys scale ppr aload pop por{noflips{pop S neg S TR pop 1 -1 scale}if xflip yflip and{pop S neg S TR 180 rotate 1 -1 scale ppr 3 get ppr 1 get neg sub neg ppr 2 get ppr 0 get neg sub neg TR}if xflip yflip not and{pop S neg S TR pop 180 rotate ppr 3 get ppr 1 get neg sub neg 0 TR}if yflip xflip not and{ppr 1 get neg ppr 0 get neg TR}if}{noflips{TR pop pop 270 rotate 1 -1 scale}if xflip yflip and{TR pop pop 90 rotate 1 -1 scale ppr 3 get ppr 1 get neg sub neg ppr 2 get ppr 0 get neg sub neg TR}if xflip yflip not and{TR pop pop 90 rotate ppr 3 get ppr 1 get neg sub neg 0 TR}if yflip xflip not and{TR pop 
pop 270 rotate ppr 2 get ppr 0 get neg sub neg 0 S TR}if}ifelse scaleby96{ppr aload pop 4 -1 roll add 2 div 3 1 roll add 2 div 2 copy TR .96 dup scale neg S neg S TR}if}N /cp{pop pop showpage pm restore}N end}if}if}N /normalscale{Resolution 72 div VResolution 72 div neg scale magscale{DVImag dup scale}if 0 setgray}N /psfts{S 65781.76 div N}N /startTexFig{/psf$SavedState save N userdict maxlength dict begin /magscale false def normalscale currentpoint TR /psf$ury psfts /psf$urx psfts /psf$lly psfts /psf$llx psfts /psf$y psfts /psf$x psfts currentpoint /psf$cy X /psf$cx X /psf$sx psf$x psf$urx psf$llx sub div N /psf$sy psf$y psf$ury psf$lly sub div N psf$sx psf$sy scale psf$cx psf$sx div psf$llx sub psf$cy psf$sy div psf$ury sub TR /showpage{}N /erasepage{}N /copypage{}N /p 3 def @MacSetUp}N /doclip{psf$llx psf$lly psf$urx psf$ury currentpoint 6 2 roll newpath 4 copy 4 2 roll moveto 6 -1 roll S lineto S lineto S lineto closepath clip newpath moveto}N /endTexFig{end psf$SavedState restore}N /@beginspecial{ SDict begin /SpecialSave save N gsave normalscale currentpoint TR @SpecialDefaults count /ocount X /dcount countdictstack N}N /@setspecial{CLIP 1 eq{newpath 0 0 moveto hs 0 rlineto 0 vs rlineto hs neg 0 rlineto closepath clip}if ho vo TR hsc vsc scale ang rotate rwiSeen{rwi urx llx sub div rhiSeen{ rhi ury lly sub div}{dup}ifelse scale llx neg lly neg TR}{rhiSeen{rhi ury lly sub div dup scale llx neg lly neg TR}if}ifelse CLIP 2 eq{newpath llx lly moveto urx lly lineto urx ury lineto llx ury lineto closepath clip}if /showpage{}N /erasepage{}N /copypage{}N newpath}N /@endspecial{count ocount sub{pop}repeat countdictstack dcount sub{end}repeat grestore SpecialSave restore end}N /@defspecial{SDict begin}N /@fedspecial{end}B /li{lineto}B /rl{ rlineto}B /rc{rcurveto}B /np{/SaveX currentpoint /SaveY X N 1 setlinecap newpath}N /st{stroke SaveX SaveY moveto}N /fil{fill SaveX SaveY moveto}N /ellipse{/endangle X /startangle X /yrad X /xrad X /savematrix matrix currentmatrix N 
TR xrad yrad scale 0 0 1 startangle endangle arc savematrix setmatrix}N end %%EndProcSet TeXDict begin 40258431 52099146 1000 300 300 (/stumm/a0/tandri/pdpta/pdpta.dvi) @start /Fa 175[27 7[27 1[27 70[{}3 45.833332 /Courier rf /Fb 80[25 25 51[20 23 23 33 23 23 13 18 15 23 23 23 23 36 13 23 1[13 23 23 15 20 23 20 23 20 3[15 1[15 2[33 2[33 28 25 30 1[25 33 33 41 28 33 1[15 33 1[25 28 33 30 30 33 5[13 3[23 23 4[23 2[11 15 11 1[23 15 15 3[23 2[15 33[{}60 45.833332 /Times-Roman rf /Fc 81[29 51[23 26 2[26 29 16 23 23 2[29 29 42 16 2[16 29 29 16 26 29 26 29 29 13[29 2[36 42 1[48 6[36 1[42 39 1[36 11[29 29 29 29 29 2[15 19 45[{}36 58.333336 /Times-Italic rf /Fd 134[30 2[30 30 30 30 30 1[30 30 30 30 30 30 1[30 30 30 30 30 30 30 30 30 12[30 6[30 3[30 2[30 30 30 30 30 30 14[30 4[30 30 1[30 30 30 40[{}36 50.000000 /Courier rf /Fe 134[22 22 33 1[25 14 19 19 25 25 25 25 36 14 22 1[14 25 25 14 22 25 22 25 25 9[41 2[28 25 30 1[30 36 1[41 28 33 22 17 36 2[30 36 33 1[30 7[25 4[25 25 25 25 2[12 17 5[17 39[{}47 50.000000 /Times-Italic rf /Ff 1 1 df<FFFFF0FFFFF014027D881B>0 D E /Fg 4 117 df<1F0006000600060006000C000C000C00 0C0018F01B181C08180838183018301830306030603160616062C022C03C10177E9614>104 D<0300038003000000000000000000000000001C002400460046008C000C001800180018003100 3100320032001C0009177F960C>I<383C0044C6004702004602008E06000C06000C06000C0C00 180C00180C40181840181880300880300F00120E7F8D15>110 D<030003000600060006000600 FFC00C000C000C001800180018001800300030803080310031001E000A147F930D>116 D E /Fh 3 3 df<FFFFFFFCFFFFFFFC1E027C8C27>0 D<70F8F8F87005057C8E0E>I<C00003E0 000770000E38001C1C00380E00700700E00381C001C38000E700007E00003C00003C00007E0000 E70001C3800381C00700E00E00701C003838001C70000EE00007C000031818799727>I E /Fi 4 62 df<00200040008001000300060004000C000C001800180030003000300070006000 60006000E000E000E000E000E000E000E000E000E000E000E000E000E000E00060006000600070 00300030003000180018000C000C0004000600030001000080004000200B327CA413>40 
D<800040002000100018000C000400060006000300030001800180018001C000C000C000C000E0 00E000E000E000E000E000E000E000E000E000E000E000E000E000C000C000C001C00180018001 80030003000600060004000C00180010002000400080000B327DA413>I<000180000001800000 018000000180000001800000018000000180000001800000018000000180000001800000018000 00018000000180000001800000018000FFFFFFFEFFFFFFFE000180000001800000018000000180 000001800000018000000180000001800000018000000180000001800000018000000180000001 800000018000000180001F227D9C26>43 D<FFFFFFFEFFFFFFFE00000000000000000000000000 00000000000000000000000000000000000000FFFFFFFEFFFFFFFE1F0C7D9126>61 D E /Fj 16 111 df<003F000000E180000380C020070060400E0070401C0070403C0070803C00 3880780039007800390078003A00F0003A00F0003C00F0003800F0003800700038007000780030 00B800380338401C1C188007E00F001B157E941F>11 D<70F8FCFC740404040408081010204006 0F7C840E>59 D<00000080000000018000000001C000000003C000000003C000000007C0000000 0BC00000000BC000000013C000000033C000000023C000000043C000000043E000000081E00000 0181E000000101E000000201E000000201E000000401E000000C01E000000801E000001001E000 001FFFF000002000F000006000F000004000F000008000F000008000F000010000F000030000F0 00020000F000040000F8000C0000F8001E0000F800FF800FFF8021237EA225>65 D<007FFFF8000007800F00000780078000078003C0000F0001C0000F0001C0000F0001E0000F00 01E0001E0001C0001E0003C0001E0003C0001E000780003C000F00003C001E00003C003C00003C 01F000007FFFE00000780078000078003C000078001E0000F0001E0000F0000E0000F0000F0000 F0000F0001E0001E0001E0001E0001E0001E0001E0003C0003C0003C0003C000780003C000F000 03C001C00007C00F8000FFFFFC000023227EA125>I<007FFFF0000007801C000007800F000007 800700000F000380000F000380000F000380000F000380001E000780001E000780001E00078000 1E000F00003C000F00003C001E00003C003C00003C007000007801E000007FFF00000078000000 007800000000F000000000F000000000F000000000F000000001E000000001E000000001E00000 0001E000000003C000000003C000000003C000000003C000000007C0000000FFFC00000021227E A11F>80 
D<007FFFE0000007803C000007800E000007800700000F000780000F000380000F0003 C0000F0003C0001E000780001E000780001E000780001E000F00003C001E00003C003C00003C00 7000003C01C000007FFE00000078078000007801C000007801E00000F000F00000F000F00000F0 00F00000F000F00001E001E00001E001E00001E001E00001E001E00003C003C00003C003C04003 C003C04003C001C08007C001C080FFFC00E3000000003C0022237EA125>82 D<3FFE01FF8003C0003C0003C000300003C0001000078000200007800020000780002000078000 20000F000040000F000040000F000040000F000040001E000080001E000080001E000080001E00 0080003C000100003C000100003C000100003C0001000078000200007800020000780002000078 000200007000040000F000040000F0000800007000080000700010000070002000003800400000 38008000001C01000000060600000001F800000021237DA121>85 D<FFF8007FC00F80000F000F 00000C000F000008000F000010000F800010000780002000078000600007800040000780008000 07800080000780010000078002000007C002000003C004000003C00C000003C008000003C01000 0003C010000003C020000003E040000003E040000001E080000001E180000001E100000001E200 000001E200000001E400000001F800000000F800000000F000000000E000000000E000000000C0 00000000C000000022237DA11C>I<007FFC03FF0007E000F80007C000E00003C000800003E001 000001E002000001F006000001F00C000000F018000000F81000000078200000007C400000007C 800000003D000000003E000000001E000000001F000000001F000000002F000000006F80000000 C78000000187C000000103C000000203C000000403E000000801E000001001F000002000F00000 4000F800008000F80001800078000300007C000F8000FC00FFE007FFC028227FA128>88 D<00001E00000063800000C7800001C7800001C300000180000003800000038000000380000003 80000007000000070000000700000007000000FFF800000E0000000E0000000E0000000E000000 0E0000000E0000001C0000001C0000001C0000001C0000001C0000003800000038000000380000 0038000000380000007000000070000000700000007000000060000000E0000000E0000000E000 0000C0000070C00000F1800000F1000000620000003C000000192D7EA218>102 D<000F0C00389C00605C00C03801C0380380380780380700700F00700F00700F00701E00E01E00 
E01E00E01E00E01E01C00E01C00E03C00605C0031B8001E3800003800003800007000007000007 00700E00F00C00F018006070003FC000161F809417>I<00F0000FE00000E00000E00000E00001 C00001C00001C00001C000038000038000038000038000070000071F0007218007C0C00F00E00F 00E00E00E00E00E01C01C01C01C01C01C01C01C0380380380380380700380704700708700E0870 0E08700610E006206003C016237DA21C>I<00E000E001E000C000000000000000000000000000 00000000001E0023004380438083808380870007000E000E000E001C001C003800382038407040 7040308031001E000B227EA111>I<0000E00001E00001E00000C0000000000000000000000000 000000000000000000000000000000001E00002300004380008380010380010380010380000700 000700000700000700000E00000E00000E00000E00001C00001C00001C00001C00003800003800 00380000380000700000700000700070E000F0C000F180006300003E0000132C81A114>I<00F0 000FE00000E00000E00000E00001C00001C00001C00001C0000380000380000380000380000700 000700F00703080704380E08780E10780E20300E40001C80001F00001FC0001C70003838003838 00381C00381C10703820703820703820701840E00C8060070015237DA219>I<3C07C046186047 20308740388780388700388700380E00700E00700E00700E00701C00E01C00E01C01C01C01C138 01C23803823803823801847001883000F018157E941D>110 D E /Fk 134[21 1[30 1[21 12 16 14 1[21 21 21 32 12 2[12 21 21 14 18 21 18 21 18 3[14 1[14 17[14 5[28 8[12 21 21 5[21 21 1[12 10 14 45[{}32 41.666668 /Times-Roman rf /Fl 203[15 15 15 15 49[{}4 29.166668 /Times-Roman rf /Fm 203[17 17 17 17 17 48[{}5 33.333332 /Times-Roman rf /Fn 138[39 23 27 31 1[39 35 39 59 20 39 1[20 1[35 23 31 39 31 39 35 9[71 4[51 1[43 6[27 2[43 1[51 51 11[35 35 35 35 35 35 35 49[{}32 70.833336 /Times-Bold rf /Fo 69[22 8[25 1[28 28 3[22 47[22 25 25 36 25 25 14 19 17 25 25 25 25 39 14 25 14 14 25 25 17 22 25 22 25 22 3[17 1[17 30 2[47 36 36 30 28 33 1[28 36 36 44 30 36 19 17 36 36 28 30 36 33 33 36 3[28 1[14 14 25 25 25 25 25 25 25 25 25 25 1[12 17 12 2[17 17 17 39[{}75 50.000000 /Times-Roman rf /Fp 139[17 19 22 14[22 28 25 31[36 65[{}7 50.000000 /Times-Bold rf /Fq 2 104 df<0000F80003C0000F00001E00003C0000 
780000780000780000780000780000780000780000780000780000780000780000780000780000 780000780000780000780000780000780000780000F00000F00001E000078000FE0000FE000007 800001E00000F00000F00000780000780000780000780000780000780000780000780000780000 7800007800007800007800007800007800007800007800007800007800007800003C00001E0000 0F000003C00000F8153C7CAC1E>102 D<F800000F000003C00001E00000F00000780000780000 780000780000780000780000780000780000780000780000780000780000780000780000780000 7800007800007800007800007800003C00003C00001E000007000001F80001F8000700001E0000 3C00003C0000780000780000780000780000780000780000780000780000780000780000780000 780000780000780000780000780000780000780000780000780000F00001E00003C0000F0000F8 0000153C7CAC1E>I E /Fr 134[29 2[29 29 16 23 19 1[29 29 29 45 16 29 1[16 29 29 19 26 29 26 29 26 11[42 36 32 5[52 7[36 42 39 1[42 54 5[16 4[29 29 2[29 2[15 19 15 44[{}37 58.333336 /Times-Roman rf /Fs 134[42 3[46 28 32 37 1[46 42 46 69 23 2[23 46 42 1[37 46 37 46 42 13[46 2[51 2[78 8[60 60 67[{}23 83.333336 /Times-Bold rf end %%EndProlog %%BeginSetup %%Feature: *Resolution 300dpi TeXDict begin %%EndSetup %%Page: 1 1 1 0 bop 80 177 a Fs(Computation)19 b(and)i(Data)e(Partitioning)g(on)h (Scalable)341 280 y(Shar)o(ed)g(Memory)g(Multipr)o(ocessors)403 451 y Fr(Sudarsan)15 b(T)l(andri)29 b(and)g(T)l(arek)14 b(S.)g(Abdelrahman) 316 526 y(Department)h(of)g(Electrical)f(and)h(Computer)g(Engineering)284 601 y(The)f(University)h(of)g(T)l(oronto,)f(T)l(oronto,)g(Canada,)f(M5S)i (1A4)478 675 y(e-mail:)g Fq(f)p Fr(tandri,tsa)p Fq(g)p Fr(@eecg.toronto.edu) 833 865 y Fp(Abstract)217 945 y Fo(In)g(this)h(paper)f(we)h(identify)f(the)h (factors)f(that)h(af)o(fect)f(the)h(derivation)e(of)i(com-)217 999 y(putation)10 b(and)h(data)g(partitions)g(on)g(scalable)g(shared)g (memory)g(multiprocessors)217 1053 y(\(SSMMs\).)18 b(W)l(e)12 b(show)h(that)f(these)h(factors)f(necessitate)i(an)e(SSMM-conscious)217 1107 y(approach.)17 b(In)10 b(addition)g(to)g(remote)g(memory)f(access,)k 
(which)d(is)h(the)f(sole)h(factor)217 1161 y(on)19 b(distributed)g(memory)f (multiprocessors,)k(cache)d(af)o(\256nity)m(,)i(memory)e(con-)217 1216 y(tention)12 b(and)h(false)g(sharing)f(are)h(important)f(factors)g(that) h(must)g(be)g(considered.)217 1270 y(Experimental)g(evidence)h(is)g (presented)g(to)g(demonstrate)f(the)h(impact)f(of)h(these)217 1324 y(factors)i(on)g(performance)g(using)g(three)h(applications)f(on)h(the)f (KSR1)h(and)f(the)217 1378 y(Hector)c(multiprocessors.)4 1540 y Fn(1)71 b(Intr)o(oduction)4 1667 y Fo(Scalable)12 b(shared)g(memory)f (multiprocessors)g(\(SSMMs\))g(are)h(becoming)f(increasingly)h(popular)f(and) h(a)4 1721 y(viable)e(alternative)f(to)h(distributed)f(memory)g (multiprocessors)h(\(DMMs\).)17 b(The)11 b(Stanford)e(DASH)g([20],)4 1775 y(FLASH)i([14)o(],)h(the)f(KSR1)f([24],)h(T)m(oronto')m(s)f(Hector)h ([26)o(],)h(NUMAchine)f([1)o(],)h(and)f(the)f(Cray)h(T3D)h([23)o(])4 1830 y(are)d(some)g(SSMMs)g(currently)e(in)i(use)g(or)f(under)g(development.) 17 b(Processors)9 b(in)f(a)h(SSMM)g(share)g(a)g(single)4 1884 y(coherent)f(address)g(space.)17 b(However)n(,)9 b(shared)f(memory)g(is)g (physically)g(distributed)g(to)f(allow)h(scalability)l(,)4 1938 y(as)17 b(shown)f(in)g(Figure)f(1.)29 b(This)17 b(distribution)e(of)g (shared)i(memory)e(results)h(in)g(non-uniform)e(memory)4 1992 y(access)f(latencies,)g(depending)f(on)f(the)h(distance)h(between)f(a)g (processor)f(and)h(memory)m(.)17 b(Consequently)m(,)4 2046 y(careful)12 b(placement)g(and)g(management)g(of)g(data)h(is)g(essential)g (for)e(scaling)i(performance.)77 2122 y(W)l(e)i(believe)f(that)g(data)g (distribution)732 2104 y Fm(1)764 2122 y Fo(is)g(a)h(good)f(paradigm)f(for)h (managing)f(data)i(in)f(data-parallel)4 2176 y(applications)h(on)g(SSMMs)g ([3)o(,)h(21].)25 b(The)16 b(division)e(of)h(array)f(data)h(allows)g(a)g (compiler)f(to)h(place)g(data)4 2230 y(in)g(the)g(physical)f(memory)g(of)h (the)g(processor)f(that)h(uses)h(it)e(the)h(most,)h(and)f(also)g(allows)g (the)g(compiler)4 
to partition the computations of parallel loops. We have experimented with programmer-specified data distributions on the Hector multiprocessor and have found them to be effective in improving performance. However, the task of selecting a good data distribution requires the programmer to understand both the parallel machine architecture and the data access patterns in the program. Porting programs to various machines and tuning them for performance becomes a tedious and laborious process. Consequently, it is desirable to derive data and computation partitions automatically using a compiler. The objective of this paper is to describe the factors that affect the derivation of computation and data partitions on SSMMs.

On DMMs, the main factor that affects the performance of an application is the cost of interprocessor communication. Consequently, scalable performance can be achieved by partitioning data and computations in a way that minimizes interprocessor communications. On SSMMs, processors communicate through shared memory, and the cost of interprocessor communications (i.e., remote memory access) is relatively low. We show that cache affinity, memory contention and false sharing are additional factors that must be considered in the selection of data distributions. Furthermore, the presence of a single shared address space allows flexibility in the selection of a computation partition. Specifically, we show that relaxing the commonly used owner-computes rule [15] has performance advantages. We present experimental results to support our conclusions using three applications on two SSMMs, the Hector and the KSR1 multiprocessors.

Footnote 1: In this paper we use the terms data distributions and data partitions interchangeably.

[Figure 1: Scalable shared-memory multiprocessor architecture. Diagram labels: Procr, Cache, Mem, Local memory, Remote memory, Interconnection Network.]

The remainder of this paper is organized as follows. Section 2 presents an overview of data distributions. Section 3 describes the factors that affect the selection of computation and data partitions on SSMMs. Section 4 gives experimental evidence of the impact of cache affinity and false sharing on the choice of data partitions. Section 5 presents results to show that the flexibility in selecting the computation partitioning can be used to improve performance. Section
6 reviews related work. Finally, Section 7 presents concluding remarks and directions for future work.

2  Data Distributions

Data distribution [15, 16] is achieved by specifying a partitioning scheme for each array in the program and by specifying a processor geometry to which array partitions map. A processor geometry is an $n$-dimensional Cartesian grid of virtual processors $(V_0, V_1, \cdots, V_{n-1})$, where $V_i$ is the number of processors in the $i^{th}$ dimension of the grid, and $V_0 \times V_1 \times \cdots \times V_{n-1} = P$, the total number of processors. A partitioning scheme assigns a partitioning attribute to each dimension of the array. There are four partitioning attributes. The Block attribute divides the corresponding dimension of the array into approximately equal-size blocks such that a processor owns a contiguous range of that dimension of the array. The Cyclic attribute divides the corresponding array dimension by distributing the array elements in this dimension to processors in a round-robin fashion. The BlockCyclic attribute first groups array elements in the corresponding dimension in
contiguous blocks of a given size, and then assigns the blocks to processors in a round-robin fashion. The block size, called the block-cyclic factor, is supplied by the programmer. Finally, the * attribute is used to indicate that the corresponding dimension of the array is not distributed. The processor geometry on which the array is mapped determines the number of processors assigned to each distributed dimension of the array. For example, distributing a two dimensional array using the (Block,Block) attributes onto a two dimensional processor geometry of (2,4), distributes the array on to the 8 processors, assigning 2 processors to the first dimension and 4 processors to the second dimension.

3  Performance Factors

The main factor that affects the performance of a parallel application on a DMM is the relatively high cost of interprocessor communication. For example, the latency for a remote memory access on the CM5 multiprocessor is approximately 2560 processor cycles (footnote 2). This necessitates the selection of computation and data partitions that minimize the cost of communication. In contrast, on SSMMs, processors communicate through shared memory and the cost of remote memory access is relatively small. For example, the cost of a remote read operation on
the KSR1 is approximately 170 processor cycles [24]. Consequently, other factors come into play in the selection of computation and data partitions. In this section we elaborate on these factors and on how they affect performance, and consequently, affect the choice of data and computation partitions.

3.1  Cache Affinity

Caches are used in SSMMs to reduce effective memory access time and reduce contention in the interconnection network. Data is transferred between cache and memory in units of a cache line, typically a multiple of the processor word size. Spatial reuse occurs when other words on the same line are used by the processor before the line is flushed from the cache. Analogously, temporal reuse occurs when data on a cache line is used again before the line is evicted from the cache. The performance of an application depends to a large extent on the ability of the caches to exploit spatial and temporal reuse. In some cases, this may be difficult because of the limited capacity and associativity of caches. Data brought into a cache by a reference or a prefetch may be evicted before being used or reused, because of either a capacity or a conflict miss caused by a subsequent reference. Cache misses on SSMMs adversely affect performance, since evicted data must
be retrieved from its home memory, which may be remote to the processor. Caches play a less important role in DMMs because cache misses result exclusively in local memory accesses, which are inexpensive relative to interprocessor communications.

3.2  False Sharing

In SSMMs data on the same cache line may be shared by more than one processor, and the line may exist in more than one processor's cache at the same time. Hardware is used to maintain the consistency of the multiple copies of the line, typically using a write-invalidate protocol [24, 14]. True sharing occurs when two or more processors access the same data on a cache line, and it reflects necessary data communications in an application. On the other hand, false sharing occurs when two processors access different pieces of data on the same cache line. If processors write to the same cache line, the cache consistency hardware causes the cache line to be transferred back and forth between processors, leading to a "ping-pong" effect [8]. False sharing causes extensive invalidation traffic and can considerably degrade performance. False sharing is non-existent on DMMs.

3.3  Memory Contention

Memory contention occurs when many processors access data in a single memory module at the same time. Since
the communication protocol in SSMMs is receiver-initiated, and transfers data in units of relatively small cache lines, a large number of requests to the same memory can overflow memory buffers and cause excessive delays in memory response time [13]. Contention has been considered less of a performance bottleneck on DMMs because a sender-initiated communication protocol is employed, and because programmers typically communicate data in large infrequent messages. Applications on DMMs also use collective communications [15] that further reduce contention.

Footnote 2: Calculated based on the elapsed time for a send-reply message of 128 bytes [19].

3.4  Overhead of Parallelism

On DMMs, synchronization is achieved through data communication. However, on SSMMs, synchronization is explicit and is independent of data communication. The resulting overhead can become a performance bottleneck [27], and must be minimized. The performance of an application is also affected by the overhead involved in parallelizing loops, manifested in the form of computation partitioning tests [25]. These tests can be eliminated in some cases by compiler analysis, but when that is not possible, they can degrade performance. This overhead, though also present in the case of DMMs, is not considered significant because of the predominantly
high cost of remote memory access.

4  Impact on Data Distribution

In this section we use two applications, Multigrid and Tred2, to illustrate the impact of cache affinity and false sharing on the choice of a data distribution. The KSR1 system is used because of its large cache size, and because of the presence of monitoring hardware that enables the measurement of the number of non-local memory accesses and the number of cache misses for a processor.

The KSR1 is a Cache-Only Memory Architecture (COMA) configured as a hierarchy of slotted rings with processing cells on the leaf-level rings. The local portion of shared memory associated with a processor is organized as a cache. There is no home location for data; rather, data may exist in more than one local memory. The hardware maintains the consistency of possible multiple copies of the data.

The KSR1 implicitly implements the owner-computes rule, since data written by a processor must exclusively reside in the processor's local portion of the shared memory. Hardware automatically migrates data to the processor that requests the data in units of subpages. Hence, the computation partitioning of a loop dictates the residence of a data item and hence the distribution of the arrays in the loop. Data which is read by the processors may exist
in multiple local memories, and read requests to this data from different processors may be satisfied from different portions of the shared memory.

4.1  Cache-Conscious Data Distribution

The Multigrid application from the NAS suite of benchmarks illustrates how data distributions must be cache-conscious. Multigrid is a three dimensional solver calculating the potential field on a cubical grid. We focus on the subroutine psinv which uses two 3-dimensional arrays $U$ and $R$. The subroutine mainly performs the following computation inside a triply nested loop: $U(i, j, k) \mathrel{+}= \alpha(R(f(i), g(j), h(k)))$, where $f(i)$ is $i - 1$, $i$ or $i + 1$, and the functions $g$ and $h$ are defined analogously. The loop nest is fully parallel. The application has nearest neighbor communications along all three dimensions, which is typical of many scientific applications.

In this application, we choose not to parallelize the innermost loop to avoid cache line false sharing and cache interference; successive iterations of this loop access successive elements on the same cache line. Hence we use a two dimensional grid for the processor geometry. Since the application has
nearest neighbor communications, Block distribution performs the best. The restriction of the innermost loop to be sequential requires the arrays to be distributed with (*,Block,Block) since the arrays are assumed to be stored using column major ordering. With 16 processors, it is possible to choose one of the (16,1), (8,2), (4,4), (2,8) and (1,16) processor geometries. The choice of the processor geometry affects the number of processors that execute each parallel loop. For example, a processor geometry of (8,2) implies 8 processors assigned to the inner parallel loop and 2 processors assigned to the outer parallel loop.

[Figure 2: Normalized Execution time of Multigrid. Vertical axis: Normalized Execution Time (w.r.t. (16,1)). Horizontal axis: Processor Geometry - 16 Processors, from (16,1) to (1,16). Curves: 160x160x160, 144x144x144, 64x64x64.]

Figure 2 shows the execution time of the application for various processor geometries with the (*,Block,Block) distribution for the arrays on the KSR1 with 16 processors, normalized with respect to the (16,1) processor geometry. For a small data size (64x64x64), execution time is minimized by a distribution with equal number of processors in each dimension, i.e., (4,4). This is the same distribution scheme suggested in the Syracuse High Performance Fortran applications suite (footnote 3) for DMMs. However, when the data size is large, the processor geometry (4,4) no longer performs the best. The execution time is minimized with a processor geometry of (8,2).
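As an illustration only (it is not part of the original text), the candidate geometries and the per-processor blocks they induce can be enumerated with a short Python sketch. The names geometries, block_bounds and local_shape are hypothetical. The sketch assumes that the first geometry entry partitions the second array dimension and the second entry partitions the third, with the first dimension left undistributed as in (*,Block,Block); the text does not spell out this mapping.

```python
def geometries(num_procs):
    """List every 2-D geometry (p_inner, p_outer) with p_inner * p_outer == num_procs."""
    return [(p, num_procs // p) for p in range(num_procs, 0, -1) if num_procs % p == 0]


def block_bounds(extent, parts, index):
    """Half-open range [lo, hi) owned by block `index` when `extent`
    elements are cut into `parts` contiguous blocks (Block distribution)."""
    size = -(-extent // parts)  # ceiling division
    lo = min(index * size, extent)
    hi = min(lo + size, extent)
    return lo, hi


def local_shape(n, geometry):
    """Shape of the piece of an n x n x n array held by processor (0, 0).

    Assumption: dimension 1 is not distributed, dimension 2 is split over
    p_inner processors and dimension 3 over p_outer processors.
    """
    p_inner, p_outer = geometry
    lo2, hi2 = block_bounds(n, p_inner, 0)
    lo3, hi3 = block_bounds(n, p_outer, 0)
    return (n, hi2 - lo2, hi3 - lo3)


if __name__ == "__main__":
    for g in geometries(16):
        print(g, local_shape(64, g))
```

For a 64x64x64 array, local_shape returns (64, 8, 32) under the (8,2) geometry and (64, 16, 16) under (4,4). The number of elements per processor is the same in both cases, so any difference between geometries comes from the shape of the block, not its size.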
The impact of processor geometry on performance is due to cache affinity, as can be deduced from Figures 3 and 4. Figure 3 shows the measured number of cache lines accessed from remote memory modules, normalized with respect to the processor geometry (16,1). The number of remote memory accesses is minimal when the processor geometry is (4,4) for all data sizes. Figure 4 shows the average measured number of cache misses from a processor cache, again normalized with respect to the processor geometry (16,1). When the data size is small (64x64x64), the data used by a processor fits into the 256k processor cache, and the misses from the cache in this case reflect remote memory accesses that occur in the parallel program. Hence, the predominant factor affecting performance is interprocessor communication, and the best performance is attained using the (4,4) geometry.

However, when the arrays are relatively large (144x144x144), the cache capacity is no longer sufficient to hold data from successive iterations of the outer parallel loop, and the number of cache misses increases. When the number of processors assigned to the outer loop increases, the number of misses from the cache also increases. The (4,4) processor geometry minimizes the amount of remote memory access, but the (16,1) processor geometry minimizes the number of cache misses. The distribution with the (8,2) processor geometry strikes a balance between the cost of remote memory access and the cost of cache misses, resulting in the best overall performance, in spite of higher interprocessor communication cost.

4.2 False Sharing Conscious Data Distribution

The programs Tred2 (which is part of Eispack), mdg, and trfd (which are both part of the Perfect Club Benchmark Suite) exhibit parallelism that results in considerable false sharing.

Footnote 3: http://www.npac.syr.edu/hpfa/

[Figure 3. Remote Memory Access. Vertical axis: Normalized subpage misses (w.r.t. (16,1)). Horizontal axis: Processor Geometry - 16 Processors, from (16,1) to (1,16). Curves: 160x160x160, 144x144x144, 64x64x64.]
[Figure 4. Cache Misses. Vertical axis: Normalized cache misses (w.r.t. (16,1)). Horizontal axis: Processor Geometry - 16 Processors, from (16,1) to (1,16). Curves: 160x160x160, 144x144x144, 64x64x64.]

[Figure 5. Effect of False Sharing. Vertical axis: Execution Time (Micro Seconds). Horizontal axis: Number of Processors. Curves: Cyclic, BlockCyclic.]

[Figure 6. Cache Misses. Vertical axis: Cache Misses. Horizontal axis: Number of Processors. Curves: Cyclic, BlockCyclic.]

These programs have triangular iteration spaces which necessitate cyclical distribution for load balancing. The choice of this distribution, combined with the storage order of the arrays, causes more than one processor to share the same cache line, leading to false sharing. The impact of this false sharing is shown in Figure 5 for the Tred2 application on the KSR1 multiprocessor. The figure shows the execution time of the application for Cyclic and BlockCyclic distributions using 1 to 16 processors. The use of the Cyclic distribution results in a large number of cache misses, as can be seen in Figure 6. The resulting overhead causes execution time to increase as the number of processors increases. When the arrays are instead distributed using a BlockCyclic distribution, where the size of the block is equal to the size of the cache line, false sharing is effectively eliminated. When the number of processors is small, the load is relatively well-balanced, and the elimination of false sharing improves performance. However, as the number of processors increases, the load becomes increasingly imbalanced, and the negative impact of this load imbalance begins to outweigh the benefits of eliminating false sharing. A compiler for SSMM must consider this tradeoff between load imbalance and false sharing when determining data distributions.

5 Impact on Computation Partitioning

The owner-computes rule has been the computation partitioner of choice for compiling HPF-type languages on DMMs [16]. The owner-computes rule maps a statement such that the computation is executed on the processor on which the data element that is written is local. All the data elements that are required to compute the result (which may be remote) are communicated to the processor. A strict rule such as owner-computes is not necessary on an SSMM because message passing code is not generated at compile time [3]. In some
eop %%Page: 7 7 7 6 bop 482 311 a @beginspecial 127 @llx 520 @lly 393 @urx 632 @ury 2160 @rwi @setspecial %%BeginDocument: adi.idraw /arrowhead { 0 begin transform originalCTM itransform /taily exch def /tailx exch def transform originalCTM itransform /tipy exch def /tipx exch def /dy tipy taily sub def /dx tipx tailx sub def /angle dx 0 ne dy 0 ne or { dy dx atan } { 90 } ifelse def gsave originalCTM setmatrix tipx tipy translate angle rotate newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg 
arrowWidth 2 div neg lineto patternNone not { originalCTM setmatrix /padtip arrowHeight 2 exp 0.25 arrowWidth 2 exp mul add sqrt brushWidth mul arrowWidth div def /padtail brushWidth 2 div def tipx tipy translate angle rotate padtip 0 translate arrowHeight padtip add padtail add arrowHeight div dup scale arrowheadpath ifill } if brushNone not { originalCTM setmatrix tipx tipy translate angle rotate arrowheadpath istroke } if grestore end } dup 0 9 dict put def /arrowheadpath { newpath arrowHeight neg arrowWidth 2 div moveto 0 0 lineto arrowHeight neg arrowWidth 2 div neg lineto } def /leftarrow { 0 begin y exch get /taily exch def x exch get /tailx exch def y exch get /tipy exch def x exch get /tipx exch def brushLeftArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def /rightarrow { 0 begin y exch get /tipy exch def x exch get /tipx exch def y exch get /taily exch def x exch get /tailx exch def brushRightArrow { tipx tipy tailx taily arrowhead } if end } dup 0 4 dict put def /arrowHeight 10 def /arrowWidth 5 def /IdrawDict 51 dict def IdrawDict begin /reencodeISO { dup dup findfont dup length dict begin { 1 index /FID ne { def }{ pop pop } ifelse } forall /Encoding ISOLatin1Encoding def currentdict end definefont } def /ISOLatin1Encoding [ /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef /space/exclam/quotedbl/numbersign/dollar/percent/ampersand/quoteright /parenleft/parenright/asterisk/plus/comma/minus/period/slash /zero/one/two/three/four/five/six/seven/eight/nine/colon/semicolon /less/equal/greater/question/at/A/B/C/D/E/F/G/H/I/J/K/L/M/N /O/P/Q/R/S/T/U/V/W/X/Y/Z/bracketleft/backslash/bracketright /asciicircum/underscore/quoteleft/a/b/c/d/e/f/g/h/i/j/k/l/m /n/o/p/q/r/s/t/u/v/w/x/y/z/braceleft/bar/braceright/asciitilde 
[Figure 7 (diagram): two panels, "(a) Row Block Distribution." and "(b) Our Proposed Distribution.", each showing how array blocks are assigned to processors P0, P1, P2 and P3, with arrows marking the sweep directions of "Phase 1" and "Phase2".]

Figure 7: Data Distribution used to alleviate Memory Contention.

[Figure 8 (plot): x-axis "Number of Processors" (0 to 16); y-axis "Execution Time (milli sec)", logarithmic scale from 1000 to 100000. Curves: Owner Computes(Block, Block); Sequential; No Distributions; (*,Cyclic); (*,Block); Owner Computes(*,Cyclic); Owner Computes(*,Block); (Block,Block).]

Figure 8: ADI Performance (256x256).

cases adhering to the owner-computes rule can incur severe synchronization or ownership test overhead which exceeds the cost of accessing remote memory. We use Alternating Direction Implicit integration (ADI) to illustrate that the shared address space provides flexibility in the choice of computation partitions, reducing contention and synchronization overhead, and resulting in significant performance improvements.

We use Hector, a Non-Uniform Memory Access multiprocessor, as an experimental platform. In Hector, 4 processor-memory pairs are connected by a bus to form a station; 4 stations are connected by a local ring to form a cluster; 4 local rings are connected by a global ring. We use a system with one cluster. Each processor-memory pair consists of a Motorola MC88100 CPU, a 16 KB instruction cache, a 16 KB data cache and 4 MB of the globally addressable memory. The hardware provides no support for cache coherence. The coherence of data is maintained by software at cache line granularity [10]. Data distributions are implemented using the array allocation techniques described in [21, 3].

5.1 Contention and Synchronization Conscious Distribution

The ADI program has two phases with parallelism along orthogonal dimensions in each phase. It operates on three 2-dimensional arrays A, B and X. A single iteration of an outer sequentially iterated loop consists of a forward and a backward sweep phase along the rows of the three arrays, followed by another forward and backward sweep phase along the columns of the arrays [18]. This application is typical of other programs such as 2D-FFT and Erlebacher that have parallelism in orthogonal directions in different phases of the program.

Table 1: Performance Bottlenecks for various data and computation partitioning for ADI.

Data Distribution    Compute Rule      Performance Bottleneck
(none)               Relaxed           Memory Contention
(*, Block)           Owner-Computes    High Synchronization
(*, Block)           Relaxed           Memory Contention
(*, Cyclic)          Owner-Computes    High Synchronization
(*, Cyclic)          Relaxed           Memory Contention
(Block, Block)       Owner-Computes    Ownership tests
(Block, Block)       Relaxed           High Remote Memory Access

The best data distribution scheme for ADI remains an issue of debate [18, 4]. The two proposed schemes partition arrays along a single dimension, either in blocks or cyclically. These distributions, in conjunction with the owner-computes rule, result in a wavefront type computation, leading to heavy synchronization overhead in one of the phases of the program. Figure 7(a) shows a Block distribution of the rows of the arrays. With such a distribution, during the first phase of the program all the processors access data that is local and require no communication. During the second phase, however, the parallelism is orthogonal to the direction of distribution. Strict adherence to the owner-computes rule implies ordering of the computations by processors on the corresponding chunk of the columns they own. Thus, processor i has to wait for processor i - 1 to finish the computation on its chunk of the column before proceeding. A large number of synchronizations are required to maintain the ordering involved in the wavefront computation.
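The wavefront ordering described above can be illustrated with a small model. The following Python sketch is not taken from the paper: the function names, the unit cost per chunk, and the assumption that each processor performs one point-to-point wait per column are hypothetical simplifications chosen only to show how the ordering constraint produces both a pipeline delay and many synchronizations.

```python
def wavefront_schedule(num_procs, num_cols):
    """Model of a column sweep over row-block-distributed data.

    Returns finish[i][j], the time step at which processor i finishes its
    chunk of column j. Assumptions (not from the paper): every chunk costs
    one time step, and processor i may start its chunk of column j only
    after processor i-1 has finished the same column and after processor i
    itself has finished column j-1.
    """
    finish = [[0] * num_cols for _ in range(num_procs)]
    for i in range(num_procs):
        for j in range(num_cols):
            upstream = finish[i - 1][j] if i > 0 else 0   # wait for processor i-1
            previous = finish[i][j - 1] if j > 0 else 0   # own previous column
            finish[i][j] = max(upstream, previous) + 1
    return finish


def count_waits(num_procs, num_cols):
    """One wait per column for every processor except the first (assumed)."""
    return (num_procs - 1) * num_cols


# Example: 4 processors sweeping the 256 columns of a 256x256 array.
schedule = wavefront_schedule(4, 256)
print(schedule[-1][-1])      # 259 time steps for one sweep direction
print(count_waits(4, 256))   # 768 point-to-point waits
```

Under these assumptions the pipeline fill adds only a few time steps, but the number of waits grows with both the number of processors and the number of columns, which is the overhead that the wavefront ordering imposes.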
The synchronization overhead can be eliminated by relaxing the owner-computes rule in the second phase and allowing each processor to write its results to remote memory modules. This eliminates synchronization overhead at the expense of increased remote memory accesses. However, the use of this relaxed compute rule with the (*,Block) distribution results in heavy contention. Each processor is responsible for computing a column, and hence, each processor accesses every memory module in sequence. Thus, a given memory module is accessed by every processor at the same time, leading to contention.

The data distribution scheme depicted in Figure 7(b) (see footnote 4) eliminates contention and results in the best possible performance with the relaxed compute rule. With this distribution, processors access data from remote memory modules in both phases of the program. In both phases, processors start working on the columns assigned to them by accessing data that is in different memory modules, thus avoiding contention. There is no wavefront type parallelism, and hence no overhead due to synchronization.

The use of the owner-computes rule with the distribution of Figure 7(b) will not result in good performance. Either ownership tests must be introduced in the body of the loops to enforce the owner-computes rule, or the loops must be rewritten with additional strip-mined controlling loops to schedule the computations on sub-blocks of the array. The former leads to overhead and the latter introduces synchronization similar to the wavefront computation.

The result of executing the ADI application on the Hector multiprocessor for a data size of 256x256 with various data distributions and compute rules is shown in Figure 8. The (Block,Block) data distribution that relaxes the owner-computes rule outperforms all data distribution schemes that adhere to the rule. The figure also indicates that the overhead due to the ownership tests when using the owner-computes rule with a (Block,Block) distribution degrades performance. It is also clear that the use of data distribution improves performance over the use of operating system policies to manage data (the "No Distributions" curve). The performance bottlenecks of various distributions for ADI are summarized in Table 1.

Footnote 4: This is equivalent to !HPF$ PROCESSORS PROCS(N) with !HPF$ DISTRIBUTE B(BLOCK, BLOCK), X(BLOCK, BLOCK) ON PROCS in HPF. In the current HPF specification, this distribution is not valid; the rank of each distributee must equal the rank of the named processor grid [16]. Distributions in which this is not the case introduce additional complexity on DMMs [17]. In contrast, SSMMs provide the flexibility to implement these distributions.

6 Related Work

Several researchers have focused on the problem of deriving data distributions automatically for DMMs. Li and Chen [22], Gupta and Banerjee [12], Zima et al. [9] and Garcia et al. [11] follow the approach of finding the alignment constraints between different dimensions of the arrays and deriving a data distribution that minimizes interprocessor communication. To avoid a heuristic approach, Bixby et al. [7] formulate a 0-1 integer programming problem for deriving data distributions. Their approach relies on the assumption that a good data distribution for the entire program can be found by merging the data distributions of smaller segments of the program. They minimize the interprocessor communication using the "performance estimator" developed by Balasundaram et al. [6]. Anderson [5] presents an algebraic framework for determining data and computation partitions by minimizing communication across processors. Data transformations are then applied so that the processors access contiguous data regions to reduce false sharing. This technique is oblivious to SSMM-specific issues such as contention and cache affinity.

7 Concluding Remarks

Although large SSMMs are built on an architecture with distributed memory, the shared memory paradigm introduces performance issues that are different from those encountered in DMMs. The high cost of interprocessor communication in distributed memory multiprocessors makes the minimization of communication the predominant issue in selecting data distributions and in partitioning computations. On SSMMs, a methodology for selecting data distributions must also consider cache affinity, memory contention and false sharing in addition to the cost of interprocessor communication. Furthermore, the single shared address space present in SSMMs provides flexibility in the selection of computation partitions. This should be exploited in applications in which owner-computes results in poor performance. The Jasmine compiler project [2] is investigating the issues discussed in this paper through the development of a framework for automatically deriving data distributions on SSMMs.

References

[1] T.S. Abdelrahman et al. An overview of the NUMAchine multiprocessor project. In Proc. of the Canadian Supercomputing Conf., pages 283–295, 1994.

[2] T.S. Abdelrahman, N. Manjikian, and S. Tandri. The Jasmine Compiler. In preparation.

[3] T.S. Abdelrahman and T.N. Wong. Distributed array data management on NUMA multiprocessors. In Proc. of SHPCC, pages 551–559, 1994.

[4] S.P. Amarasinghe, J.M. Anderson, M.S. Lam, and A.W. Lim. An overview of a compiler for scalable parallel machines. In Languages and Compilers for Parallel Computing, pages 253–272. Springer-Verlag LNCS-768, 1993.

[5] J.M. Anderson. Demonstration of automatic data and computation decomposition techniques. In Proc. of the Workshop on Automatic Data Layout and Performance Prediction, 1995.

[6] V. Balasundaram, G. Fox, K. Kennedy, and U. Kremer. A static performance estimator to guide data partitioning decisions. In Proc. of PPOPP, pages 213–223, 1991.

[7] R. Bixby, K. Kennedy, and U. Kremer. Automatic data layout using 0-1 integer programming. In Proc. of the Int'l Conf. on Parallel Architectures and Compilation Techniques, pages 111–122, 1994.
4 337 y([10])24 b(B.)11 b(Gamsa.)k(Region-oriented)9 b(main)h(memory)f (management)h(in)g(shared-memory)f(NUMA)i(mul-)112 384 y(tiprocessors.)19 b(Master)r(')m(s)13 b(thesis,)h(Department)d(of)i(Computer)f(Science,)h (University)f(of)g(T)m(oronto,)112 431 y(T)m(oronto,)f(CANADA,)i(1992.)4 519 y([11])24 b(J.)f(Garcia,)h(E.)f(A)-5 b(yguade,)26 b(and)c(J.)h(Labarta.) 44 b(A)22 b(novel)g(approach)g(towards)g(automatic)g(data)112 566 y(distribution.)33 b(In)18 b Fe(Pr)n(oc.)i(of)e(the)h(W)-5 b(orkshop)20 b(on)e(Automatic)g(Data)g(Layout)g(and)h(Performance)112 613 y(Pr)n(ediction)p Fo(,)13 b(1995.)4 701 y([12])24 b(M.)16 b(Gupta)f(and)h(P)-6 b(.)17 b(Banerjee.)27 b(Automatic)15 b(data)g (partitioning)g(on)g(distributed)g(memory)g(multi-)112 748 y(processors.)j Fe(IEEE)c(T)m(rans.)f(on)f(Parallel)h(and)f(Distributed)h (Systems)p Fo(,)g(3\(2\):179\261193,)e(1992.)4 836 y([13])24 b(K.)15 b(Harzallah)g(and)g(K.C.)h(Sevcik.)25 b(Hot)15 b(spot)g(analysis)g (in)g(lar)o(ge)g(scale)h(shared)f(memory)f(multi-)112 883 y(processors.)k(In) 12 b Fe(Pr)n(oc.)i(of)e(Super)n(computing'93)p Fo(,)g(pages)h(895\261905.)f (ACM,)i(1993.)4 971 y([14])24 b(M.)11 b(Heinrich)f(et)h(al.)16 b(The)11 b(Stanford)f(FLASH)g(Multiprocessor.)16 b(In)10 b Fe(Pr)n(oc.)i(of)f(the)g(21st)g(Int'l)e(Symp.)112 1018 y(on)j(Computer)g(Ar)n (chitectur)n(e)p Fo(,)j(pages)e(302\261313,)f(1994.)4 1106 y([15])24 b(S.)15 b(Hiranandani,)i(K.)f(Kennedy)m(,)g(and)g(C.)g(T)m(seng.)27 b(Compiler)15 b(optimizations)g(for)g(Fortran)f(D)i(on)112 1153 y(MIMD)f(distributed-memory)e(machines.)25 b(In)15 b Fe(Pr)n(oc.)h(of)f (Super)n(computing'91)p Fo(,)g(pages)h(86\261100,)112 1200 y(Albuquerque,)c(NM,)h(1991.)4 1288 y([16])24 b(HPF)l(.)33 b(High)17 b(Performance)g(Fortran)g(Language)i(Speci\256cation)e(\(High)g (Performance)g(Fortran)112 1335 y(Forum\).)f(T)m(echnical)d(report)e (CRPC-TR92225,)i(Rice)g(University)m(,)f(1994.)4 1424 y([17])24 b(C.)13 b(Koelbel.)18 b(HPF)12 b(constraints.)18 b(Personal)12 
b(Communications,)g(1995.)4 1512 y([18])24 b(U.)14 b(Kremer)m(.)23 b(Automatic)14 b(data)g(layout)g(for)g(distributed-memory)e(multiprocessors.) 23 b(T)m(echnical)112 1559 y(report)11 b(CRPC-TR93229-S,)h(Center)h(for)e (Research)i(on)f(Parallel)g(Computation,)g(1993.)4 1647 y([19])24 b(T)l(.T)l(.)15 b(Kwan,)f(B.K.)h(T)m(otty)m(,)e(and)g(D.A.)h(Reed.)21 b(Communication)12 b(and)i(computation)e(performance)112 1694 y(of)f(the)i(CM5.)19 b(In)12 b Fe(Pr)n(oc.)h(of)g(Super)n(computing'93)p Fo(,)f(pages)h(192\261201.)f(ACM,)h(1993.)4 1782 y([20])24 b(D.)15 b(Lenoski)h(et)f(al.)26 b(The)15 b(Stanford)f(DASH)h(multiprocessor)m (.)25 b Fe(IEEE)16 b(Computer)p Fo(,)h(25\(3\):63\26179,)112 1829 y(1992.)4 1917 y([21])24 b(H.)12 b(Li)g(and)g(K.C.)h(Sevcik.)k (Numacros:)g(Data)12 b(parallel)f(programming)f(on)i(NUMA)g(multiproces-)112 1964 y(sors.)h(In)c Fe(Pr)n(oc.)i(of)e(4th)g(Symp.)h(on)g(Experiences)g(with) g(Distributed)f(and)g(Multipr)n(ocessor)j(Systems)p Fo(,)112 2011 y(pages)g(247\261263,)h(1993.)4 2099 y([22])24 b(J.)11 b(Li)h(and)f(M.)h(Chen.)k(Compiling)10 b(communication-ef)o(\256cient)f (programs)i(for)f(massively)h(parallel)112 2146 y(machines.)18 b Fe(Journal)12 b(of)h(Parallel)f(and)h(Distributed)f(Computing)p Fo(,)g(2\(3\):361\261376,)f(1991.)4 2234 y([23])24 b(Cray)14 b(Research.)25 b(The)16 b(Cray)e(Research)h(Massively)h(Parallel)e(Processor) g(System)h(-)f(Cray)h(T3D.)112 2281 y(T)m(echnical)d(report)f(80922,)i (Munchen,)g(Germany)m(,)f(1993.)4 2369 y([24])24 b(Kendall)12 b(Square)f(Research.)19 b Fe(KSR1)13 b(Principles)h(of)e(Operation)p Fo(.)18 b(W)l(altham,)13 b(MA,)g(1991.)4 2457 y([25])24 b(J.)12 b(T)m(orres,)g(E.)h(A)-5 b(yguade,)13 b(J.)g(Labarta,)f(and)g(M.)h(V)-6 b(alero.)17 b(Align)12 b(and)g(distribute-based)f(linear)g(loop)112 2504 y(transformations.)16 b(In)11 b Fe(Languages)g(and)h(Compilers)h(for)f (Parallel)g(Computing)p Fo(,)g(pages)g(321\261339.)112 2551 y(Springer)o(-V)-6 b(erlag)10 b(LNCS-768,)j(1993.)4 2639 y([26])24 b(Z.)17 
b(V)m(ranesic,)h(M.)f(Stumm,)f(R.)i(White,)f(and)f(D.)h(Lewis.)30 b(The)16 b(Hector)g(Multiprocessor.)29 b Fe(IEEE)112 2686 y(Computer)p Fo(,)13 b(24\(1\):72\26180,)e(1991.)4 2774 y([27])24 b(R.W)-5 b(.)12 b(W)n(isniewski,)g(L.I.)g(Kontothanassis,)h(and)e(M.L.)h(Scott.)k (High)11 b(performance)e(synchroniza-)112 2821 y(tion)i(algorithms)h(for)g (multiprogrammed)e(multiprocessors.)18 b(In)12 b Fe(Pr)n(oc.)h(of)g(PPOPP)p Fo(,)h(1995.)p eop %%Trailer end userdict /end-hook known{end-hook}if %%EOF |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Unrau_PhD.ps.Z version [32bddefae3].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Unrau_etal_EuroPar95.ps.Z version [9505eb5632].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Unrau_etal_JSC94.ps.Z version [a726b2c85a].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Unrau_etal_OSDI94.ps.Z version [83f3c82777].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Vranesic_etal_IEEEC.ps.Z version [1d9c37a6ec].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Wilton_Vranesic_SPDP.ps.Z version [dc31e62471].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Wu_MASc.ps.Z version [528d8fad81].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/Zhou_Brecht_SM91.ps.Z version [3646cac530].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/depth-guide.ps.Z version [2e30273645].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/headerize version [882320c08b].
#!/bin/sh
#
# to run (if called headerize): headerize <text> < in_file > out_file
# e.g.: if heading is: "a heading", source file is sfile.ps, destination
# file is dfile.ps, then use:
#
#     headerize a heading < sfile.ps > dfile.ps
# or
#     headerize "a heading" < sfile.ps > dfile.ps
#
gawk -v MYHEADING="$*" '
BEGIN{
    begin=1
}
begin==1 && $1 !~ /^%.*/ {
    begin=0
    printf "save\n"
    printf "gsave\n"
    printf "/Times-Italic findfont 9 scalefont setfont\n"
    printf "72 750 moveto (%s) show\n", MYHEADING
    printf "grestore\n"
    printf "restore\n"
}
{ print }
'
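For illustration, the effect of the script can be reproduced on a tiny hand-written PostScript file. This is a minimal sketch, assuming a POSIX `awk` can stand in for `gawk`; the function name `headerize_sketch` and the three-line sample document are hypothetical and not part of the archive.

```shell
#!/bin/sh
# Sketch of the headerize behaviour, using POSIX awk in place of gawk
# (an assumption; the archived script calls gawk).
# headerize_sketch is a hypothetical name, not part of the archive.
headerize_sketch() {
  awk -v MYHEADING="$*" '
    BEGIN { begin = 1 }
    # First line whose first field does not start with "%":
    # the leading comment block has ended, so emit the heading once.
    begin == 1 && $1 !~ /^%/ {
      begin = 0
      print "save"
      print "gsave"
      print "/Times-Italic findfont 9 scalefont setfont"
      printf "72 750 moveto (%s) show\n", MYHEADING
      print "grestore"
      print "restore"
    }
    { print }   # every input line is passed through unchanged
  '
}

# Hypothetical three-line PostScript document used only for the demo.
sample_ps() {
  printf '%s\n' '%!PS-Adobe-2.0' '%%Title: demo' 'showpage'
}

sample_ps | headerize_sketch a heading
```

The two comment lines come out first, then the six heading lines, then `showpage`. This is why the heading appears on the first page only: it is emitted once, right after the leading comment block.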
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/ldreport.ps.Z version [a667289c44].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/OLD/titlepage.ps.Z version [97f6c1729a].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Okrieg_PhD.ps.Z version [2a337a5117].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Orran_etal_SPDPW95.ps.Z version [6417b8ad38].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Parsons_Sevcik_IPPS95.ps.Z version [10a026c65e].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Parsons_etal_IWOOS95.ps.Z version [65a84877c7].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/README.Z version [2cf48fe915].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Ravi_Stumm_ICPP95.ps.Z version [9e21d5d59f].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Ravi_Stumm_JIEICE96.ps.Z version [4f687d2453].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Sandhu_et_al_PPOPP.ps.Z version [7cc4c5d88e].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Sevcik_JPE.ps.Z version [450977ab1f].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Sevcik_Zhou_PERF93.ps.Z version [5ff752a92b].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Stumm_Unrau_Krieger_USENIX92.ps.Z version [e4c619b8e2].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Stumm_Vranesic_White_IPPS93.ps.Z version [8754f4d7f3].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Tandri_Abdel_PDPTA95.ps.Z version [bfeee3e72d].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Unrau_PhD.ps.Z version [16101bfa79].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Unrau_etal_EuroPar95.ps.Z version [e35f3da997].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Unrau_etal_JSC94.ps.Z version [f41221d04e].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Unrau_etal_OSDI94.ps.Z version [1bf1a8082f].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Vranesic_etal_IEEEC.ps.Z version [5a576842b8].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Wilton_Vranesic_SPDP.ps.Z version [1d8313eac2].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Wu_MASc.ps.Z version [077988d1fe].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Zhou_Brecht_SM91.ps.Z version [e3e1148cb8].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/depth-guide.ps.Z version [2e30273645].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/headerize version [882320c08b].
#!/bin/sh
#
# to run (if called headerize): headerize <text> < in_file > out_file
# e.g.: if heading is: "a heading", source file is sfile.ps, destination
# file is dfile.ps, then use:
#
#     headerize a heading < sfile.ps > dfile.ps
# or
#     headerize "a heading" < sfile.ps > dfile.ps
#
gawk -v MYHEADING="$*" '
BEGIN{
    begin=1
}
begin==1 && $1 !~ /^%.*/ {
    begin=0
    printf "save\n"
    printf "gsave\n"
    printf "/Times-Italic findfont 9 scalefont setfont\n"
    printf "72 750 moveto (%s) show\n", MYHEADING
    printf "grestore\n"
    printf "restore\n"
}
{ print }
'
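One caveat may be worth illustrating. The script places the heading text verbatim inside a PostScript string literal `( ... )`, so a heading that contains an unbalanced parenthesis or a backslash would likely yield invalid PostScript. The sketch below shows one possible way to escape such characters first. `ps_escape` and `heading_line` are hypothetical helpers, not part of the archived script.

```shell
#!/bin/sh
# ps_escape: hypothetical helper (not in the archive) that backslash-escapes
# the characters that are special inside a PostScript string literal.
# Backslashes are doubled first so that the escapes added for the
# parentheses are not themselves escaped again.
ps_escape() {
  printf '%s\n' "$*" | sed -e 's/\\/\\\\/g' -e 's/(/\\(/g' -e 's/)/\\)/g'
}

# heading_line: hypothetical helper that builds the same "moveto ... show"
# line as the script, but with the heading text escaped.
heading_line() {
  printf '72 750 moveto (%s) show\n' "$(ps_escape "$*")"
}

heading_line 'Hector (1991 draft'
```

Note that `awk -v` also interprets backslash escapes in the value it is given, so already-escaped text is probably better handed to awk through the environment (`ENVIRON`) than through `-v`.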
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/ldreport.ps version [8895be96a1].
more than 10,000 changes
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/titlepage.ps.Z version [97f6c1729a].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/arch_button.gif version [02eea96c8d].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/comments.gif version [12d694f7ba].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/comp_button.gif version [f9a5424e0a].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/data_button.gif version [511ec86037].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/journal.gif version [ac0b61628e].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/music1.gif version [b184bb7d75].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/newban_t.gif version [83c41abb45].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/os_button.gif version [1a8a14d7d4].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/people_button.gif version [c1ff15f4b0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/perf_button.gif version [df68c2fce4].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/proj_button.gif version [d41126a7f0].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/publ_button.gif version [58b6234da6].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/EECG/RESEARCH/ParallelSys/images/sch_button.gif version [b394c1e472].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/Welcome.html version [1af9f0f1ff].
<!-- RCS $Id: Welcome.html,v 1.2 1994/11/02 16:13:18 caranci Exp caranci -->
<HTML>
<body bgcolor="#ffffff" text="#000000" link="#0000ff" vlink="#aaaaff" alink="#0077FF">
</body>
<FONT SIZE=4>
<HEAD>
<TITLE>Parallel Systems Group: Home page</TITLE>
</HEAD>
<BODY>
<H1><img align="center" src="../EECG/RESEARCH/ParallelSys/images/newban_t.gif"></A>
Parallel Systems Group</H1>
<HR>
The Parallel Systems Group comprises researchers from the
<A HREF="http://www.utoronto.ca/uoft.html">University of Toronto</A>
working in all aspects of parallel systems, including computer architecture,
operating systems, compilers, performance evaluation and applications.
<P>
Previous projects include the
<A HREF="hector.html">Hector</A> shared memory multiprocessor, and the
<A HREF="hurricane.html">Hurricane</A> multiprocessor operating system.
<P>
The group is currently building the
<A HREF="parallel/NUMA.Welcome.html">NUMAchine</A> multiprocessor, the
<A HREF="tornado.html">Tornado</A> operating system, and the
<A HREF="../~tsa/jasmine.html">Jasmine</A> compiler.
<P>
<BR> <BR> <BR>
<center>
<A HREF="publications.html">
<img align="center" src="../EECG/RESEARCH/ParallelSys/images/publ_button.gif"></A>
<P>
<A HREF="people.html">
<img align="center" src="../EECG/RESEARCH/ParallelSys/images/people_button.gif"></A>
<P>
<A HREF="parallel/projects.html">
<img align="center" src="../EECG/RESEARCH/ParallelSys/images/proj_button.gif"></A>
<P>
<BR> <BR> <BR>
<H2>Other Resources</H2>
<H3>University of Toronto Resources</H3>
<UL>
<LI><A HREF="http://www.hprc.utoronto.ca"> University of Toronto High Performance Computing Research Center </A>
<LI><A HREF="http://www.eecg.toronto.edu/EECG/EECGhome.html"> University of Toronto Electrical Engineering Computer Group</A>
<LI><A HREF="http://www.cdf.toronto.edu"> University of Toronto Department of Computer Science</A>
</UL>
<H3>Computing Resources</H3>
<UL>
<LI><A HREF="http://www.ccsf.caltech.edu/other_sites.html"> Supercomputing Web pages</A>
<LI><A HREF="http://www.cs.cmu.edu/afs/cs.cmu.edu/project/scandal/public/www/research-groups.html"> Supercomputing & Parallel Computing Research Groups</A>
<LI><A HREF="http://www.cs.dartmouth.edu/pario.html"> Parallel I/O archive at Dartmouth</A>
</UL>
<EM>
<!-- <HR> These pages will look best if displayed with the
<A HREF="http://home.mcom.com/home/faq_docs/faq_client.html"> Mosaic Netscape</A> web browser.
Check it out!<BR> <HR> -->
This is still a work in progress... Mail suggestions to:<BR>
<A HREF="mailto:kulki@cs.toronto.edu"> kulki </A> or
<A HREF="mailto:okrieg@eecg.toronto.edu"> Orran </A>
</EM>
</BODY>
</HTML>
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/hector-sys-raw.gif version [0359f1f3c2].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/hector.html version [10c005e5a3].
<TITLE>Hector</TITLE>
<H1>Hector</H1>
<P>
Under Construction.
<P>
<A HREF="pubs_abs.html#Stumm_Vranesic_White_IPPS93"> Hector</A>
is a shared memory multiprocessor based on a hierarchy of unidirectional slotted rings.
The main objective was a simple architecture that is size and generation scalable.
The machine was built from scratch with off-the-shelf processors.
Please see <a href="publications.html">publications</a> for details such as performance.
<HR>
<h3> Hector Processor Board </h3>
<IMG ALIGN=LEFT HSPACE=15 SRC="hectorboard.gif">
<BR>
Each board contains:
<ul>
<li> MC88100 cpu
<li> 4 MB of memory
<li> 16 KB of data cache
<li> 16 KB of instruction cache
</ul>
<BR CLEAR=ALL>
<HR>
<h3> Hector System </h3>
<IMG ALIGN=RIGHT SRC="hector-sys-raw.gif">
<BR>
This system contains:
<ul>
<li> 16 MC88100 cpus
<li> 16 x 4 MB memory
<li> ring interconnect
</ul>
<BR CLEAR=ALL>
<HR>
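The per-board and per-system figures on the Hector page imply some simple totals. The arithmetic below is only a sketch: it assumes one processor board per cpu (the page lists a single MC88100 per board and 16 cpus per system), and the variable and function names are hypothetical.

```shell
#!/bin/sh
# Figures quoted on the Hector page.
BOARDS=16            # assumption: one board per cpu, 16 cpus in the system
MEM_MB_PER_BOARD=4   # "4 MB of memory"
DCACHE_KB=16         # "16 KB of data cache"
ICACHE_KB=16         # "16 KB of instruction cache"

# Hypothetical helpers that derive system-wide totals from the figures above.
total_memory_mb() { echo $((BOARDS * MEM_MB_PER_BOARD)); }
total_cache_kb()  { echo $((BOARDS * (DCACHE_KB + ICACHE_KB))); }

echo "total memory: $(total_memory_mb) MB"
echo "total cache:  $(total_cache_kb) KB"
```

Under these assumptions the pictured system has 64 MB of memory and 512 KB of cache in total, spread across the boards rather than held in one place.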
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/hectorboard.gif version [76b1a3a213].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/hurricane.html version [0c2caec185].
<TITLE>Hurricane</TITLE>
<!-- Changed by: Orran Y. Krieger, 2-Oct-1995 -->
<H1>Hurricane</H1>
<P>
Under Construction
<P>
The <A HREF="pubs_abs.html#Unrau_etal_JSC94">Hurricane</A> operating system
is a hierarchically clustered operating system implemented on the Hector multiprocessor.
<P>
Hierarchical clustering manages the system resources in clusters,
using tight coupling within a cluster, and loose coupling across clusters.
Distributed systems principles are applied by distributing and replicating
system services and data objects to increase locality, increase concurrency,
and to avoid centralized bottlenecks, thus making the system scalable.
However, tight coupling is used within a cluster, so the system performs well
for local interactions.
Hierarchical clustering maximizes locality which is key to good performance
in large systems, and systems based on hierarchical clustering can easily be
adapted to different hardware configurations and architectures by changing
the size of the clusters.
Finally, hierarchical clustering leads to a modular system composed from
easy-to-design and hence efficient building blocks.
<P>
All the papers are available from <A HREF="publications.html#os"> here.</A>
</UL>
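The clustering idea described on the Hurricane page can be illustrated with a toy mapping from processors to clusters. This is a hedged sketch, not Hurricane's actual mechanism: the cluster size of 4, the contiguous numbering of cpus, and the names `cluster_of` and `coupling` are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical cluster size; the page only says it can be changed to suit
# the hardware configuration.
CLUSTER_SIZE=4

# cluster_of CPU: index of the cluster that holds the given cpu, assuming
# cpus are numbered from 0 and grouped contiguously.
cluster_of() { echo $(( $1 / CLUSTER_SIZE )); }

# coupling CPU_A CPU_B: "tight" if both cpus share a cluster, else "loose".
coupling() {
  if [ "$(cluster_of "$1")" -eq "$(cluster_of "$2")" ]; then
    echo tight
  else
    echo loose
  fi
}

echo "cpu 1 and cpu 3: $(coupling 1 3)"
echo "cpu 1 and cpu 9: $(coupling 1 9)"
```

With `CLUSTER_SIZE=4`, cpus 1 and 3 share cluster 0 while cpu 9 is in cluster 2. Setting `CLUSTER_SIZE=16` would place all sixteen cpus in one cluster, which is the kind of adaptation by cluster size the text refers to.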
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/images/comments.gif version [12d694f7ba].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/images/homeblue.gif version [a77b950d99].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/images/redline.GIF version [59a7418809].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/NUMA.Welcome.html version [2a20af86f3].
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE> NUMAchine Home Page </TITLE>
<META NAME="GENERATOR" CONTENT="Mozilla/3.0Gold (X11; I; SunOS 5.5 sun4m) [Netscape]">
</HEAD>
<BODY BACKGROUND="images/maple_back.gif">
<CENTER><P><IMG SRC="numahw/NUMAchine-med.gif" > <clear=left><BR>
<BR>
</P></CENTER>
<H2 ALIGN=CENTER>The NUMAchine Multiprocessor Project</H2>
<P>The <I>NUMAchine</I> project at the
<A HREF="http://www.utoronto.ca/uoft.html">University of Toronto</A>
is a major research project aimed at developing a shared-memory multiprocessor
architecture and software support for easy and efficient use of this architecture.
Members of both the
<A HREF="http://www.ece.toronto.edu/">Department of Electrical and Computer Engineering</A>
and the
<A HREF="http://www.cs.toronto.edu">Department of Computer Science</A>
are collaborating on this project.</P>
<P>A key objective is to develop a high-performance architecture that is modular,
cost-effective and scalable. At the present time, a prototype machine is being
designed and built, and the system software is being developed.
Follow the links below for more information.
</P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=0 WIDTH="100%">
<TR ALIGN=LEFT VALIGN=CENTER>
<TD WIDTH="65%">
<P> <IMG SRC="images/computer.gif" ALIGN=CENTER HSPACE=5>
<A HREF="numahw/numahw.html">Hardware description with photographs</A> </P>
<P> <IMG SRC="images/archiv.gif" ALIGN=CENTER HSPACE=5>
<A HREF="numadocs.html">Papers and technical documentation</A> </P>
<P> <IMG SRC="images/disk.gif" HSPACE=5 ALIGN=CENTER> System software: </P>
<UL>
<P> <IMG SRC="images/wh_ball.gif" HSPACE=5 HEIGHT=16 WIDTH=17 ALIGN=BOTTOM>
<A HREF="tornado.html">The Tornado Operating System</A><BR>
<BR>
</P>
<P> <IMG SRC="images/wh_ball.gif" HSPACE=5 HEIGHT=16 WIDTH=17 ALIGN=BOTTOM>
<A HREF="../../~tsa/jasmine.html">The Jasmine Compiler</A> </P>
</UL>
<TD>
<P ALIGN=CENTER>Click on the figures below for the<BR>
NUMAchine architecture and a hardware photo.<BR>
<A HREF="images/NUMAfig.gif"> <IMG ALIGN=LEFT WIDTH=80 HEIGHT=80 SRC="images/NUMAfig.gif"> </A>
<A HREF="numahw/pictures/dbgstn.jpg"> <IMG ALIGN=RIGHT WIDTH=80 HEIGHT=80 SRC="numahw/numa2.gif"> </A>
</P>
</TD>
</TABLE>
<P>
<HR WIDTH="100%"></P>
<P>Major funding from:<BR>
<IMG HSPACE=5 VSPACE=5 SRC="images/NSERC.gif" ALIGN=CENTER>
<A HREF="http://www.nserc.ca">Natural Sciences and Engineering Research Council of Canada (NSERC)</A>
</P>
</BODY>
</HTML>
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/hector-sys-raw.gif version [0359f1f3c2].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/hector.html version [501d4060d2].
<TITLE>Hector</TITLE>
<H1>Hector</H1>
<P>
Under Construction.
<P>
<A HREF="http://www.eecg.toronto.edu/parallel/parallel/pubs_abs.html#Stumm_Vranesic_White_IPPS93"> Hector</A>
is a shared memory multiprocessor based on a hierarchy of unidirectional slotted rings.
The main objective was a simple architecture that is size and generation scalable.
The machine was built from scratch with off-the-shelf processors.
Please see <a href="http://www.eecg.toronto.edu/parallel/parallel/publications.html">publications</a> for details such as performance.
<HR>
<h3> Hector Processor Board </h3>
<IMG ALIGN=LEFT HSPACE=15 SRC="hectorboard.gif">
<BR>
Each board contains:
<ul>
<li> MC88100 cpu
<li> 4 MB of memory
<li> 16 KB of data cache
<li> 16 KB of instruction cache
</ul>
<BR CLEAR=ALL>
<HR>
<h3> Hector System </h3>
<IMG ALIGN=RIGHT SRC="hector-sys-raw.gif">
<BR>
This system contains:
<ul>
<li> 16 MC88100 cpus
<li> 16 x 4 MB memory
<li> ring interconnect
</ul>
<BR CLEAR=ALL>
<HR>
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/hectorboard.gif version [76b1a3a213].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/hurricane.html version [f9112aeaaa].
<TITLE>Hurricane</TITLE>
<!-- Changed by: Orran Y. Krieger, 2-Oct-1995 -->
<H1>Hurricane</H1>
<P>
Under Construction
<P>
The <A HREF="http://www.eecg.toronto.edu/parallel/parallel/pubs_abs.html#Unrau_etal_JSC94">Hurricane</A> operating system
is a hierarchically clustered operating system implemented on the Hector multiprocessor.
<P>
Hierarchical clustering manages the system resources in clusters,
using tight coupling within a cluster, and loose coupling across clusters.
Distributed systems principles are applied by distributing and replicating
system services and data objects to increase locality, increase concurrency,
and to avoid centralized bottlenecks, thus making the system scalable.
However, tight coupling is used within a cluster, so the system performs well
for local interactions.
Hierarchical clustering maximizes locality which is key to good performance
in large systems, and systems based on hierarchical clustering can easily be
adapted to different hardware configurations and architectures by changing
the size of the clusters.
Finally, hierarchical clustering leads to a modular system composed from
easy-to-design and hence efficient building blocks.
<P>
All the papers are available from <A HREF="http://www.eecg.toronto.edu/parallel/parallel/publications.html#os"> here.</A>
</UL>
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/NSERC.gif version [1f87abec52].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/NUMAchine-small.gif version [132567b0e6].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/NUMAfig.gif version [384e3cb6db].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/archiv.gif version [97177c2404].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/computer.gif version [c6c2ee2011].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/disk.gif version [3ae69e93dc].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/maple_back.gif version [058528c833].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/torn-small.gif version [deb0bec641].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/images/wh_ball.gif version [0952c81ba5].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numachine.html version [2a20af86f3].
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> <HTML> <HEAD> <TITLE> NUMAchine Home Page </TITLE> <META NAME="GENERATOR" CONTENT="Mozilla/3.0Gold (X11; I; SunOS 5.5 sun4m) [Netscape]"> </HEAD> <BODY BACKGROUND="images/maple_back.gif"> <CENTER><P><IMG SRC="numahw/NUMAchine-med.gif" > <clear=left><BR> <BR> </P></CENTER> <H2 ALIGN=CENTER>The NUMAchine Multiprocessor Project</H2> <P>The <I>NUMAchine</I> project at the <A HREF="http://www.utoronto.ca/uoft.html">University of Toronto</A> is a major research project aimed at developing a shared-memory multiprocessor architecture and software support for easy and efficient use of this architecture. Members of both the <A HREF="http://www.ece.toronto.edu/">Department of Electrical and Computer Engineering</A> and the <A HREF="http://www.cs.toronto.edu">Department of Computer Science</A> are collaborating on this project.</P> <P>A key objective is to develop a high-performance architecture that is modular, cost-effective and scalable. At the present time, a prototype machine is being designed and built, and the system software is being developed. Follow the links below for more information. 
</P> <TABLE BORDER=0 CELLSPACING=1 CELLPADDING=0 WIDTH="100%"> <TR ALIGN=LEFT VALIGN=CENTER> <TD WIDTH="65%"> <P> <IMG SRC="images/computer.gif" ALIGN=CENTER HSPACE=5> <A HREF="numahw/numahw.html">Hardware description with photographs</A> </P> <P> <IMG SRC="images/archiv.gif" ALIGN=CENTER HSPACE=5> <A HREF="numadocs.html">Papers and technical documentation</A> </P> <P> <IMG SRC="images/disk.gif" HSPACE=5 ALIGN=CENTER> System software: </P> <UL> <P> <IMG SRC="images/wh_ball.gif" HSPACE=5 HEIGHT=16 WIDTH=17 ALIGN=BOTTOM> <A HREF="tornado.html">The Tornado Operating System</A><BR> <BR> </P> <P> <IMG SRC="images/wh_ball.gif" HSPACE=5 HEIGHT=16 WIDTH=17 ALIGN=BOTTOM> <A HREF="../../~tsa/jasmine.html">The Jasmine Compiler</A> </P> </UL> <TD> <P ALIGN=CENTER>Click on the figures below for the<BR> NUMAchine architecture and a hardware photo.<BR> <A HREF="images/NUMAfig.gif"> <IMG ALIGN=LEFT WIDTH=80 HEIGHT=80 SRC="images/NUMAfig.gif"> </A> <A HREF="numahw/pictures/dbgstn.jpg"> <IMG ALIGN=RIGHT WIDTH=80 HEIGHT=80 SRC="numahw/numa2.gif"> </A> </P> </TD> </TABLE> <P> <HR WIDTH="100%"></P> <P>Major funding from:<BR> <IMG HSPACE=5 VSPACE=5 SRC="images/NSERC.gif" ALIGN=CENTER> <A HREF="http://www.nserc.ca">Natural Sciences and Engineering Research Council of Canada (NSERC)</A> </P> </BODY> </HTML> |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numadocs.html version [05e65e3777].
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> <HTML> <HEAD> <TITLE> Documentation on the NUMAchine Multiprocessor </TITLE> <META NAME="GENERATOR" CONTENT="Mozilla/3.0Gold (X11; I; SunOS 5.5 sun4m) [Netscape]"> </HEAD> <BODY BACKGROUND="images/maple_back.gif"> <CENTER><P><IMG SRC="numahw/NUMAchine-med.gif" > <clear=left><BR> <BR> </P></CENTER> <H2 ALIGN=CENTER>Documentation on the NUMAchine Multiprocessor</H2> <P> <HR WIDTH="100%"><BR> <FONT SIZE=+1>Technical Report</FONT></P> <P>We have written a technical report that describes the NUMAchine architecture, outlines important aspects of its cache coherence protocol, and provides simulation results for parallel execution of a number of benchmark programs. </P> <UL> <LI><A HREF="http://www.eecg.toronto.edu/parallel/parallel/numachin.hier/numachin.html">Hierarchical HTML version of NUMAchine Technical Report</A><BR> <BR> </LI> <LI><A HREF="http://www.eecg.toronto.edu/parallel/parallel/numachin.flat/numachin.html">Monolithic HTML version of NUMAchine Technical Report (103 Kbytes)</A><BR> <BR> </LI> <LI><A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/techreport.ps">PostScript version of NUMAchine Technical Report (451 Kbytes)</A><BR> <BR> </LI> <LI><A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/techreport.pdf">PDF version of NUMAchine Technical Report (175 Kbytes)</A><BR> <I><B>Note: some figures do not come out properly in the PDF.<BR> Grab the PostScript version instead for now.</B></I></LI> </UL> <P> <HR WIDTH="100%"><BR> <FONT SIZE=+1>Papers</FONT></P> <UL> <li>R. Grindley, T. Abdelrahman, S. Brown, S. Caranci, D. DeVries, B. Gamsa, A. Grbic, M. Gusat, R. Ho, O. Krieger, G. Lemieux, K. Loveless, N. Manjikian, P. McHardy, S. Srbljic, M. Stumm, Z. Vranesic and Z. 
Zilic, "The NUMAchine Multiprocessor", <i>Proceedings of the 2000 International Conference on Parallel Processing</i>, Toronto, August 2000.<br> <a href="http://www.eecg.toronto.edu/parallel/parallel/docs/icpp00.pdf">full paper, (PDF, 109k)</a> <!--A pdf version of the paper is available in ~grbic/icpp00.pdf. Please make a copy of it in the webpage directory.--> <p> <LI>A. Grbic, S. Brown, S. Caranci, R. Grindley, M. Gusat, G. Lemieux, K. Loveless, N. Manjikian, S. Srbljic, M. Stumm, Z. Vranesic, and Z. Zilic, "Design and Implementation of the NUMAchine Multiprocessor," <I>Proceedings of the 35th IEEE Design Automation Conference</I>, San Francisco, June 1998.<BR> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/dac98.ps">full paper (PostScript, 160 Kbytes)</A><BR> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/dac98.pdf">full paper (PDF, 41 Kbytes)</A></LI><BR> <LI>S. Brown, N. Manjikian, Z. Vranesic, S. Caranci, A. Grbic, R. Grindley, M. Gusat, K. Loveless, Z. Zilic, and S. Srbljic, "Experience in Designing a Large-scale Multiprocessor using Field-Programmable Devices and Advanced CAD Tools," <I>Proceedings of the 33rd IEEE Design Automation Conference</I>, Las Vegas, June 1996.<BR> <A HREF="http://www.eecg.toronto.edu/~brown/DAC96.html">abstract<BR></A> <A HREF="http://www.eecg.toronto.edu/~brown/dac96.ps">full paper (PostScript, 177 Kbytes)</A><BR> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/dac96.pdf">full paper (PDF, 63 Kbytes)</A></LI><BR> <LI> Z. Zilic, G. Lemieux, K. Loveless, S. Brown, and Z. Vranesic, "Designing for High Speed-Performance in CPLDs and FPGAs," <I>Proc. 3rd Canadian Workshop on Field-Programmable Devices (FPD'95): Technology, Tools, and Applications</I>, Montreal, Canada, pp. 108-113, May 1995.</A><BR> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/fpd95.pdf"> full paper (PDF, 31 Kbytes)</A> </LI><BR> <LI> T. Abdelrahman, S. Brown, T. Mowry, K. Sevcik, M. Stumm, Z. Vranesic, S. 
Zhou, A. Elkateeb, M. Gusat, P. Pereira, B. Gamsa, R. Grindley, O. Krieger, G. Lemieux, K. Loveless, N. Manjikian, G. Ravindran, S. Srbljic, Z. Zilic, "An Overview of the NUMAchine Multiprocessor Project," <I>Proceedings of the 8th Canadian Supercomputing Conference</I>, June 1994.</A><BR> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/overview.ps">full paper (PostScript, 224 Kbytes)</A><BR> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/overview.pdf">full paper (PDF, 200 Kbytes)</A></LI><BR> </LI> </UL> <P> <HR WIDTH="100%"><BR> <A NAME="systemmanuals"> <FONT SIZE=+1>System Manuals</FONT></P></A> <P>We are developing the system-level programming documentation to provide details on the NUMAchine address space and describe various special functions controlled by system software.</P> <UL> <LI><A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/sys_prog_manual.pdf"> NUMAchine Principles of Operations for System Programmers (PDF, 228 Kbytes)<BR> </A><B><I>DISCLAIMER: this is a preliminary document and is subject to change at any time.</I></B></LI> </UL> <P> Also, the hardware reference manual describes all of those nitty-gritty details that the software types don't really care about. Those who work closely with the hardware should be familiar with this manual.</P> <UL> <LI><A HREF="http://www.eecg.toronto.edu/parallel/parallel/docs/hw_maintenance_manual.pdf"> NUMAchine Hardware Reference and Maintenance Manual (PDF, 539 Kbytes)<BR> </A><B><I>DISCLAIMER: this is a preliminary document and is subject to change at any time.</I></B></LI> </UL> <P> <HR WIDTH="100%"><BR> <FONT SIZE=+1>NUMAchine-related Theses</FONT></P> <UL> The following theses offer greater insight into the details of the NUMAchine hardware. However, note that the content of the theses is dated, and changes to the hardware have been made for various reasons (integration, economics, correctness, etc.). 
Consequently, the information below does not accurately document the state of the NUMAchine hardware as it is today. Instead, consult either the <A HREF="http://www.eecg.toronto.edu/parallel/numadocs.html#systemmanuals"> <I>System Programming Manual</I></A> or <A HREF="http://www.eecg.toronto.edu/parallel/numadocs.html#systemmanuals"> <I>Hardware Reference and Maintenance Manual</I></A>, as these will be kept as current as possible.<BR><BR> <LI>Eddy Ah Pin, "Hardware Performance Monitoring in Memory of NUMAchine Multiprocessor," <I>Undergraduate Thesis,</I> University of Toronto, 1997. <BR><A HREF="http://www.eecg.toronto.edu/parallel/parallel/theses/ahpin.pdf">PDF, 146k</A></LI><BR> <LI><A HREF="http://www.eecg.toronto.edu/~grbic">Alex Grbic</A>, "Hierarchical Directory Controllers in the NUMAchine Multiprocessor," <I>M.A.Sc. Thesis,</I> University of Toronto, 1996. <BR><A HREF="http://www.eecg.toronto.edu/parallel/parallel/theses/grbic.pdf">PDF, 4120k</A></LI><BR> <LI><A HREF="http://www.eecg.toronto.edu/~grbic">Alex Grbic</A>, "Assessment of Cache Coherence Protocols in Shared-Memory Multiprocessors," <I>Ph.D. Thesis,</I> University of Toronto, 2003. <BR><A HREF="http://www.eecg.toronto.edu/parallel/parallel/theses/grbic_phd.pdf">PDF, 1064k</A></LI><BR> <LI><A HREF="http://www.eecg.toronto.edu/~grindley">Robin Grindley</A>, "The NUMAchine Multiprocessor: Design and Analysis," <I>Ph.D. Thesis,</I> University of Toronto, 1999. <BR><A HREF="http://www.eecg.toronto.edu/parallel/parallel/theses/grindley.pdf">PDF, 1776k</A></LI><BR> <LI><A HREF="http://www.eecg.toronto.edu/~lemieux">Guy Lemieux</A>, "Hardware Performance Monitoring in Multiprocessors," <I>M.A.Sc. Thesis,</I> University of Toronto, 1996. <BR><A HREF="http://www.eecg.toronto.edu/parallel/parallel/theses/lemieux.pdf">PDF, 219k</A></LI><BR> <LI><A HREF="http://www.eecg.toronto.edu/~kelvin">Kelvin Loveless</A>, "The Implementation of Flexible Interconnect in the NUMAchine Multiprocessor," <I>M.A.Sc. 
Thesis,</I> University of Toronto, 1996. <BR><A HREF="http://www.eecg.toronto.edu/parallel/parallel/theses/loveless.pdf">PDF, 848k</A></LI><BR> <li>Karl Schabas, "The Implementation of Basic Monitoring Functions on the NUMAchine Multiprocessor", <i>Undergraduate Thesis</i>, University of Toronto, 2000.<br> <a href="http://www.eecg.toronto.edu/parallel/parallel/theses/schabas.pdf">PDF, 281K</A> <!-- A pdf version of the paper is available in ~grbic/schabas.pdf. Also make a copy of it in the webpage directory.--> <p> </UL> <HR WIDTH="100%"></P> <P><A HREF="NUMA.Welcome.html">Back to NUMAchine Home Page...</A></P> </BODY> </HTML> |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/NUMAchine-med.gif version [663c7b0c5d].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/NUMAchine.arch.gif version [3cceeda173].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/mem.gif version [235779c2b4].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/ni.gif version [e6b2be8207].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/numa2.gif version [fa41f30a77].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/numahw.html version [a421b6b289].
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> <HTML> <HEAD> <TITLE>Hardware Development for the NUMAchine Multiprocessor</TITLE> <META NAME="GENERATOR" CONTENT="Mozilla/3.0Gold (X11; I; SunOS 4.1.3_U1 sun4m) [Netscape]"> <META NAME="Author" CONTENT="Naraig Manjikian"> </HEAD> <BODY background = "../images/maple_back.gif"> <CENTER><P><IMG SRC="NUMAchine-med.gif" HEIGHT=89 WIDTH=502></P></CENTER> <H2 ALIGN=CENTER>Hardware Development for the NUMAchine Multiprocessor<BR> at the University of Toronto</H2> <P>A 64-processor prototype of the <A HREF="../NUMA.Welcome.html">NUMAchine multiprocessor</A> architecture (illustrated below) is under construction in the <A HREF="http://www.ece.toronto.edu">Dept. of Electrical /Computer Engineering</A> at the <A HREF="http://www.toronto.edu">Univ. of Toronto</A>. </P> <CENTER><P><IMG SRC="NUMAchine.arch.gif" BORDER=2 HEIGHT=236 WIDTH=548></P></CENTER> <P>The implementation of each <I><A HREF="numahw.html#station">station</A></I> is based on the FutureBus+ physical standard, but NUMAchine utilizes a custom synchronous bus protocol.</P> <P>A number of printed circuit boards have been designed and fabricated:</P> <UL> <LI><I><A HREF="numahw.html#processor board">processor board</A></I> with a MIPS R4400 microprocessor and 1 MByte of SRAM cache<BR> <BR> </LI> <LI><A HREF="numahw.html#memory board"><I>memory board</I> </A>containing 32-128 MBytes of DRAM and 8 MBytes of SRAM for the cache coherence directory<BR> <BR> </LI> <LI><I><A HREF="numahw.html#network interface board">network interface board</A></I> to link a station to the ring hierarchy; this board also contains<BR> 8 MBytes of DRAM to cache remote data<BR> <BR> </LI> <LI><A HREF="numahw.html#clock generator board"><I>clock generator</I> </A>generates up to 18 differential ECL clocks<BR> <BR> </LI> <LI><A HREF="numahw.html#bus arbiter board"><I>bus arbiter</I> </A>a centralized bus arbiter controls access to the NUMAchine station bus<BR> <BR> </LI> </UL> <P><B>Status:</B> 
<EM>The I/O board has been fabricated and is working... pictures pending. Also, a number of circuit boards which implement the global ring for the top level of the interconnection network have been fabricated and are being tested.</EM></P> <P>All boards utilize field-programmable devices (FPDs) from the <A HREF="http://www.altera.com">Altera Corporation</A> for much of the control circuitry, such as the system interface for the <A HREF="http://www.mips.com">MIPS R4400 microprocessor</A>, the directory controller on the memory board, and the ring controller on the network interface board. Field-programmable devices provide shorter design cycles and cost-effectiveness (although good performance requires <A HREF="http://www.eecg.toronto.edu/~brown/DAC96.html">careful design</A>). In addition, FPDs provide flexibility to implement new protocols to support future research.<BR> <BR> <BR> <BR> </P> <CENTER><P><B><I>The NUMAchine Hardware Development Group</I></B> </P></CENTER> <CENTER><TABLE CELLSPACING=20 CELLPADDING=0 > <TR> <TD VALIGN=TOP> <LI>Prof. Zvonko G. Vranesic (<I>project leader</I>)</LI> <LI>Prof. Stephen D. Brown</LI> <LI>Prof. Michael Stumm</LI> <LI>Steve Caranci</LI> <LI>Alex Grbic</LI> <LI>Guy Lemieux</LI> <LI>Paul McHardy</LI> <LI>Peter Pereira</LI> </TD> <TD VALIGN=TOP>Major contributors who have moved on: <LI>Dr. Robin Grindley</LI> <LI>Mitch Gusat</LI> <LI>Dr. Orran Krieger</LI> <LI>Kelvin Loveless</LI> <LI>Dr. Naraig Manjikian</LI> <LI>Dr. Sinisa Srbljic</LI> <LI>Michael van Dam</LI> <LI>Dr. Zeljko Zilic</LI> <P>Summer students:</P> <LI>Eddy Ah Pin</LI> <LI>Terry Borer</LI> <LI>Jackson Fung</LI> <LI>Emanuel Istrate</LI> <LI>Daniel Levner</LI> <LI>Karl Schabas</LI> <LI>Deshanand P. 
Singh</LI> </TD> </TR> </TABLE></CENTER> <P> <HR WIDTH="100%"></P> <H2>Photographs of NUMAchine Hardware</H2> <P><A NAME="station"></A></P> <TABLE BORDER=1 CELLPADDING=10 > <TR> <TD> <A HREF="pictures/dbgstn.jpg"> <IMG SRC="numa2.gif" HEIGHT=225 WIDTH=262 ALIGN=TEXTTOP></A></TD> <TD> <CENTER><P><B><I><FONT SIZE=+1>A Fully-populated<BR> NUMAchine Station</FONT></I></B></P></CENTER> <P>The bus physical backplane is at the bottom of the photograph. The boards plug vertically into the backplane.</P> <P>From left to right:</P> <LI>bus arbiter board</LI> <LI>4 processor boards</LI> <LI>2 memory boards</LI> <LI>network interface board</LI> <P>The power supply is visible directly beneath the bus backplane. A clock generation and distribution board (not visible) is located underneath the backplane.</P> </TD> </TR> </TABLE> <P><A NAME="processor board"></A></P> <TABLE BORDER=1 CELLPADDING=10 > <TR> <TD> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/numahw/pictures/proc.jpg"> <IMG SRC="proc.gif" HEIGHT=378 WIDTH=308 ALIGN=TEXTTOP></A></TD> <TD> <CENTER><P><B><I><FONT SIZE=+1>The Processor Board</FONT></I></B></P></CENTER> <P>At the top of the board are LED displays and connectors for diagnostics, EPROM to program the Altera FPDs, and EPROM with boot code for the R4400.</P> <P>The MIPS R4400 microprocessor with heat sink is at the center of the board, surrounded by SRAM cache chips.</P> <P>Directly below the R4400 is a row of Altera field-programmable devices which serve as the system interface for the R4400. Below these chips is a row of FIFO buffers to and from the NUMAchine station bus. 
Finally, below the FIFOs is a row of FutureBus+ BTL chips for listening to and driving the NUMAchine station bus.</P> <P>Click on the picture to see the latest version of the processor board, revision 3, in detail.</P> <P>The connector to the NUMAchine station bus is at the bottom of the board.</P> <P><A HREF="http://www.eecg.toronto.edu/parallel/parallel/numahw/procr.pic.ps">Block diagram (PostScript, 95 Kbytes)</A><P> </TD> </TR> </TABLE> <P><A NAME="memory board"></A></P> <TABLE BORDER=1 CELLPADDING=10 > <TR> <TD> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/numahw/pictures/mem.jpg"> <IMG SRC="mem.gif" HEIGHT=450 WIDTH=450 ALIGN=TEXTTOP></A></TD> <TD> <CENTER><P><B><I><FONT SIZE=+1>The Memory Board</FONT></I></B></P></CENTER> <P>DRAM SIMMs occupy the left side of the board. The top right-hand corner is occupied by a bank of SRAM chips used in maintaining the directory for the cache coherence protocol. </P> <P>At the right-hand center of the board are the Altera FPDs which contain the control circuitry for the cache coherence protocol. There is also an Altera FPD at the top center of the board to control the DRAM array.</P> <P>FIFO buffers and BTL interface chips connect the memory board to the NUMAchine station bus through the connector at the bottom of the board.</P> <P>Click on the picture to see the latest version of the memory board, revision 2, in detail. Hardware monitoring, which was not present in the original revision, has been added in the Altera FLEX10K30 device. 
The patchwires were necessary to correct an FPGA programming problem, and have been eliminated with a final respin of the board.</P> <P><A HREF="http://www.eecg.toronto.edu/parallel/parallel/numahw/mem.pic.ps">Block diagram (PostScript, 99 Kbytes)</A><P> </TD> </TR> </TABLE> <P><A NAME="network interface board"></A></P> <TABLE BORDER=1 CELLPADDING=10 > <TR> <TD> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/numahw/pictures/nic.jpg"> <IMG SRC="ni.gif" HEIGHT=445 WIDTH=313 ALIGN=TEXTTOP></A></TD> <TD> <CENTER><P><B><I><FONT SIZE=+1>The Network Interface Board</FONT></I></B></P></CENTER> <P>The ring connectors are visible in the top corners of the board. The buffers for the ring interconnect occupy the space between the connectors.</P> <P>The DRAM chips for the remote data cache occupy a small area on the underside of the board.</P> <P>The Altera FPDs containing the control circuitry for the cache coherence protocol, the rings, and the remote data cache are clearly visible in their sockets.</P> <P>Pipelining for the wide data paths on this board requires the large number of buffer chips which occupy much of the board.</P> <P>FIFO buffers and BTL chips are located at the bottom left and bottom right, as well as along the bottom edge of the board, directly above the connector to the NUMAchine station bus.</P> <P>Click on the picture to see the latest version of the network interface board, revision 2, in detail. You will notice that many of the discrete buffers have been replaced with Altera FLEX6016 FPGAs. 
Also, the SDRAM has been moved to the top surface.</P> <P><A HREF="http://www.eecg.toronto.edu/parallel/parallel/numahw/ni.pic.ps">Block diagram (PostScript, 97 Kbytes)</A><P> </TD> </TR> </TABLE> <P><A NAME="clock generator board"></A></P> <TABLE BORDER=1 CELLPADDING=10 > <TR> <TD> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/numahw/pictures/clock.jpg"> <IMG SRC="pictures/clock_small.jpg" WIDTH=200 ALIGN=TEXTTOP></A></TD> <TD> <CENTER><P><B><I><FONT SIZE=+1>The Clock Generator Board</FONT></I></B></P></CENTER> <P>The clock generator board can be programmed to a wide range of frequencies with the red DIP switch block. The differential ECL master clock is generated by the chip at the top centre of the board and split 2:1 by the small chip in the centre. The left and right chips are 9:1 fanout replicators, giving a total of 18 ECL clock signals. We distribute the clocks to the NUMAchine backplane via twisted-pair cables. Of course, we must take care that the cables are all the same length to minimize skew between the signals.</P> </TD> </TR> </TABLE> <P><A NAME="bus arbiter board"></A></P> <TABLE BORDER=1 CELLPADDING=10 > <TR> <TD> <A HREF="http://www.eecg.toronto.edu/parallel/parallel/numahw/pictures/arb.jpg"> <IMG SRC="pictures/arb_small.jpg" WIDTH=300 ALIGN=TEXTTOP></A></TD> <TD> <CENTER><P><B><I><FONT SIZE=+1>The Bus Arbiter Board</FONT></I></B></P></CENTER> <P>The bus arbiter board is a centralized, synchronous arbiter that controls access to the NUMAchine bus. Since this was one of the first boards we made, a few miscellaneous test circuits were also added to experiment with high-speed signalling using Altera devices. These test circuits use the DIP switches to test different functions. Also, a NUMAchine station RESET switch is located on this board, just below the DIP switches.</P> <P>The bus arbiter function has been added to the latest version of the I/O Board. 
Unfortunately, we do not have scans of that board ready yet for display.</P> </TD> </TR> </TABLE> <P><A HREF="http://www.eecg.toronto.edu/parallel/NUMA.Welcome.html">Back to NUMAchine Home Page...</A></P> </BODY> </HTML> |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/pictures/arb_small.jpg version [c4c911d4ec].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/pictures/clock_small.jpg version [86e2cf518a].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/pictures/dbgstn.jpg version [a2d59ea0b6].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/numahw/proc.gif version [0868fe3031].
cannot compute difference between binary files
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/projects.html version [3b9c6d187b].
<!-- RCS $Id: projects.html,v 1.3 1994/11/02 20:30:13 caranci Exp caranci --> <HTML> <body bgcolor="#ffffff" text="#000000" link="#0000ff" vlink="#aaaaff" alink="#0077FF"> </body> <FONT SIZE=4> <HEAD> <TITLE>Current UofT EECG Projects</TITLE> </HEAD> <BODY> <H1>Current Projects</H1> <DL> <DT><a href="hector.html"> Hector</A> <DD>Hector is a scalable shared-memory multiprocessor with an interconnect of hierarchical rings. <DT><a href="hurricane.html"> Hurricane</A> <DD> Hurricane is a hierarchically clustered operating system implemented on the Hector multiprocessor. <DT><a href="numachine.html"> <IMG SRC="images/NUMAchine-small.gif" ALT = "NUMAchine"></A> <DD> <A href="numachine.html">NUMAchine</A> is a next-generation implementation of the basic Hector multiprocessor architecture. Features include hardware cache coherence, a network cache (a lockup-free tertiary cache), an efficient multicast mechanism, and hardware performance-monitoring support. <DT><a href="tornado.html"> <IMG SRC="images/torn-small.gif" border=0 vspace=5 hspace=5 ALT = "Tornado"></A> </A> <DD> <A href="tornado.html">Tornado</A> is the operating system being implemented for the NUMAchine multiprocessor. It is a multiuser, NUMA-aware, performance-oriented microkernel operating system. Most services are provided by servers and application-level run-time libraries. Tornado has a highly modular structure and is implemented in C++. </DL> <HR> <STRONG>This is still a work in progress...<BR> Please forward any comments, suggestions or questions to:</STRONG><BR> <a href="mailto:caranci@eecg.toronto.edu"><i>caranci@eecg.toronto.edu</i></a> </BODY> </HTML> |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/parallel/tornado.html version [6421902085].
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <title>K42/Tornado Web Page Redirection</title> <META HTTP-EQUIV="Refresh" CONTENT="1; URL=http://www.eecg.toronto.edu/~tornado"> </head> <body> <h1>K42/Tornado Web Page Redirection</h1> <p> The University of Toronto K42/Tornado web page has moved to <a href="http://www.eecg.toronto.edu/~tornado/">http://www.eecg.toronto.edu/~tornado/</a>. If your browser doesn't automatically redirect to its new location, click the above link. </p> <hr> <address><a href="mailto:tamda@eecg.toronto.edu">David Kar-Fai Tam</a></address> <!-- Created: Tue Oct 7 17:23:32 EDT 2003 --> <!-- hhmts start --> Last modified: Tue Oct 7 17:36:05 EDT 2003 <!-- hhmts end --> </body> </html> |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/people.html version [3499bb98b9].
<HTML> <body bgcolor="#ffffff" text="#000000" link="#0000ff" vlink="#aaaaff" alink="#0077FF"> </body> <FONT SIZE=4> <HEAD> <TITLE>People</TITLE> <!-- Changed by: Orran Y. Krieger, 18-Apr-1996 --> </HEAD> <center> <H1> <img align="center" src="../EECG/RESEARCH/ParallelSys/images/music1.gif"></A> <P> People</H1> </center> <BODY> <img align=left src="images/comments.gif"> Please mail changes and additions to <A HREF="mailto:kulki@cs.toronto.edu">Kulki</A> or <A HREF="mailto:okrieg@eecg.toronto.edu">Orran</A> <p> <br> <H2>Faculty</H2> <UL> <LI> <A href="http://www.eecg.toronto.edu/~tsa/Welcome.html">T. Abdelrahman</a> <LI> <A href="../~brown/Welcome.html">S. Brown </a> <LI> <A href="../~corinna.html">C. Lee </a> <LI> <A href="http://www.eecg.toronto.edu/~tcm/Welcome.html">T. Mowry </a> <LI> <A href="http://www.cs.toronto.edu/~kcs">K. Sevcik </a> <LI> <A href="../~stumm/Welcome.html">M. Stumm </a> <LI> <A href="../~zvonko/Welcome.html">Z. Vranesic </a> <LI> <A href="http://www.eecg.toronto.edu/~zhou/Welcome.html">S. Zhou </a> </UL> <P> <H2>Students</H2> <UL> <LI> <A href="http://www.eecg.toronto.edu/~bernecky">R. Bernecky</a> <LI> <A href="../~charlesc.html">C. Chan</a> <LI> <A href="http://www.eecg.toronto.edu/~demke">A. Demke</a> <LI> <A href="http://www.eecg.toronto.edu/~devrier">D. De Vries</a> <LI> <A href="../~dunc/index.html">D. Elliott</a> <LI> <A href="http://www.eecg.toronto.edu/~farkas/">K. Farkas</a> <LI> <A href="../~ben/Welcome.html">B. Gamsa </a> <LI> <A href="http://www.eecg.toronto.edu/~grindley/Welcome.html">R. Grindley </a> <LI> <A href="http://www.eecg.toronto.edu/~grbic">A. Grbic </a> <LI> <A href="http://www.eecg.toronto.edu/~gusat">R. Ho </a> <LI> <A href="http://www.eecg.toronto.edu/~shuynh/va/Welcome.html">S. Huynh </a> <LI> <A href="http://www.eecg.toronto.edu/~hora/Welcome.html">M. Gusat </a> <LI> <A href="http://www.cs.toronto.edu/~karim">K. Harzallah</a> <LI> <A href="http://www.eecg.toronto.edu/~jaseemud">M. 
Jaseemuddin</a> <LI> <A href="http://www.cs.toronto.edu/~kulki">D. Kulkarni</a> <LI> <A href="http://www.cs.toronto.edu/~lamma">M. Lam</a> <LI> <A href="http://www.eecg.toronto.edu/~lemieux/Welcome.html">G. Lemieux </a> <LI> <A href="http://www.cs.toronto.edu/~paullu">P. Lu</A> <LI> <A href="http://www.cs.toronto.edu/~luk">C. Luk</A> <LI> <A href="http://www.eecg.toronto.edu/~kma">K. Ma </a> <LI> <A href="http://www.cs.toronto.edu/~maione">I. Maione </a> <LI> <A href="http://www.eecg.toronto.edu/~nmanjiki/Welcome.html">N. Manjikian </a> <LI> <A href="http://www.cs.toronto.edu/~neto">D. Neto </a> <LI> <A href="http://www.cs.toronto.edu/~eparsons">E. Parsons</A> <LI> <A href="http://www.cs.toronto.edu/~phan">G. Phan</A> <LI> <A href="http://www.eecg.toronto.edu/~gravin/Welcome.html">G. Ravindran </a> <LI> <A href="http://www.eecg.toronto.edu/~reid/Welcome.html">K. Reid</A> <LI> <A href="http://www.eecg.toronto.edu/~reza">R. Solymaani</A> <LI> <A href="../~saghir.html">M. Saghir</A> <LI> <A href="../~steffan.html">G. Steffan</A> <LI> <A href="http://www.eecg.toronto.edu/~stoodla">M. Stoodley</A> <LI> <A href="http://www.eecg.toronto.edu/~tandri/Welcome.html">S. Tandri</a> <LI> <A href="http://www.eecg.toronto.edu/~zeljko/Welcome.html">Z. Zilic </a> </UL> <P> <H2>Staff</H2> <UL> <LI> <A href="http://www.eecg.toronto.edu/~caranci/Welcome.html">S. Caranci</A> <LI> <A href="http://www.eecg.toronto.edu/~okrieg/Welcome.html">O. Krieger </a> <LI> <A href="http://www.eecg.toronto.edu/~kelvin/Welcome.html">K. Loveless </a> <LI> <A href="../~peterp/Welcome.html">P. Pereira </a> </UL> <H2>Graduates</H2> <UL> <LI> <A href="http://www.cs.ualberta.ca/~unrau">R. Unrau</A> <LI> <A href="http://www.cs.yorku.ca/People/brecht">T. Brecht</A> <LI> <A href="http://www.eecg.toronto.edu/~okrieg/Welcome.html">O. Krieger </a> <LI> <A href="http://www.cs.yorku.ca/People/hsandhu">H. Sandhu</A> <LI> <A href="http://www.eecg.toronto.edu/~hui">H. 
Li</A> </UL> <A HREF="Welcome.html"> <img align="left" src="images/homeblue.gif"> Return to Parallel Systems Home </A> </BODY> </HTML> |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/publications.html version [7c0b03c097].
<TITLE>Publications</TITLE> <A NAME="BEG"> </A> <!-- Changed by: Orran Y. Krieger, 12-Nov-1995 --> <body bgcolor="#ffffff" text="#000000" link="#0000ff" vlink="#aaaaff" alink="#0077FF"> </body> <FONT SIZE=4> <BR> <center> <H1> <img align="left" src="../EECG/RESEARCH/ParallelSys/images/journal.gif"> Publications</H1> </center> <BR> <BR> <P> Most of these papers can also be <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel">accessed via ftp</A>. A full, un-sorted, <A HREF="pubs_abs.html">list of publications with abstracts</A> is also available. <P> <em> <img align="left" src="../EECG/RESEARCH/ParallelSys/images/comments.gif"><A HREF="mailto:kulki@cs.toronto.edu"> Please send your suggestions and comments.</A> <p> <BR> <em> Group members: Whenever you have a paper for public eyes please mail <strong> kulki@eecg.toronto.edu </strong> a text with author and source information, text abstract, and a postscript file with embedded source information. </em> <P> <BR> <BR> <BR> <center> <A HREF="publications.html#ca"> <img align="center" src="../EECG/RESEARCH/ParallelSys/images/arch_button.gif"></A> <P> <A HREF="publications.html#os"> <img align="center" src="../EECG/RESEARCH/ParallelSys/images/os_button.gif"></A> <P> <A HREF="publications.html#compilers"> <img align="center" src="../EECG/RESEARCH/ParallelSys/images/comp_button.gif"></A> <P> <A HREF="publications.html#scheduling"> <img align="center" src="../EECG/RESEARCH/ParallelSys/images/sch_button.gif"></A> <P> <A HREF="publications.html#pe"> <img align="center" src="../EECG/RESEARCH/ParallelSys/images/perf_button.gif"></A> <P> <A HREF="publications.html#db"> <img align="center" src="../EECG/RESEARCH/ParallelSys/images/data_button.gif"></A> </center> <P> <BR> <BR> <BR> <BR> <BR> <P> <A HREF="Welcome.html"> <img align="left" src="images/homeblue.gif"> Return to Parallel Systems Home </A> <P> <img src="images/redline.GIF"> <p> <A NAME="ca"><H2>Computer Architecture</H2></A> <UL> <LI> <A 
HREF="pubs_abs.html#Ravi_Stumm_JIEICE96"> A Comparison of Blocking and Non-blocking Packet Switching Techniques in Hierarchical Ring Networks </A> <br> IEICE Trans 1996 <LI> <A HREF="pubs_abs.html#Ravi_Stumm_ICPP95"> Hierarchical Ring Topologies and the effect of their Bisection Bandwidth Constraints </A> <br> ICPP 1995 <LI> <A HREF="pubs_abs.html#Wilton_Vranesic_SPDP">Architectural Support for Block Transfers in a Shared-Memory Multiprocessor</A> <br> SPDP 1993 <LI> <A HREF="pubs_abs.html#Stumm_Vranesic_White_IPPS93">Experience with the Hector Multiprocessor</A> <br> IPPS 1993 <LI> <A HREF="pubs_abs.html#Vranesic_etal_IEEEC">Hector -- A hierarchically structured shared memory multiprocessor</A> <br> IEEE Computer 1991 <LI> <A HREF="pubs_abs.html#Holliday_Stumm_IEEETC">Performance Evaluation of Hierarchical Ring-Based Shared Memory Multiprocessors </A> <br> IEEE Trans. Computer 1992 </UL> <P> <A HREF="publications.html#BEG" > Return to the LIST</A> <P> <img src="images/redline.GIF"> <p> <A NAME="os"><H2>Operating Systems</H2></A> <UL> <LI> <A HREF="pubs_abs.html#Orran_etal_SPDPW95"> Exploiting Mapped Files for Parallel I/O </A> <br> 1995 SPDP Workshop on Modeling and Specification of I/O (MSIO) <LI> <A HREF="pubs_abs.html#Parsons_etal_IWOOS95">(De-)Clustering Objects for Multiprocessor System Software </A> <br> IWOOS95 Workshop <LI> <A HREF="pubs_abs.html#Unrau_etal_EuroPar95">On the Scalability of Demand-Driven Parallel Systems</A> <br> EuroPar 95 <LI> <A HREF="pubs_abs.html#Ben_etal_OOPSLAW94">The Importance of Performance-Oriented Flexibility in System Software for Large-Scale Shared-Memory Multiprocessors </A> <br> OOPSLA 94 Workshop on Flexible System Software <LI> <A HREF="pubs_abs.html#Unrau_etal_OSDI94"> Experiences with Locking in a NUMA Multiprocessor Operating System Kernel </A> <br> OSDI 1994 <LI> <A HREF="pubs_abs.html#Unrau_etal_JSC94">Hierarchical clustering: A structure for scalable multiprocessor operating system design</A> <br> Journal of 
Supercomputing 1995 <LI> <A HREF="pubs_abs.html#Okrieg_PhD">HFS: A flexible file system for shared-memory multiprocessors</A> <br> PhD thesis, 1994 <LI> <A HREF="pubs_abs.html#Krieger_etal_IEEEComp94">The Alloc Stream Facility: A redesign of application-level Stream I/O</A> <br> IEEE Computer 1994 <LI> <A HREF="pubs_abs.html#Gamsa_etal_ICPP94">Optimizing IPC Performance for Shared-Memory Multiprocessors</A> <br> ICPP 1994 <LI> <A HREF="pubs_abs.html#Sandhu_et_al_PPOPP">The shared regions approach to software cache coherence on multiprocessors</A> <br> PPoPP 1993 <LI> <A HREF="pubs_abs.html#Krieger_Stumm_DAGS93">HFS: A Flexible File System for Large-Scale Multiprocessors</A> <br> DAGS 1993 <LI> <A HREF="pubs_abs.html#Krieger_etal_ICPP93">A Fair Fast Scalable Reader-Writer Lock</A> <br> ICPP 1993 <LI> <A HREF="pubs_abs.html#Stumm_Unrau_Krieger_USENIX92">Hierarchical Clustering: A Structure for Scalable Multiprocessor Operating System Design</A> <br> USENIX 1992 <LI> <A HREF="pubs_abs.html#Gamsa_MASc">Region-Oriented Main Memory Management in Shared-Memory NUMA Multiprocessors</A> <br> MASc thesis, 1992 <LI> <A HREF="pubs_abs.html#Unrau_PhD">Scalable Memory Management through Hierarchical Symmetric Multiprocessing</A> <br> PhD thesis, 1992 </UL> <P> <A HREF="publications.html#BEG" > Return to the LIST</A> <P> <img src="images/redline.GIF"> <p> <A NAME="compilers"><H2>Compilers</H2></A> <UL> <LI> <A HREF="pubs_abs.html#Tandri_Abdel_PDPTA95">Computation and Data Partitioning on Scalable Shared Memory Multiprocessors</A> PDPTA, November 1995 <LI> <A HREF="pubs_abs.html#Kulkarni_Stumm_Tut">Loop and Data Transformations:Tutorial</A> CSRI Tech Report 337, June 1993 <LI> <A HREF="pubs_abs.html#Li_Tandri_et">Locality and Loop Scheduling on Numa Multiprocessors</A> <br> ICPP 92 <LI> <A HREF="pubs_abs.html#Manjikian_Abdelrahaman_315">Fusion of Loops for Parallelism and Locality</A> <br> Tech Report <LI> <A HREF="pubs_abs.html#Kulkarni_Stumm_LCR95">CDA Loop Transformations</A> 
<br> 3rd LCR Workshop <LI> <A HREF="pubs_abs.html#Kulkarni_Stumm_Unrau_EuroPar95">Implementing Flexible Computation Rules with Subexpression-level Loop Transformations</A> <br> EuroPar 95 <LI> <A HREF="pubs_abs.html#Kulkarni_etal_317">A Generalized Theory of Linear Loop Transformations</A> <br> Tech Report <LI> <A HREF="pubs_abs.html#Kulkarni_Stumm_292">Computational Alignment: A new, unified program transformation for local and global optimization</A> <br> Tech Report <LI> <A HREF="pubs_abs.html#Kulkarni_Stumm_ACJ95">Linear Loop Transformations in Optimizing Compilers for Parallel Machines</A> <br> Australian Computer Journal <LI> <A HREF="pubs_abs.html#Kumar_Kulkarni_ICS92">Deriving Good Transformations for Mapping Nested Loops on Hierarchical Parallel Machines in Polynomial Time</A> <br> ICS 92 <LI> <A HREF="pubs_abs.html#Kumar_Kulkarni_ICPP91">Generalized Unimodular Loop Transformations for Distributed Memory Multiprocessors</A> (does not contain figures) <br> ICPP 91 </UL> <P> <A HREF="publications.html#BEG" > Return to the LIST</A> <P> <img src="images/redline.GIF"> <p> <A NAME="scheduling"><H2>Scheduling</H2></A> <UL> <LI> <A HREF="pubs_abs.html#Zhou_Brecht_SM91">Processor Pool-Based Scheduling for Large-Scale NUMA Multiprocessors</A> <br> Sigmetrics 91 <LI> <A HREF="pubs_abs.html#Brecht_SEDMS93">On the Importance of Parallel Application Placement in NUMA Multiprocessors</A> <br> SEDM 93 <LI> <A HREF="pubs_abs.html#Sevcik_JPE">Application Scheduling and Processor Allocation in Multiprogrammed Parallel Processing Systems</A> <br> (Journal of) Performance Evaluation 94 <LI> <A HREF="pubs_abs.html#Curran_Stumm_CS">A Comparison of basic CPU Scheduling Algorithms for Multiprocessor Unix</A> <br> (Journal) Computer Systems 90 <LI> <A HREF="pubs_abs.html#Brecht_PhD_303">Multiprogrammed Parallel Application Scheduling in NUMA Multiprocessors</A> <br> PhD thesis <LI> <A HREF="pubs_abs.html#Wu_MASc">Processor Scheduling in Multiprogrammed Shared Memory NUMA 
Multiprocessors</A> </UL> <P> <A HREF="publications.html#BEG" > Return to the LIST</A> <P> <img src="images/redline.GIF"> <p> <A NAME="pe"><H2>Performance Evaluation</H2></A> <UL> <LI> <A HREF="pubs_abs.html#Sevcik_Zhou_PERF93">Performance Benefits and Limitations of Large NUMA Multiprocessors</A> <br> Performance 93 <LI> <A HREF="pubs_abs.html#Harz_Sevcik_SC93">Hot Spot Analysis in Large Scale Shared Memory Multiprocessors</A> <br> Supercomputing 93 <LI> <A HREF="pubs_abs.html#Holliday_Stumm_IEEETC">Performance Evaluation of Hierarchical Ring-Based Shared Memory Multiprocessors </A> <br> IEEE Trans. Computer 1992 <LI> <A HREF="pubs_abs.html#Parsons_Sevcik_IPPS95">Multiprocessor Scheduling for High-Variability Service Time Distributions </A> <br> IPPS Workshop on Job Scheduling 95 </UL> <P> <A HREF="publications.html#BEG" > Return to the LIST</A> <P> <img src="images/redline.GIF"> <p> <A NAME="db"><H2>Database Systems</H2></A> <UL> <LI> <A HREF="pubs_abs.html#Baru_Zilio_PADS93">Data reorganization in parallel database systems</A> <br> IEEE Workshop PADS 93 </UL> <P> <A HREF="publications.html#BEG" > Return to the LIST</A> <P> <A HREF="Welcome.html"> <img align="left" src="images/homeblue.gif"> Return to Parallel Systems Home </A> |
Added wiki_references/2017/software/eecg_toronto_edu/2017_05_12_wget_copy_of_http_www_eecg_toronto_edu_parallel_publications_html/bonnet/www.eecg.toronto.edu/parallel/pubs_abs.html version [aa83589fa9].
<!---------------------------------------------------------------------> <HR><A NAME="Ravi_Stumm_JIEICE96">.</A><HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Ravi_Stumm_JIEICE96.ps.Z"> A Comparison of Blocking and Non-blocking Packet Switching Techniques in Hierarchical Ring Networks </A> <P> <B>Authors:</B> G. Ravindran and M. Stumm <P> <B>Where:</B> IEICE Trans. Inf. & Syst., vol. E79-D, No. 8, August 1996 <P> <B>Keywords:</B> Networks, Switching, Wormhole, Virtual Cut-through, Hierarchical Ring Networks, Slotted Rings <P> <B>Abstract:</B> This paper presents the results of a simulation study of blocking and non-blocking switching for hierarchical ring networks. The switching techniques include wormhole, virtual cut-through, and slotted ring. We conclude that slotted ring network performs better than the more popular wormhole and virtual cut-through networks. We also show that the size of the node buffers is an important parameter and that choosing them too large can hurt performance in some cases. Slotted rings have the advantage that the choice of buffer size is easier in that larger than necessary buffers do not hurt performance and hence a single choice of buffer size performs well for all system configurations. In contrast, the optimal buffer size for virtual cut-through and wormhole switching nodes varies depending on the system configuration and the level in the hierarchy in which the switching node lies. <P> <!---------------------------------------------------------------------> <HR><A NAME="Ravi_Stumm_ICPP95">.</A><HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Ravi_Stumm_ICPP95.ps.Z"> Hierarchical Ring Topologies and the effect of their Bisection Bandwidth Constraints</A> <P> <B>Authors:</B> G. Ravindran and M. Stumm <P> <B>Where:</B> Proc. Intl. Conf. 
on Parallel Processing, pp. I/51-55, 1995 <P> <B>Keywords:</B> Multiprocessor architectures, Interconnection networks, Hierarchical rings, Bisection bandwidth <P> <B>Abstract:</B> Ring-based hierarchical networks are interesting alternatives to popular direct networks such as 2D meshes or tori. They allow for simple router designs, wider communications paths, and faster networks than their direct network counterparts. However, they have a constant bisection bandwidth, regardless of system size. In this paper, we present the results of a simulation study to determine how large hierarchical ring networks can become before their performance deteriorates due to their bisection bandwidth constraint. We show that a system with a maximum of 128 processors can sustain most memory access behaviors, but that larger systems can be sustained only if their bisection bandwidth is increased. <P> <!---------------------------------------------------------------------> <HR><A NAME="Zhou_Brecht_SM91">.</A><HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Zhou_Brecht_SM91.ps.Z">Processor Pool-Based Scheduling for Large-Scale NUMA Multiprocessors</A> <P> <B>Authors:</B> Songnian Zhou and Timothy Brecht <P> <B>Where:</B> Appears in: Proceedings of the 1991 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, May 1991, pp. 133-142. <P> <B>Keywords:</B> NUMA, Scheduling, multiprocessor performance <P> <B>Abstract:</B> <P> Large-scale Non-Uniform Memory Access (NUMA) multiprocessors are gaining increased attention due to their potential for achieving high performance through the replication of relatively simple components. Because of the complexity of such systems, scheduling algorithms for parallel applications are crucial in realizing the performance potential of these systems.
In particular, scheduling methods must consider the scale of the system, with the increased likelihood of creating bottlenecks, along with the NUMA characteristics of the system, and the benefits to be gained by placing threads close to their code and data. <P> We propose a class of scheduling algorithms based on processor pools. A processor pool is a software construct for organizing and managing a large number of processors by dividing them into groups called pools. The parallel threads of a job are run in a single processor pool, unless there are performance advantages for a job to span multiple pools. Several jobs may share one pool. Our simulation experiments show that processor pool-based scheduling may effectively reduce the average job response time. The performance improvements attained by using processor pools increase with the average parallelism of the jobs, the load level of the system, the differentials in memory access costs, and the likelihood of having system bottlenecks. As the system size increases, while maintaining the workload composition and intensity, we observed that processor pools can be used to provide significant performance improvements. We therefore conclude that processor pool-based scheduling may be an effective and efficient technique for scalable systems. <!---------------------------------------------------------------------> <HR><A NAME="Brecht_SEDMS93">.</A><HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Brecht_SEDMS93.ps.Z">On the Importance of Parallel Application Placement in NUMA Multiprocessors</A> <P> <B>Authors:</B> Timothy Brecht <P> <B>Where:</B> Proceedings of the Fourth Symposium on Experiences with Distributed and Multiprocessor Systems (SEDMS IV), San Diego, CA, September, 1993. 
<P> <B>Keywords:</B> NUMA, multiprocessor scheduling, multiprocessor performance <P> <B>Abstract:</B> <P> The thesis of this paper is that scheduling decisions in large-scale, shared-memory, NUMA (Non-Uniform Memory Access) multiprocessors must consider not only how many processors, but also which processors to allocate to each application. We call the problem of assigning parallel processes of an application to processors <EM>application placement</EM>. <P> We explore the importance of placement decisions by measuring the execution time of several parallel applications using different placements on a shared-memory NUMA multiprocessor. The results of these experiments lead us to conclude that, as expected, in small-scale mildly NUMA multiprocessors, placement decisions have only a minor effect on the execution time of parallel applications. However, the results also show that placement decisions in large-scale multiprocessors are critical, and that localization that considers the architectural clusters inherent in these systems is essential. Our experiments also show that the importance of placement decisions increases substantially with the size and NUMAness of the system and that the placement of individual processes of an application within the subset of chosen processors also significantly impacts performance.
<!---------------------------------------------------------------------> <HR> <A NAME="Kumar_Kulkarni_ICPP91">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kumar_Kulkarni_ICPP91.ps.Z">Generalized Unimodular Loop Transformations for Distributed Memory Multiprocessors</A> (does not contain figures) <P> <B>Authors:</B> K G Kumar*, D Kulkarni+ and A Basu <BLOCKQUOTE> Center for Development of Advanced Computing 2/1 Brunton Road, Bangalore 560 025, India<BR> * Now at IBM TJ Watson, Yorktown Heights, NY 10598<BR> + Now at Dept of Computer Science, University of Toronto, Toronto, ON M5S 1A4<BR> </BLOCKQUOTE> <P> <B>Where:</B> International Conference on Parallel Processing 91 <P> <B>Keywords:</B> Parallelizing Compilers, Restructuring Transformations, Loop Partitioning, Iteration Spaces, Dependence Vectors. <P> <B>Abstract:</B> <P> In this paper, we present a generalized unimodular loop transformation as a simple, systematic and elegant method for partitioning the iteration spaces of nested loops for execution on distributed memory multiprocessors. We present a methodology for deriving the transformations that internalize multiple dependences in a multidimensional iteration space without resulting in a deadlocking situation. We then derive the general expression for the bounds of the transformed loops in terms of the bounds of the original space and the transformation matrix elements. 
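<P> As a rough illustration of the terms used in the abstract above (and not the paper's generalized method), the following sketch checks the textbook legality condition for a 2-D unimodular loop transformation: the matrix has integer entries and determinant +1 or -1, and every dependence vector stays lexicographically positive after the transformation. All names (<CODE>is_unimodular</CODE>, <CODE>is_legal</CODE>, <CODE>DEPS</CODE>, the example matrices) are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: classic legality test for a 2x2 unimodular
# loop transformation. Not taken from the paper; all names are assumptions.

def det2(t):
    """Determinant of a 2x2 matrix given as nested tuples."""
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

def is_unimodular(t):
    """Integer entries and determinant +1 or -1."""
    all_int = all(isinstance(x, int) for row in t for x in row)
    return all_int and abs(det2(t)) == 1

def apply(t, v):
    """Multiply the 2x2 matrix t by the column vector v."""
    return (t[0][0] * v[0] + t[0][1] * v[1],
            t[1][0] * v[0] + t[1][1] * v[1])

def lex_positive(v):
    """True if the first non-zero component of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False

def is_legal(t, deps):
    """t is unimodular and keeps every dependence lexicographically positive."""
    return is_unimodular(t) and all(lex_positive(apply(t, d)) for d in deps)

# Hypothetical dependence vectors of a doubly nested loop.
DEPS = [(1, -1), (0, 1)]

INTERCHANGE = ((0, 1), (1, 0))   # swap the two loops
SKEW = ((1, 0), (1, 1))          # skew the inner loop by the outer index

if __name__ == "__main__":
    print("interchange legal:", is_legal(INTERCHANGE, DEPS))
    print("skew legal:", is_legal(SKEW, DEPS))
```

With these assumed dependences, plain interchange maps (1, -1) to (-1, 1) and is rejected, while the skew maps it to (1, 0) and is accepted.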
<!---------------------------------------------------------------------> <HR> <A NAME="Kumar_Kulkarni_ICS92">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kumar_Kulkarni_ICS92.ps.Z">Deriving Good Transformations for Mapping Nested Loops on Hierarchical Parallel Machines in Polynomial Time</A> <P> <B>Authors:</B> K G Kumar*, D Kulkarni+ and A Basu <BLOCKQUOTE> Center for Development of Advanced Computing 2/1 Brunton Road, Bangalore 560 025, India<BR> * Now at IBM TJ Watson, Yorktown Heights, NY 10598<BR> + Now at Dept of Computer Science, University of Toronto, Toronto, ON M5S 1A4<BR> </BLOCKQUOTE> <P> <B>Where:</B> International Conference on Supercomputing 92 <P> <B>Keywords:</B> Parallelizing Compilers, Restructuring Transformations, Loop Partitioning, Iteration Spaces, Dependence Vectors. <P> <B>Abstract:</B> <P> We present a computationally efficient method for deriving the most appropriate transformation and mapping of a nested loop for a given hierarchical parallel machine. This method is in the context of our systematic and general theory of unimodular loop transformations for the problem of iteration space partitioning \cite{kandk6}. Finding an optimal mapping or an optimal associated unimodular transformation is NP-complete. We present a polynomial time method for obtaining a 'good' transformation using a simple parameterized model of the hierarchical machine. We outline a systematic methodology for obtaining the most appropriate mapping. <!---------------------------------------------------------------------> <HR> <A NAME="Li_Tandri_et">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Li_Tandri_et.ps.Z">Locality and Loop Scheduling on NUMA Multiprocessors</A> <P> <B>Authors:</B> Hui Li, Sudarsan Tandri, Michael Stumm, and Kenneth C. 
Sevcik <P> <B>Where:</B> International Conference on Parallel Processing 93 <P> <B>Keywords:</B> NUMA multiprocessors, Locality, Scheduling <P> <B>Abstract:</B> <P> An important issue in the parallel execution of loops is how to partition and schedule the loops onto the available processors. While most existing dynamic scheduling algorithms manage load imbalances well, they fail to take locality into account and therefore perform poorly on parallel systems with non-uniform memory access times. In this paper, we propose a new loop scheduling algorithm, Locality-based Dynamic Scheduling (LDS), that exploits locality, and dynamically balances the load. <!---------------------------------------------------------------------> <HR> <A NAME="Sandhu_et_al_PPOPP">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Sandhu_et_al_PPOPP.ps.Z">The shared regions approach to software cache coherence on multiprocessors</A> <P> <B>Authors:</B> Harjinder Sandhu, Benjamin Gamsa and Songnian Zhou <P> <B>Where:</B> Proceedings of the 1993 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993. <P> <B>Keywords:</B> NUMA, cache coherence, multiprocessor performance <P> <B>Abstract:</B> <P> The effective management of caches is critical to the performance of applications on shared-memory multiprocessors. In this paper, we discuss a technique for software cache coherence that is based upon the integration of a program-level abstraction for shared data with software cache management. The program-level abstraction, called <EM>Shared Regions</EM>, explicitly relates synchronization objects with the data they protect. Cache coherence algorithms are presented which use the information provided by shared region primitives, and ensure that shared regions are always cacheable by the processors accessing them. Measurements and experiments of the Shared Region approach on a shared-memory multiprocessor are shown. 
Comparisons with other software-based coherence strategies, including a user-controlled strategy and an operating system-based strategy, show that this approach is able to deliver better performance, with relatively low corresponding overhead and only a small increase in the programming effort. Compared to a compiler-based coherence strategy, the Shared Regions approach still performs better than a compiler that can achieve 90% accuracy in allowing caching, as long as the regions are a few hundred bytes or larger, or they are re-used a few times in the cache. <!---------------------------------------------------------------------> <HR> <A NAME="Wilton_Vranesic_SPDP">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Wilton_Vranesic_SPDP.ps.Z">Architectural Support for Block Transfers in a Shared-Memory Multiprocessor</A> <P> <B>Authors:</B> Steven J.E. Wilton and Zvonko G. Vranesic <P> <B>Where:</B> Fifth IEEE Symposium on Parallel and Distributed Processing, Irving, Texas, December 1993 <P> <B>Keywords:</B> Shared-memory multiprocessor, block transfer support <P> <B>Abstract:</B> <P> This paper examines how the performance of a shared-memory multiprocessor can be improved by including hardware support for block transfers. A system similar to the Hector multiprocessor developed at the University of Toronto is used as a base architecture. It is shown that such hardware support can improve the performance of initialization code by as much as 50%, but that the amount of improvement depends on the memory access behavior of the program and the way in which the operating system issues block transfer requests. 
<!---------------------------------------------------------------------> <HR> <A NAME="Sevcik_Zhou_PERF93">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Sevcik_Zhou_PERF93.ps.Z">Performance Benefits and Limitations of Large NUMA Multiprocessors</A> <P> <B>Authors:</B> Kenneth C. Sevcik and Songnian Zhou <P> <B>Where:</B> Proceedings of Performance '93, Rome, Italy, September 27 to October 1, 1993, pp. 183-204, Elsevier Science Publ. <P> <B>Abstract:</B> Please see the postscript file. <!---------------------------------------------------------------------> <HR> <A NAME="Harz_Sevcik_SC93">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Harz_Sevcik_SC93.ps.Z">Hot Spot Analysis in Large Scale Shared Memory Multiprocessors</A> <P> <B>Authors:</B> Karim Harzallah and Kenneth C. Sevcik <P> <B>Where:</B> Proceedings of the Supercomputing '93 Conference, November, 1993, Portland, Oregon. <P> <B>Abstract:</B> Please see the postscript file. <!---------------------------------------------------------------------> <HR> <A NAME="Sevcik_JPE">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Sevcik_JPE.ps.Z">Application Scheduling and Processor Allocation in Multiprogrammed Parallel Processing Systems</A> <P> <B>Authors:</B> Kenneth C. Sevcik <P> <B>Where:</B> (Journal of) Performance Evaluation, vol. 19 (1994), pp. 107-140 (Special issue on the performance evaluation of parallel systems) <P> <B>Abstract:</B> Please see the postscript file. 
<!---------------------------------------------------------------------> <HR> <A NAME="Holliday_Stumm_IEEETC">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Holliday_Stumm_IEEETC.ps.Z">Performance Evaluation of Hierarchical Ring-Based Shared Memory Multiprocessors</A> <P> <B>Authors:</B> <BR> Mark Holliday<BR> Dept. of Computer Science, Duke University, Durham, NC 27706 <P> Michael Stumm<BR> Dept. of Electrical and Computer Engineering<BR> University of Toronto, Toronto, Canada M5S 1A4 <P> <B>Date:</B> November 1992; revised April 1993 <P> <B>Where:</B> Technical Report CS-1992-18, Duke University<BR> IEEE Transactions on Computers <P> <B>Keywords:</B> communication locality; hierarchical ring-based networks; hot spots; large scale parallel systems; memory banks; performance evaluation; prefetching; shared memory multiprocessors; simulation. <P> <B>Abstract:</B> <P> This paper investigates the performance of word-packet, slotted unidirectional ring-based hierarchical direct networks in the context of large-scale shared memory multiprocessors. Slotted unidirectional rings are attractive because their electrical characteristics and simple interfaces allow for fast cycle times and large bandwidths. For large-scale systems, it is necessary to use multiple rings for increased aggregate bandwidth. Hierarchies are attractive because the topology ensures unique paths between nodes, simple node interfaces and simple inter-ring connections. <P> To ensure that a realistic region of the design space is examined, the architecture of the network used in the Hector prototype is adopted as the initial design point. A simulator of that architecture has been developed and validated with measurements from the prototype. The system and workload parameterization reflects conditions expected in the near future. <P> The results of our study show the importance of system balance on performance. 
Large-scale systems inherently have large communication delays for distant accesses, so processor efficiency will be low, unless the processors can operate with multiple outstanding transactions using techniques such as prefetching, asynchronous writes and multiple hardware contexts. However, with multiple outstanding transactions and only one memory bank per processing module, memory quickly saturates. Memory saturation can be alleviated by having multiple memory banks per processing module, but this shifts the bottleneck to the ring subsystem. While the topology of the ring hierarchy affects performance, the ring subsystem will inherently limit the throughput of the system. Hence, increasing the number of outstanding transactions per processor beyond a certain point only has a limiting effect on performance, since it causes some of the rings to become congested. An adaptive maximum number of outstanding transactions appears necessary to adjust for the appropriate tradeoff between concurrency and contention as the communication locality changes. We show the relationships between processor, ring and memory speeds, and their effects on performance. Using block transfers for prefetching seems unlikely to be advantageous in that the improvement in the cache hit ratio needed to compensate for the increased network utilization is substantial. <!---------------------------------------------------------------------> <HR> <A NAME="Curran_Stumm_CS">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Curran_Stumm_CS.ps.Z">A Comparison of basic CPU Scheduling Algorithms for Multiprocessor Unix</A> <P> <B>Authors:</B> Stephen Curran and Michael Stumm <P> <B>Where:</B> Computer Systems, 3(4), Oct., 1990, pp. 551--579. <P> <B>Abstract:</B> <P> In this paper, we present the results of a simulation study comparing three basic algorithms that schedule independent tasks in multiprocessor versions of Unix. 
Two of these algorithms, namely Central Queue and Initial Placement, are obvious extensions to the standard uniprocessor scheduling algorithm and are in use in a number of multiprocessor systems. A third algorithm, Take, is a variation on Initial Placement, where processors are allowed to raid the task queues of the other processors. Our simulation results show the difference between the performance of the three algorithms to be small when scheduling a typical Unix workload running on a small, bus-based, shared memory multiprocessor. They also show that the Take algorithm performs best for those multiprocessors on which tasks incur overhead each time they migrate. In particular, the Take algorithm appears to be more stable than the other two algorithms under extreme conditions. <!---------------------------------------------------------------------> <HR> <A NAME="Stumm_Unrau_Krieger_USENIX92">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Stumm_Unrau_Krieger _USENIX92.ps.Z">Hierarchical Clustering: A Structure for Scalable Multiprocessor Operating System Design</A> <P> <B>Authors:</B> Michael Stumm, Ron Unrau, and Orran Krieger <P> <B>Where:</B> Extended version of Clustering Micro-Kernels for Scalability, Proc. of the Usenix Workshop on Micro-Kernels and Other Kernel Architectures, April, 1992. <P> <B>Abstract:</B> Please see the postscript file. <P> <!---------------------------------------------------------------------> <HR> <A NAME="Stumm_Vranesic_White_IPPS93">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Stumm_Vranesic_White_IPPS93.ps.Z">Experience with the Hector Multiprocessor</A> <P> <B>Authors:</B> Michael Stumm, Zvonko Vranesic, Ron White <P> <B>Where:</B> Extended version of paper with same title in Proc. Intl. Parallel Processing Symposium Parallel Systems Fair, 1993, pp. 9-16. <P> <B>Abstract:</B> Please see the postscript file. 
<P> <!---------------------------------------------------------------------> <HR> <A NAME="Krieger_etal_IEEEComp94">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Krieger_etal_IEEEComp94.ps.Z">The Alloc Stream Facility: A redesign of application-level Stream I/O</A> <P> <B>Authors:</B> O. Krieger, M. Stumm, and R. Unrau <P> <B>Where:</B> IEEE Computer, 27(3), March, 1994, pp. 75--83. <P> <B>Abstract:</B> <P> This paper introduces a new application level I/O facility called the Alloc Stream Facility (ASF). ASF has several key advantages. First, performance is substantially improved as a result of a) the structure of the facility that allows it to take advantage of system specific features like mapped files, and b) a reduction in data copying and the number of I/O system calls. Second, the facility is designed for multi-threaded applications running on multiprocessors and allows for a high degree of concurrency. Finally, the facility can support a variety of I/O interfaces, including stdio, emulated Unix I/O, ASI, and C++ streams, in a way that allows applications to freely intermix calls to the different interfaces, resulting in improved code re-usability. We show that on several Unix workstation platforms, I/O intensive applications perform substantially better when linked to ASF instead of the native facilities -- in the best case, up to twice as good. Modifying the applications to use a new interface provided with ASF can improve performance even more. 
<P> <!---------------------------------------------------------------------> <HR> <A NAME="Krieger_Stumm_DAGS93">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Krieger_Stumm_DAGS93.ps.Z">HFS: A Flexible File System for Large-Scale Multiprocessors</A> <P> <B>Authors:</B> Orran Krieger and Michael Stumm <P> <B>Where:</B> Proceedings of the 1993 DAGS/PC Symposium <P> <B>Abstract:</B> <P> The Hurricane File System (HFS) is a new file system being developed for large-scale shared memory multiprocessors with distributed disks. The main goal of this file system is scalability; that is, the file system is designed to handle demands that are expected to grow linearly with the number of processors in the system. To achieve this goal, HFS is designed using a new structuring technique called Hierarchical Clustering. HFS is also designed to be flexible in supporting a variety of policies for managing file data and for managing file system state. This flexibility is necessary to support in a scalable fashion the diverse workloads we expect for a multiprocessor file system. <!---------------------------------------------------------------------> <HR> <A NAME="Krieger_etal_ICPP93">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Krieger_etal_ICPP93.ps.Z">A Fair Fast Scalable Reader-Writer Lock</A> <P> <B>Authors:</B> O. Krieger, M. Stumm, R. Unrau, and J. Hanna <P> <B>Where:</B> Proc. Intl. Conf. on Parallel Processing, 1993. <P> <B>Abstract:</B> <P> A reader-writer lock allows either multiple readers to inspect shared data or a single writer exclusive access to that data. On shared memory multiprocessors, the cost of acquiring and releasing these locks can have a large impact on the performance of parallel applications. 
Other researchers have shown how to implement scalable locks, that is, locks that can become contended without resulting in memory or interconnection network contention. This paper describes a new algorithm for a reader-writer lock that, while being scalable in the contended case, has a low overhead in the uncontended case. This is important because most parallel applications are written so that locks are typically uncontended. The only atomic operation required by this algorithm is fetch_and_store and hence it can be used on most current multiprocessor systems. Experimental results are provided. <!---------------------------------------------------------------------> <HR> <A NAME="Kulkarni_Stumm_Tut">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Kulkarni_Stumm_Tutorial.ps.Z">Loop and Data Transformations: A tutorial</A> <P> <B>Authors:</B> Dattatraya Kulkarni and Michael Stumm <P> <B>Where:</B> CSRI Tech Report 337, University of Toronto, June 1993. <P> <B>Abstract:</B> <P> Hierarchically structured machines appear to be becoming the dominant parallel computing structure. These systems have non-uniform access times. We address the problem of restructuring a possibly sequential program to execute efficiently on such parallel machines. This restructuring involves transforming and partitioning the loop structures and the data so as to improve <EM>parallelism</EM>, <EM>static</EM> and <EM>dynamic locality</EM>, and <EM>load balance</EM>. The objective of this paper is to present previous and ongoing work on loop and data transformations and motivate a <EM>unified</EM> framework for restructuring a sequence of loops and data so as to execute efficiently on parallel machines with several levels of hierarchy. 
<!---------------------------------------------------------------------> <HR> <A NAME="Baru_Zilio_PADS93">.</A> <HR> <B>Title:</B> <A HREF="./../../manually_copied_ftp_colon_doubleslash_ftp_cs_toronto_edu/parallel/Baru_Zilio_PADS93.ps.Z">Data reorganization in parallel database systems</A> <P> <B>Authors:</B> Chaitanya Baru & Daniel C. Zilio <P> <B>Where:</B> Proc. of the IEEE Workshop on Advances in Parallel and Distributed Systems, Princeton, NJ, pp. 102-107, Oct. 1993. <P> <B>Abstract:</B> <P> Parallel database systems are suitable for use in applications with high capacity and high performance and availability requirements. The trend in such systems is to provide efficient < |