spo600:2024_fall_project

Differences

This shows you the differences between two versions of the page.

spo600:2024_fall_project [2024/11/20 09:19] – [Project Stage 2: Clone-Pruning Analysis Pass] chris
spo600:2024_fall_project [2024/12/08 03:35] (current) chris
Line 114: Line 114:
  
 Create a pass for the GCC compiler which analyzes the program being compiled and:
-(a) Identifies one or more functions which have been cloned;
-(b) Examines the cloned functions to determine if they are substantially the same or different;
-(c) Emits a message in the GCC diagnostic dump for the pass that indicates if the functions should be pruned (in the case that they're substantially the same) or not pruned (if they are different).
+  - Identifies one or more functions which have been cloned;
+  - Examines the cloned functions to determine if they are substantially the same or different;
+  - Emits a message in the GCC diagnostic dump for the pass that indicates if the functions should be pruned (in the case that they're substantially the same) or not pruned (if they are different).
  
 It is recommended that you proceed in steps:
Line 126: Line 126:
  
 To limit complexity, you may make these assumptions:
-  There is only one cloned function in the program
-  There are only two versions (clones) of that function
+  There is only one cloned function in the program
+  There are only two versions (clones) of that function (ignoring the function resolver)
  
-It is important that you position your compiler pass __late__ in the compilation/optimization process so that any significant optimizations are performed before your analysis. Ideally, it should be one of the last "tree" (gimple) passes performed.
+It is important that you position your compiler pass __late__ in the compilation/optimization process so that any significant optimizations, such as vectorization, are performed before your analysis. Ideally, it should be one of the last "tree" (gimple) passes performed.
  
 Note that the gimple code for two identical functions may have slight variations. For example, the names of temporary variables will probably be different (because they are sequentially numbered), and generated labels in the code will probably be different (for the same reason). However, these variations by themselves should not be considered to make the function clones different.
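One way to look past numbering differences is to rename every numbered temporary and label to a placeholder in order of first appearance, then compare the results. The sketch below is only an illustration of that idea: it works on plain text lines rather than on GCC's internal gimple representation, and the function names (''canonicalize'', ''substantially_same'') and the token shapes it recognizes are assumptions made for this example, not part of GCC or of the demo pass.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Replace every numbered name or label with a placeholder (N0, N1, ...)
// assigned in order of first appearance, so that two listings that differ
// only in numbering canonicalize to the same text.
std::vector<std::string> canonicalize(const std::vector<std::string>& stmts) {
    // Assumed token shapes: SSA-style names ("_7", "sum_12") and block
    // labels ("<bb 4>", "<L3>"). Extend the pattern for other forms.
    static const std::regex numbered(R"(([A-Za-z]\w*)?_\d+|<bb \d+>|<L\d+>)");
    std::map<std::string, std::string> renames;
    std::vector<std::string> out;
    for (const std::string& stmt : stmts) {
        std::string rewritten;
        std::size_t pos = 0;
        for (std::sregex_iterator it(stmt.begin(), stmt.end(), numbered), end;
             it != end; ++it) {
            const std::size_t start = static_cast<std::size_t>(it->position());
            rewritten.append(stmt, pos, start - pos);  // text before the token
            const std::string token = it->str();
            auto found = renames.find(token);
            if (found == renames.end()) {
                const std::string placeholder =
                    "N" + std::to_string(renames.size());
                found = renames.emplace(token, placeholder).first;
            }
            rewritten += found->second;
            pos = start + static_cast<std::size_t>(it->length());
        }
        rewritten.append(stmt, pos, std::string::npos);  // trailing text
        out.push_back(rewritten);
    }
    return out;
}

// Two listings count as "substantially the same" here if they are identical
// after renaming. This is a deliberately simple criterion.
bool substantially_same(const std::vector<std::string>& a,
                        const std::vector<std::string>& b) {
    return canonicalize(a) == canonicalize(b);
}
```

Because placeholders are assigned by first appearance, a listing that uses the same temporaries in a different pattern still compares as different. Note that this sketch also renames the base part of names such as ''sum_12'', which may be more permissive than you want in a real pass.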
Line 136: Line 136:
  
 Please use these specific strings in your dump file:
-  * ''PRUNE: //name of clone//''
-  * ''NOPRUNE: //name of clone//''
-Where //name of clone// is the original name of the function that should (or should not) be pruned.
+  * ''PRUNE: //name of base function//''
+  * ''NOPRUNE: //name of base function//''
+Where //name of base function// is the original name of the function that should (or should not) be pruned.
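As a rough illustration of the required output format, the sketch below builds the verdict line and writes it to a stream. The helper names (''verdict_line'', ''emit_verdict'') and the function name used in the usage note are hypothetical. Inside a real GCC pass the stream would normally be the pass's ''dump_file'', which is null when no dump was requested, hence the null check.

```cpp
#include <cstdio>
#include <string>

// Build the exact string required in the dump file for one base function.
std::string verdict_line(const std::string& base_name, bool substantially_same) {
    return (substantially_same ? "PRUNE: " : "NOPRUNE: ") + base_name;
}

// Write the verdict to a stream; do nothing if the stream is not open.
void emit_verdict(std::FILE* out, const std::string& base_name,
                  bool substantially_same) {
    if (out)
        std::fprintf(out, "%s\n",
                     verdict_line(base_name, substantially_same).c_str());
}
```

For example, ''emit_verdict(stdout, "scale_samples", true)'' writes ''PRUNE: scale_samples'' followed by a newline.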
  
-=== Demo Files for Creating a GCC Pass ===
+Your solution should build and execute successfully on both x86_64 and aarch64 systems, and should take into account the differences between the function multi-versioning (FMV) implementations on those two architectures (for example, the munging algorithm used to create the suffixes for the cloned functions is different).
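One way to avoid depending on either architecture's suffix scheme is to treat everything before the first ''.'' in a symbol name as the base function name and ignore the exact spelling of the suffix. The sketch below makes that assumption explicit. The type and function names are hypothetical, the ''resolver'' suffix check is an assumption about how the function resolver is named, and you should confirm the actual names in your own dump files on each architecture.

```cpp
#include <string>

struct CloneName {
    std::string base;    // name before the first '.', e.g. "scale_samples"
    std::string suffix;  // everything after the first '.', empty if none
};

// Split a symbol name at the first '.' (assumed separator for clone suffixes).
CloneName split_clone_name(const std::string& symbol) {
    const std::string::size_type dot = symbol.find('.');
    if (dot == std::string::npos)
        return CloneName{symbol, ""};
    return CloneName{symbol.substr(0, dot), symbol.substr(dot + 1)};
}

// Assumption: the resolver is distinguished by a "resolver" suffix.
bool is_resolver(const CloneName& name) { return name.suffix == "resolver"; }

// True if two symbols are versions of the same base function.
bool same_base(const std::string& a, const std::string& b) {
    return split_clone_name(a).base == split_clone_name(b).base;
}
```

With this approach, the two clones to compare are the symbols that share a base name and are not the resolver. Other compiler-generated clones may also carry dotted suffixes, so a real pass would likely need additional filtering.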
  
-Each of the [[SPO600 Servers]] has a file ''/public/spo600-gcc-pass-demo.tgz'' which is a tar archive containing modified versions of four files from the current (2024-11-20) GCC development head.
+==== Demo Files for Creating a GCC Pass ====
 + 
 +Each of the [[SPO600 Servers]] has two files ''/public/spo600-gcc-pass-demo.tgz'' and ''spo600-gcc-pass-demo-2.tgz'' -- each is a tar archive containing modified versions of four files from the current (2024-11-20) GCC development head.
  
 These files are all from the ''gcc'' subdirectory in the source tree:
   * passes.def - One line has been added: ''NEXT_PASS (pass_ctyler);''
   * tree-pass.h - One line has been added: ''extern gimple_opt_pass *make_pass_ctyler (gcc::context *ctxt);''
-  * tree-ctyler.cc - The actual pass code, loosely modelled on ''tree-nrv.cc''
+  * tree-ctyler.cc - The actual pass code, loosely modelled on ''tree-nrv.cc'' - this is the file that differs between the two demo archives
   * Makefile.in - One line has been added to the OBJS definition: ''tree-ctyler.o \''
  
-=== Test Cases for Pruning/No-Pruning ===
+Building GCC with these changes will result in a compiler that can output an additional dump, which can be triggered with ''-fdump-tree-ctyler'' (or ''-fdump-tree-all'').
 + 
 +==== Test Cases for Pruning/No-Pruning ====
  
 Each of the [[SPO600 Servers]] has a file ''/public/spo600-test-clone.tgz'' which is a tar archive containing code to build test cases on x86_64 or aarch64 systems. On each architecture, two binaries will be built, each containing one cloned function. Building these binaries with a copy of GCC that contains your analysis pass should result in a decision to prune (for the binary ''test-clone-//arch//-prune'') or not to prune (for the binary ''test-clone-//arch//-noprune''), where ''//arch//'' is either ''x86'' or ''aarch64''.
  
 Refer to the README.txt file within the tgz file for more detail.
 +
 +==== Recommendations for Building GCC in Stage 2 ====
 +
 +A reminder that the ''make'' utility will rebuild a codebase in as few steps as possible. It does this by comparing the timestamps of the dependencies (inputs) for each target (output) to determine which source (or other input files) have changed since the related targets were built, and then rebuilding only those targets.
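The timestamp rule described above can be sketched in a few lines. This is a simplified model for illustration only, not how ''make'' is actually implemented: the ''FileStamp'' type and ''needs_rebuild'' function are hypothetical, timestamps are plain integers, and a missing dependency is simply skipped.

```cpp
#include <vector>

struct FileStamp {
    bool exists;            // false if the file has not been created yet
    long long modified_at;  // modification time, e.g. seconds since the epoch
};

// A target must be rebuilt if it is missing, or if any of its dependencies
// was modified more recently than the target itself.
bool needs_rebuild(const FileStamp& target, const std::vector<FileStamp>& deps) {
    if (!target.exists)
        return true;
    for (const FileStamp& dep : deps) {
        if (dep.exists && dep.modified_at > target.modified_at)
            return true;
    }
    return false;
}
```

Under this model, editing a single pass source file makes only the object file built from it (and anything that depends on that object) out of date, which is why an incremental rebuild is so much faster than a full one.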
 +
 +This can effectively cut the build time for a complex project like GCC from hours to minutes. On my development system (a Ryzen 7735HS with 32 GB RAM), a null rebuild (no source changes - make is checking that everything is up-to-date) takes about 8.3 seconds, and a rebuild with edits to one pass source file takes 23-30 seconds. On the [[SPO600 Servers]] the rebuild times are similar.
 +
 +To take advantage of this capability, do an initial full build of GCC in a separate build directory as usual, then make whatever edits are required to the source code in the source directory. Run ''make'' with appropriate options (including ''-j'' job values) in the build directory.
 +
 +Remember to use [[Screen Tutorial|screen]] (or a similar program such as tmux) when building on remote systems in case your network connection gets interrupted. It's also a good idea to time every build (prepend ''time'' to your ''make'' command) and redirect both stdout and stderr to a log file: use ''time make ... |& tee build.log'' if you also want to see the output on the terminal, or ''time make ... &> build.log'' if you don't.
 +
 +You can do your development work on either architecture, but remember to test your work on both architectures.
  
 ==== Submitting your Project Stage 2 ====
Line 167: Line 183:
 ==== Due Date ====
  
-  * Stage 2 is due with the second batch of blog posts on December 1, 2024.
+  * Stage 2 is due with the third batch of blog posts on December <s>1</s> 5, 2024.
 + 
 +===== Project Stage 3: Tidy & Wrap ===== 
 + 
 +Bring your work to a solid conclusion. Take into account feedback received for Stage 2 and provide any documentation that is missing, incomplete, or unclear, as well as easily-testable code clearly showing your work. 
 +  * Tidy up any loose ends from your Stage 2 work. If you have an incomplete experiment or investigation that you can complete, include it in your Stage 3. 
 +  * Provide the source code for the latest version of your project in a form which permits another person to examine, build, and test the changes which you have made. This can be done in any of several different ways (in decreasing order of preference): 
 +    * Provide a public git repository from which your code can be pulled. This should be a clone of a base repository with your changes applied as one or more commits. Clearly document how to access this repository, and which branch(es) and commit(s) are of interest. 
 +    * Or, provide a set of patch files containing your changes. The easiest way to generate this is with the 'git format-patch' command. Clearly document which upstream sources (e.g., GNU git, GitHub mirror, or another upstream such as the Darwin fork) and which commit number on the upstream server the patches should be applied against. 
 +    * Alternatively, provide a tar or zip archive that can be cleanly applied to the upstream sources and which provides your updates (like the demo files in ''/public''). Clearly document which upstream sources and which commit number the archive should be applied against.
 +    * Test the access to your code (make sure that someone who is not you can access the source). 
 +  * Clearly show the state of your work: what you attempted to do, what works and what does not, and what the output (e.g., dump files) looks like and how that output should be interpreted.
 +  * Test your code with the [[#Test Cases for Pruning/No-Pruning]] and document the results. 
 +  * Describe the problems you encountered while writing your code and how you addressed those problems. 
 +  * Describe the next steps that could/should be taken to continue the investigation. 
 +  * Write up your specific reflections on the project, if there are updates since Stage 2. 
 + 
 +**Important note:** some students __did not__ demonstrate that they wrote any code for Stage 2. In order to successfully complete Stage 3, you must show that you made some reasonable effort toward the goal of implementing a //Clone-Pruning Analysis Pass//. This must include some experimentation beyond the preliminary steps of building GCC and/or building the demo pass provided to you.
 + 
 +==== Due Date ==== 
 + 
 +  * Stage 3 is due with the third batch of blog posts on December 11, 2024 (11:59 pm). 
  
spo600/2024_fall_project.1732094348.txt.gz · Last modified: 2024/11/20 09:19 by chris
