Industrial Linux: Creating fast, secure, & reliable Linux servers

  

Industrial Linux is dedicated to professional Linux system administrators who need a portal site for the best Linux news, downloads, documentation, and resources for creating fast, secure, & reliable Linux servers.


MLUG presentations

2000-11-11
Network layer models
History of Ethernet
IP mechanics, including the ever-popular three-way handshake!
Packet filtering strategies
ipchains packet filtering

2000-09-09
Compiling Apache with mod_perl and FastCGI
Virtual servers for Apache
Top 5 security tips for casual Linux user


RSSLite

A Perl module that parses dirty
XML from content syndicators.

Get some content from XMLTree,
preferably in OCS format,
then parse it with RSSLite.

LLMS

I built the Lazy LFS Make System
to help automate the (re-)build
of my own Linux distro. LFS stands
for Linux From Scratch. Many
thanks to Gerard Beekmans
for creating this fine work!

pgcc

pgcc optimizations
pgcc main site

I got tired of not knowing how far I could push the -O flag in gcc. Did it stop at -O3, like the doc says, or were even higher settings possible? And what might they do? So I cracked open the source for pgcc, and here's what I found (original post located here).

Straight gcc stops at -O3; pgcc goes to -O7. You can specify bigger numbers (like -O9) with no harm or additional effect; it just gives you the highest setting.
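One way to see why oversized levels are harmless: the option handlers quoted further down this page only test the level with comparisons such as "level >= 3", so any number past the top threshold passes exactly the same tests as the top threshold itself. Here is a minimal sketch of that idea. The function name tiers_enabled and its top_tier parameter are made up for illustration; nothing like them exists in gcc or pgcc.

```c
/* Illustrative only: counts how many optimization tiers a requested
   -O level switches on when every tier is guarded by a test of the
   form "level >= n".  Per the article, top_tier would be 3 for stock
   gcc and 7 for pgcc.  The function name is hypothetical. */
int tiers_enabled(int level, int top_tier)
{
    int enabled = 0;
    int tier;

    for (tier = 1; tier <= top_tier; tier++) {
        if (level >= tier)      /* same style of test as the real source */
            enabled++;
    }
    return enabled;
}
```

Calling tiers_enabled(9, 7) and tiers_enabled(7, 7) gives the same count, which is the "no harm or additional effect" behaviour described above.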

For pgcc, there seems to be some schizophrenia surrounding the optimization of instruction scheduling. Normal gcc -O2 turns it on; gcc i386 -O2 then turns it off; pgcc i386 -O2 turns it back on, and pgcc i386 -O5 used to turn it on *again*, but that line is commented out with a note that it hurts performance. I guess some tests with -fschedule-insns vs. -fno-schedule-insns are in order. The option is documented thusly:

"If supported for the target machine, attempt to reorder instructions to eliminate execution stalls due to required data being unavailable. This helps machines that have slow floating point or memory load instructions by allowing other instructions to be issued until the result of the load or floating point instruction is required."
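The on/off/on sequence for instruction scheduling is easier to follow when replayed step by step. The sketch below mimics the three assignments to flag_schedule_insns found in the toplev.c and i386.c excerpts on this page. It assumes that INSN_SCHEDULING is defined, that the flag starts at 0, and that the generic defaults in toplev.c run before the i386 hooks. The function name and the trace array are hypothetical and not part of gcc or pgcc.

```c
/* Illustrative replay of the assignments to flag_schedule_insns.
   Assumptions: INSN_SCHEDULING is defined, the flag starts at 0, and
   toplev.c applies its defaults before the i386 hooks are called.
   trace[] records the value after each of the three steps. */
int replay_schedule_insns(int level, int trace[3])
{
    int flag = 0;            /* assumed initial state: off */

    if (level >= 2)          /* toplev.c: stock gcc turns it on */
        flag = 1;
    trace[0] = flag;

    if (level > 1)           /* i386.c optimization_options: off again */
        flag = 0;
    trace[1] = flag;

    if (level >= 2)          /* optimization_options_intel1: back on */
        flag = 2;
    trace[2] = flag;

    return flag;             /* nonzero means scheduling ends up enabled */
}
```

For level 2 the trace comes out as 1, 0, 2, matching the description above; for level 1 the flag is never touched and stays 0.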

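pgcc 2.95.2 adds the option "sibling_call", which looked interesting. You get it (and unroll_loops) with the max setting, -O7. Note, though, that in the listing below both the level 7 block in i386.c and the sibling_call line in toplev.c are commented out.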

For pgcc 2.95.2 on an i386 platform, the following levels of optimization give you the indicated '-f' flags. Don't forget that the -On flag is just shorthand for setting these flags individually. You can use an -On flag and augment it with specific -f flags. Note: higher levels build on the earlier levels:

-O: defer_pop, thread_jumps, delayed_branch, omit_frame_pointer, opt_reg_use, reduce_index_givs
-O2: cse_follow_jumps, cse_skip_blocks, gcse, expensive_optimizations, strength_reduce, rerun_cse_after_loop, rerun_loop_opt, caller_saves, force_mem, regmove, schedule_insns, schedule_insns_after_reload
-O3: inline_functions, jump_back, copy_prop, compare_elim, sftwr_pipe, reg_reg_copy_opt, peep_spills, replace_stack_mem, opt_jumps_out, replace_mem, correct_cse_mistakes, push_load_into_loop, replace_reload_regs, sign_extension_elim, lift_stores
-O4: swap_for_agi, risc, risc_const, interleave_stack_non_stack, schedule_stack_reg_insns
-O5: runtime_lift_stores, omit_frame_pointer
-O6: all_mem_givs, do_offload, risc_mem_dest
-O7: unroll_loops, sibling_call
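Since each level builds on the ones before it, the full flag set for a given -On is the concatenation of every row up to n. The following sketch does that lookup using the list above as its only data. The flag names are kept exactly as written in the list (with underscores), because the precise -f spellings of the pgcc-specific options are not given here. The table and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One row per level, copied from the list above (pgcc 2.95.2, i386). */
static const char *const level_flags[] = {
    /* -O  */ "defer_pop thread_jumps delayed_branch omit_frame_pointer "
              "opt_reg_use reduce_index_givs",
    /* -O2 */ "cse_follow_jumps cse_skip_blocks gcse expensive_optimizations "
              "strength_reduce rerun_cse_after_loop rerun_loop_opt "
              "caller_saves force_mem regmove schedule_insns "
              "schedule_insns_after_reload",
    /* -O3 */ "inline_functions jump_back copy_prop compare_elim sftwr_pipe "
              "reg_reg_copy_opt peep_spills replace_stack_mem opt_jumps_out "
              "replace_mem correct_cse_mistakes push_load_into_loop "
              "replace_reload_regs sign_extension_elim lift_stores",
    /* -O4 */ "swap_for_agi risc risc_const interleave_stack_non_stack "
              "schedule_stack_reg_insns",
    /* -O5 */ "runtime_lift_stores omit_frame_pointer",
    /* -O6 */ "all_mem_givs do_offload risc_mem_dest",
    /* -O7 */ "unroll_loops sibling_call"
};

#define NUM_LEVELS (sizeof level_flags / sizeof level_flags[0])

/* Writes the space-separated flags implied by -O<level> into buf,
   accumulating every row up to that level.  Returns the number of
   rows used, or -1 if buf is too small. */
int flags_for_level(int level, char *buf, size_t size)
{
    size_t used = 0;
    size_t i;
    int rows = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';

    for (i = 0; i < NUM_LEVELS && (int)i < level; i++) {
        size_t len = strlen(level_flags[i]);
        size_t need = len + (used ? 1 : 0);   /* +1 for the separator */

        if (used + need + 1 > size)           /* +1 for the final NUL */
            return -1;
        if (used)
            buf[used++] = ' ';
        memcpy(buf + used, level_flags[i], len + 1);
        used += len;
        rows++;
    }
    return rows;
}
```

With a 1024-byte buffer, flags_for_level(2, buf, sizeof buf) fills buf with the -O row followed by the -O2 row, and any level of 7 or more returns all seven rows.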

Here's the relevant source code:

=== toplev.c ===
  if (optimize >= 1)
    {
      flag_defer_pop = 1;
      flag_thread_jumps = 1;
#ifdef DELAY_SLOTS
      flag_delayed_branch = 1;
#endif
#ifdef CAN_DEBUG_WITHOUT_FP
      flag_omit_frame_pointer = 1;
#endif
    }

  if (optimize >= 2)
    {
      flag_cse_follow_jumps = 1;
      flag_cse_skip_blocks = 1;
      flag_gcse = 1;
      flag_expensive_optimizations = 1;
      flag_strength_reduce = 1;
      flag_rerun_cse_after_loop = 1;
      flag_rerun_loop_opt = 1;
      flag_caller_saves = 1;
      flag_force_mem = 1;
#ifdef INSN_SCHEDULING
      flag_schedule_insns = 1;
      flag_schedule_insns_after_reload = 1;
#endif
/*	flag_sibling_call = 1;*//*D*/
      flag_regmove = 1;
    }

  if (optimize >= 3)
    {
      flag_inline_functions = 1;
    }



=== config/i386/i386.c ===
optimization_options (level, size)
     int level;
     int size ATTRIBUTE_UNUSED;
{
  /* For -O2 and beyond, turn off -fschedule-insns by default.  It tends to
     make the problem with not enough registers even worse.  */
#ifdef INSN_SCHEDULING
  if (level > 1)
    flag_schedule_insns = 0;
#endif
   optimization_options_intel1(level, size);
}

...and...

static void
optimization_options_intel1 (level, size)
     int level;
     int size;
{
  if (level > 0)
    {
      flag_opt_reg_use = 2;
      flag_reduce_index_givs = 2;
    }
  if (level >= 2)
    {
      flag_schedule_insns = 2;
      flag_schedule_insns_after_reload = 2;
    }
  if (level >= 3)
    {
      flag_inline_functions = 2;
      flag_jump_back = 2;
      flag_copy_prop = 2;
      flag_compare_elim = 2;
      flag_sftwr_pipe = 2;
      flag_reg_reg_copy_opt = 2;
      /*flag_opt_reg_stack = 2;*//*D*/
      /*flag_loop_after_global = 2;*//*D*/
      flag_peep_spills = 2;
      flag_replace_stack_mem = 2;
      flag_opt_jumps_out = 2;
      flag_replace_mem = 2;
      flag_correct_cse_mistakes = 2;
      flag_push_load_into_loop = 2;
      flag_replace_reload_regs = 2;
      flag_sign_extension_elim = 2;
      flag_lift_stores = 2;
    }
  if (level >= 4)
    {
      /*flag_combine_222 = 2;*/ /*D*/
#ifdef INSN_SCHEDULING
      flag_schedule_insns_after_reload = 2;
      flag_swap_for_agi = 2;
      flag_risc = 2;
      flag_risc_const = 2;
      /*flag_recombine = 2;*//*D*/ /* ??? actually slows down */
      flag_interleave_stack_non_stack = 2;
      flag_schedule_stack_reg_insns = 2;
#endif
    }
  if (level >= 5)
    {
      flag_runtime_lift_stores = 2; /* big space penalty */
      flag_omit_frame_pointer = 2;
#ifdef INSN_SCHEDULING
      /*flag_schedule_insns = 2;*/ /* hurts performance! */
#endif
    }
  if (level >= 6)
    {
      flag_all_mem_givs = 2;
      flag_do_offload = 2;
      flag_risc_mem_dest = 2;
    }
/*  if (level >= 7)
    {
      flag_unroll_loops = 2;
      flag_sibling_call = 2;
    }*/
}

Copyright 2000 by Scott Thomason
All trademarks property of their respective owners
